yjmantilla / sovabids

A python package for the automatic conversion of EEG datasets to the BIDS standard, with a focus on making the most out of metadata.
https://sovabids.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
9 stars 3 forks source link

Need help with the API 🆘 #25

Closed yjmantilla closed 2 years ago

yjmantilla commented 3 years ago

So a strong component component of the sovabids project is having a good API that serves possible third-party GUIs. Nevertheless I find myself struggling with how to best design such API. As of now, I just went along with what I had originally in mind at the proposal and what came to me as I coded.

The current main functions API (the ones more prone to be used by an end-user) are:

Convert Module

convert_them(mappings_input) Converts eeg files to bids according to the mapping given.
sovaconvert() Console script usage for conversion.

More detail here

Rules Module

get_info_frompath(path, rules) Parse information from a given path, given a set of rules.
loadrules(rules) Load rules if given a path, bypass if given a dict.
apply_rules_to_singlefile(f, rules, bids_root, write=False, preview=False) Apply rule to a single file.
apply_rules(source_path, bidsroot, rules, mapping_path='') Apply rules to all the accepted files in a source path.
sovapply() Console script usage for applying rules.

More detail here

An approach from the "ideal"

To force me to think more on the API I imagined the "ideal GUI" for sovabids, and tried to think about the API it would require from sovabids for it to function. Here is what I got:

sovabids_api

I would greatly appreciate help with this since I feel lost on how to do this. @stebo85 @aswinnarayanan @civier @DavidjWhite33 @TomEmotion

TODO:

aswinnarayanan commented 3 years ago

Hi @yjmantilla, this looks really good!! The API presented covers so much of the use-case while also being straight-forward. I could also imagine the GUI might also have a reset or clear button, but that shouldn't need the API, so no issues there.

Could we separate out apply_rules into

  1. filespaths = get_files(input_dir)
  2. mappings = apply_rules(filepaths)
civier commented 3 years ago

Hi @yjmantilla and all, Very impressive analysis!

I don't have time to get into the bolts and bits of it right now, but I would suggest that you look into some popular APIs (that have multiple GUIs) for inspiration.

The first example that came to my mind is the SSHFS protocol, which has multiple GUIs (some of them are listed in the table here: https://supercomputing.swin.edu.au/docs/1-getting_started/file-transfer.html). This protocol is not for data conversion, but it does involve moving data, etc., which has some commonalities with what we're doing. These are the basic API commands of SSHFS (taken from the help of the sftp utility). I'm sure you would be able to find more info online:

bye                                Quit sftp
cd path                            Change remote directory to 'path'
chgrp grp path                     Change group of file 'path' to 'grp'
chmod mode path                    Change permissions of file 'path' to 'mode'
chown own path                     Change owner of file 'path' to 'own'
df [-hi] [path]                    Display statistics for current directory or
                                   filesystem containing 'path'
exit                               Quit sftp
get [-afPpRr] remote [local]       Download file
reget [-fPpRr] remote [local]      Resume download file
reput [-fPpRr] [local] remote      Resume upload file
help                               Display this help text
lcd path                           Change local directory to 'path'
lls [ls-options [path]]            Display local directory listing
lmkdir path                        Create local directory
ln [-s] oldpath newpath            Link remote file (-s for symlink)
lpwd                               Print local working directory
ls [-1afhlnrSt] [path]             Display remote directory listing
lumask umask                       Set local umask to 'umask'
mkdir path                         Create remote directory
progress                           Toggle display of progress meter
put [-afPpRr] local [remote]       Upload file
pwd                                Display remote working directory
quit                               Quit sftp
rename oldpath newpath             Rename remote file
rm path                            Delete remote file
rmdir path                         Remove remote directory
symlink oldpath newpath            Symlink remote file
version                            Show SFTP version
!command                           Execute 'command' in local shell
!                                  Escape to local shell
?                                  Synonym for help

@stebo85 @aswinnarayanan @civier @DavidjWhite33 @TomEmotion Do you have ideas for additional API we can learn from?

yjmantilla commented 3 years ago

@aswinnarayanan

Could we separate out apply_rules into filespaths = get_files(input_dir) mappings = apply_rules(filepaths)

Yeap, I also thought about that but never came through that idea, so I agree with it.

yjmantilla commented 3 years ago

@civier

I would suggest that you look into some popular APIs (that have multiple GUIs) for inspiration.

Yeap I agree. Most direct example is bidscoin so the correspondence of that API to us is basically:

bidscoin sovabids
is_sourcefile(file: Path) -> str equivalent to filtering by extension, we don’t have an api explicetly for this though
get_attribute(dataformat: str, sourcefile: Path, attribute: str, options: dict) -> str equivalent to the "preview" option of apply_rules_to_single_file
bidsmapper_plugin(session: Path, bidsmap_new: dict, bidsmap_old: dict, template: dict, store: dict) -> None equivalent to apply_rules (which applies the rules to each file involved and outputs the mappings for each)
bidscoiner_plugin(session: Path, bidsmap: dict, bidsfolder: Path) -> None equivalent to "convert"

The SSHFS API

Here is what I thought of it in relation to sovabids

command description makes sense for sovabids?
bye Quit sftp nope (?)
cd path Change remote directory to 'path' nope (?)
chgrp grp path Change group of file 'path' to 'grp' nope (?)
chmod mode path Change permissions of file 'path' to 'mode' nope (?)
chown own path Change owner of file 'path' to 'own' nope (?)
df [-hi] [path] Display statistics for current directory or filesystem containing 'path' maybe (?)
exit Quit sftp nope (?)
get [-afPpRr] remote [local] Download file nope (?)
reget [-fPpRr] remote [local] Resume download file nope (?)
reput [-fPpRr] [local] remote Resume upload file maybe (?) (incremental conversion use-case)
help Display this help text implemented through argparse help (at least for the cli usage)
lcd path Change local directory to 'path' nope (?)
lls [ls-options [path]] Display local directory listing guess this is equivalent to the proposal of get_files
lmkdir path Create local directory nope (?)
ln [-s] oldpath newpath Link remote file (-s for symlink) nope (?)
lpwd Print local working directory nope (?)
ls [-1afhlnrSt] [path] Display remote directory listing nope (?)
lumask umask Set local umask to 'umask' nope (?)
mkdir path Create remote directory mmm , as of now if the output dir doesn’t exist it gets created automatically
progress Toggle display of progress meter hmmm, we could implement a cli progress bar although really low priority
put [-afPpRr] local [remote] Upload file may be equivalent to apply_conversion_to_single_file
pwd Display remote working directory nope (?)
quit Quit sftp nope (?)
rename oldpath newpath Rename remote file nope (?)
rm path Delete remote file nope (?)
rmdir path Remove remote directory nope (?)
symlink oldpath newpath Symlink remote file nope (?)
version Show SFTP version yeap
!command Execute 'command' in local shell nope (?)
! Escape to local shell nope (?)
? Synonym for help implemented through argparse help (at least for the cli usage)
civier commented 3 years ago

@yjmantilla Thanks for the detailed comparison to SSHFS. I didn't expect the protocols to be equivalent or anything like that, because they are used for different things (data conversion versus file transfer). I just wondered if we can learn something from SSHFS on the management of sessions / authentication / interrupting operations in the middle / feedback on progress / etc., and what makes it so easy to implement a GUI for it? Two things that interest me personally are feedback on progress and process interruption, as those things can be very frustrating for users (think how many times you waited for a GUI application to finish processing, and all you saw was a rotating hourglass + when you pressed ESC or Ctrl+C, nothing happened). If we can get these things right, we are already in the right direction.

yjmantilla commented 3 years ago

@civier

I just wondered if we can learn something from SSHFS on the management of sessions / authentication / interrupting operations in the middle / feedback on progress / etc., and what makes it so easy to implement a GUI for it?

Are the features of sessions / authentication somethings we want in sovabids? I think this can be best handled by the applications that uses sovabids. Im not experienced with this but it may be a rabbit hole.

The ideas of interrupting operations in the middle / feedback on progress though I think are good feature goals to have.

what makes it so easy to implement a GUI for it?

I think this would require analysis of sftp and/or one of its clients which wouldn't be an easy task. My intuition is that it comes from having an atomic architecture (I mean, having a way to do pure simple operations). I did a shallow read of the source of WinSCP but didn't get much hints to answer the question.

Two things that interest me personally are feedback on progress and process interruption, as those things can be very frustrating for users (think how many times you waited for a GUI application to finish processing, and all you saw was a rotating hourglass + when you pressed ESC or Ctrl+C, nothing happened). If we can get these things right, we are already in the right direction.

I think this is a fair functionality but I don't think it is trivial. The GUI would need to monitor the internal state of the conversion function. In python one way would be with a PIPE from the multiprocessing module.

Nevertheless, I think is also possible to do this by yielding instead of returning. I have seldom used generators in python but it may be an approach. Any time the conversion function finishes a file it yields something that is collected outside. The outer scope would know the total number of files, so it would know the percentage of completion.

Now, in the API this would mean having a parameter that controls whether we want to return or to yield.

In general what I take from this discussion is to have in mind:

Any thoughts on this ? @aswinnarayanan @stebo85

stebo85 commented 3 years ago

Dear @yjmantilla

I agree with your conlcusion: Keep it simple and extend when necessary. Progress feedback and control over the conversion process would be important.

Kind regards Steffen

aswinnarayanan commented 3 years ago

@yjmantilla

Thorough implementation of sigint might be a good place to start,

I think this is a fair functionality but I don't think it is trivial. The GUI would need to monitor the internal state of the conversion function. In python one way would be with a PIPE from the multiprocessing module.

There may be issues with terminating threads generated via python's multiprocessing. I've run into these myself. For e.g. https://www.semicolonworld.com/question/43691/keyboard-interrupts-with-python-39-s-multiprocessing-pool https://www.programmersought.com/article/33211279215/

These could offer some useful pointers: https://the-fonz.gitlab.io/posts/python-multiprocessing/ https://www.cloudcity.io/blog/2019/02/27/things-i-wish-they-told-me-about-multiprocessing-in-python/

yjmantilla commented 2 years ago

Im closing this for now since the main issue of how to design the API is resolved. A lot of more stuff can be added to it when one is thinking of the final GUI, but that is an ongoing discussion, not a single issue.

I want to save here the idea of @TomEmotion of the API having a function to contrast the difference between files.