spikodrome / administrative

An administrative area for orchestrating the project
Apache License 2.0

The App #4

Open yarikoptic opened 5 years ago

yarikoptic commented 5 years ago

Might be implemented as a part of Spike Forest or as dependent project here.

magland commented 4 years ago

I made some progress toward this goal here: https://github.com/magland/spikeforest2, and it is beginning to be usable, meaning that you should already be able to run spike sorters (even the MATLAB ones) with just Docker installed.

@yarikoptic at some point we should touch base again about how a user would ideally interact with the system.

yarikoptic commented 4 years ago

Awesome. I keep hoping to give all your recent developments a try, so now should be the time! ;)

NB: I wrote the blurb below before actually looking at the https://github.com/magland/spikeforest2 examples, which seem already to be getting really close ;)

I guess I should first check what the current interface is before recommending a new one, but I think ultimately (ATM it could be without --sorters, just one sorter at a time) it could be something like

```shell
docker run [bindmounts] THEIMAGE [--help] [--sorters SORTER1,SORTER2,...] [--output-format nwb|...] [--more-options] [-o output_path] input*.whatever
```

with all sorters run by default (when no --sorters ... is provided).
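To make the proposed interface concrete, here is a minimal sketch of what the argument handling of such an entrypoint could look like -- all flag names mirror the suggestion above, while the sorter names and the script itself are purely hypothetical placeholders, nothing here is implemented anywhere:

```python
# Hypothetical entrypoint sketch for the proposed CLI; the sorter list and
# program name are placeholders, only the flag names come from the proposal.
import argparse

ALL_SORTERS = ["ironclust", "mountainsort4", "kilosort2"]  # placeholder list

def build_parser():
    p = argparse.ArgumentParser(prog="spikeforest-run")
    p.add_argument("--sorters", default=",".join(ALL_SORTERS),
                   help="comma-separated sorters; default: run all of them")
    p.add_argument("--output-format", default="nwb",
                   help="format for the spike-sorted outputs")
    p.add_argument("-o", "--output-path", required=True)
    p.add_argument("inputs", nargs="+", help="one or more input recordings")
    return p

if __name__ == "__main__":
    args = build_parser().parse_args()
    sorters = args.sorters.split(",")
    print(f"would run {sorters} on {args.inputs} -> {args.output_path}")
```

The only design point encoded here is the default: omitting --sorters yields the full list, matching "running all the sorters" above.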

Ignorant me is not yet sure whether there is a point in providing multiple inputs (e.g. multiple sessions from the same animal/channels), but I feel it might be desired. output_path would get populated with results (possibly with a subdirectory per input*/), using the desired output-format (e.g. nwb) for storing the spike-sorted data, plus some .html (or whatever) with a nice summary overview of the results (agreement between sorters; accuracy if the input*s had ground-truth information provided -- i.e. what you already do on SpikeForest; etc.).
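For the "agreement between sorters" part of that summary, one could imagine something as simple as counting spikes in one train that have a match within a small time tolerance in the other -- a toy sketch only, not SpikeForest's actual comparison code:

```python
# Toy pairwise-agreement sketch: fraction of spikes with a match within
# `tol` samples in the other train, normalized by the larger train.
# This is NOT SpikeForest's metric, just an illustration of the idea.
import bisect

def agreement(train_a, train_b, tol=3):
    """train_a, train_b: sorted lists of spike times (in samples)."""
    if not train_a and not train_b:
        return 1.0
    matched = 0
    for t in train_a:
        # find the first spike in train_b at or after t - tol
        i = bisect.bisect_left(train_b, t - tol)
        if i < len(train_b) and train_b[i] <= t + tol:
            matched += 1
    return matched / max(len(train_a), len(train_b))
```

A real comparison would also need to match units between sorters (e.g. via a best-match assignment) before scoring spike trains like this.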

If the ground truth is not contained in the same input* files, and/or some additional information (e.g. the geometry of the electrodes) is lacking in input* -- because they aren't nwb, or just because it was stored separately -- then --more-options should provide the means to specify it.
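Such out-of-band information might take the form of a small sidecar file passed via one of those options -- here a purely hypothetical probe-geometry JSON, with all field names invented for illustration:

```json
{
  "probe": {
    "num_channels": 4,
    "channel_positions_um": [[0, 0], [0, 25], [0, 50], [0, 75]]
  },
  "ground_truth": "path/to/ground_truth_firings.ext"
}
```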

The whole Docker image should have the ultimate runner as its entry point, so running with --help would print instructions on the command-line invocation etc.
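In Dockerfile terms that would just mean something like the following sketch (the base image and script names are placeholders):

```dockerfile
FROM python:3.8-slim
COPY run_sorters.py /usr/local/bin/run_sorters
RUN chmod +x /usr/local/bin/run_sorters
# ENTRYPOINT (rather than CMD) makes `docker run IMAGE --help` pass --help
# as an argument to the runner, instead of trying to execute `--help`
ENTRYPOINT ["/usr/local/bin/run_sorters"]
```

The ENTRYPOINT-vs-CMD distinction is exactly what produces the `executable file not found in $PATH` error shown below when a container without an entrypoint is given `--help`.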

ATM it seems to be different for a sample `magland/sf-ironclust` container I have tried -- yet to discover the desired way:

```shell
$> docker run magland/sf-ironclust --help
container_linux.go:247: starting container process caused "exec: \"--help\": executable file not found in $PATH"
docker: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "exec: \"--help\": executable file not found in $PATH".
ERRO[0067] error getting events from daemon: net/http: request canceled
```

Eventually I hope there could be some --curate BACKEND option which would allow curating the obtained spike trains and saving them as part of the outputs whenever done (or maybe allow running the same image in a "curate only" mode). E.g. the curation app could be fired up in a browser (we could even bundle some https://guacamole.apache.org inside the container and expose it via a minimalistic web server in the Docker and/or eventually Singularity container, thus providing access to a non-web-based curation tool).

These are just ideas, with the personal bias of my neuroimaging-rotten soul, where I am guided by the BIDS-Apps approach (see the introductory paper, or try running `docker run poldracklab/mriqc --help`; see also a sample mriqc report and a per-subject report from fmriprep). The situation with BIDS-Apps is a bit easier, since they all have a clear idea of the input format -- the dataset must be a BIDS dataset -- so it is sufficient to point to one directory as the input. We have no such luxury here yet, so the interface might need to become a bit more "flexible" ;)

I hope this is of some value ;-)

yarikoptic commented 4 years ago

Ah, and eventually we would add a --do-not-submit-to-spikeforest option ;) since one of the ideas would be to inform SpikeForest about relevant metadata (file format, species, cell_type, params for the spike sorters -- hence .nwb could be of great benefit for the input; the sorter params would probably go somewhere in --more-options) plus stats (agreement between sorters, accuracies) from running the sorters and curation. That way we could benefit from the meta-results to eventually inform on best strategies (which spike sorter for which species/cell_type, etc.), with individual labs contributing to this collection of knowledge without sharing their precious data ;)

magland commented 4 years ago

Thanks @yarikoptic. That seems like good guidance, and it definitely benefits from your prior experience with these kinds of things. My strategy is to get a decent subset of the functionality implemented in the context of SpikeForest, in a way that should be adaptable to the kind of turnkey pure-Docker-deployed solution you are proposing.