rsagroup / rsatoolbox

Python library for Representational Similarity Analysis
MIT License

Virtualization via Docker/Singularity #178

Open PeerHerholz opened 3 years ago

PeerHerholz commented 3 years ago

Ahoi hoi everyone,

very happy to see the new developments in python and extended support! I'm very much looking forward to trying it out.

I was wondering about your thoughts concerning virtualization through containers, i.e. Docker and Singularity (maybe not now, but once you're closer to the release(s)). Depending on how one approaches these things, there are several advantages, including testing, automation and documentation/tutorials.

Concerning the first, all tests could be run in a dedicated environment, i.e. the container. Adding that to the CI would also make it possible to set up workflows for testing the build of the container and pushing it to a registry (e.g. Dockerhub), basically the whole package. Here are some pointers on how Travis handles these things.
WRT automation: I saw that reading in BIDS derivatives is/will be supported, which is fantastic and, together with the support of containers, would allow introducing a respective BIDS App. Here the general idea would be that users provide a BIDS derivatives dataset as input to the container together with a set of commands that specify what should be run and how (e.g. searchlight, metric and models). The container would then run the specified workflow and store the result(s) in a new BIDS derivatives directory, e.g. derivatives/pyrsa/sub-001. The flexibility of input options and thus workflows is of course something to think about carefully, but this would go a long way towards increasing standardization and reproducibility, as well as help with running analyses on large-scale datasets via HPCs. If containers are supported/provided, the documentation and tutorials could be enhanced by providing the opportunity to run and explore the demos in an interactive online instance such as binder.
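To make the input/output layout above a bit more concrete, here is a minimal sketch (not part of pyrsa) of how such an app could find the subjects in the input dataset and derive the matching output folders. The helper names `list_subjects` and `output_dir_for` are hypothetical, and placing `derivatives/` under the dataset root is an assumption for illustration only.

```python
from pathlib import Path
from typing import List


def list_subjects(dataset_root: Path) -> List[str]:
    """Return subject labels (without the 'sub-' prefix) found in a BIDS-style folder."""
    return sorted(
        p.name[len("sub-"):] for p in dataset_root.glob("sub-*") if p.is_dir()
    )


def output_dir_for(dataset_root: Path, subject: str, pipeline: str = "pyrsa") -> Path:
    """Build <dataset_root>/derivatives/<pipeline>/sub-<subject>.

    Only the path is computed; nothing is created on disk.
    The location under dataset_root is an assumption of this sketch.
    """
    return dataset_root / "derivatives" / pipeline / f"sub-{subject}"
```

A caller would loop over `list_subjects(root)` and write each subject's results into `output_dir_for(root, label)`.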

I did a first pass at the containers over in the virtualization branch of my fork. The idea so far would be to have a small bash script, right now called generate_pyrsa_images.sh, that runs neurodocker with a set of arguments to create image build files, either for Docker or Singularity. After running that in its current form, you'll have both and can build them using whatever way you prefer. Right now, it's very limited and only creates a small image that runs Ubuntu 20.04 and includes a conda env with pyrsa installed. No specific entrypoint or functionality has been specified so far. However, doing that and changing/adding things to the images is straightforward via the bash script. I tried the exercise all demo in a binder instance and it seems to work (it might take a bit to start):

Binder

If you have any questions please don't hesitate to ask. It would be great to hear from you.

Stay safe. Cheers, Peer

HeikoSchuett commented 3 years ago

Hi @PeerHerholz!

apologies for taking so long!

I think we do plan to extend our integration with the neuroscience pipelines in python, which should ultimately include methods for putting pyrsa into containers. For that your pointers are definitely helpful. Also it is encouraging that you apparently found it quite easy to generate at least some rudimentary containers which contain pyrsa.

Before we spend much time on putting the pipelines into containers, I think we need to implement more of a command line interface and define the input options and workflows. Once we have some part of the code which can be run from the command line on some BIDS folder(s) to run sensible analyses, I think this will be the next step. Without this command line structure it would of course still be possible to put pyrsa into your own Docker image, but without these workflows it would not be much help to have any standardized containers, I guess.

If you already see some interesting use cases for pyrsa to be put into containers right now, it would of course be welcome if you tried it out! Also in general it seems that you are into this container stuff. So if you would like to contribute to that part you are welcome to do so! Just start a pull request with the additions pyrsa would need and we will have a detailed look.

Putting the demos onto a server such that users could try them out without running anything locally is an interesting option. I will talk to some colleagues about whether and how we want to do this. Stay tuned.

best Heiko

JasperVanDenBosch commented 3 years ago

+1 for a pyrsa Bids App

PeerHerholz commented 3 years ago

Ahoi hoi @HeikoSchuett,

thanks for the response and the detailed information, no biggie re time.

Yeah, you're absolutely right concerning workflows, etc., that was also what I was referring to/had in mind. Sorry for not communicating this more clearly. As there are currently no harsh/prominent software dependencies for pyrsa, running a potential analysis in a container could/would be overkill, except for the reproducibility aspect of course.

Re BIDS App and workflows: depending on what you had in mind and how you want to set things up, the BIDS App template is a great starting point and helps to also keep Apps/workflows in a somewhat standardized way. The same holds true for the underlying implementation: most BIDS Apps have a "main" script (i.e. run.py) which basically is the CLI that parses input arguments to dedicated workflows (e.g. for pyrsa this could be RDM computation in certain ROIs, running a model-based searchlight, etc.). For the workflows, folks usually employ nipype as the respective workflow engine. However, there's also pydra now, basically nipype 2.0, which, if you decide on utilizing this workflow framework, should be the way forward.
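As a rough illustration of such a "main" script, here is a minimal sketch of a run.py-style CLI. The positional arguments `bids_dir`, `output_dir` and `analysis_level` plus the `--participant_label` option follow the usual BIDS App convention; the `--workflow` option and the two placeholder functions are hypothetical and do not call any pyrsa code.

```python
import argparse


def _describe(args):
    """Human-readable list of the selected participants."""
    labels = args.participant_label
    return ", ".join(f"sub-{label}" for label in labels) if labels else "all participants"


def run_rdm(args):
    # Hypothetical placeholder: a real workflow would compute RDMs here.
    return f"RDM computation for {_describe(args)}"


def run_searchlight(args):
    # Hypothetical placeholder: a real workflow would run a searchlight here.
    return f"searchlight for {_describe(args)}"


# Maps the (hypothetical) --workflow choice to the function that runs it.
WORKFLOWS = {"rdm": run_rdm, "searchlight": run_searchlight}


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a BIDS App entry point")
    parser.add_argument("bids_dir", help="input BIDS (derivatives) dataset")
    parser.add_argument("output_dir", help="where results are written")
    parser.add_argument("analysis_level", choices=["participant", "group"])
    parser.add_argument("--participant_label", nargs="+", default=None,
                        help="subject labels without the 'sub-' prefix")
    parser.add_argument("--workflow", choices=sorted(WORKFLOWS), default="rdm")
    return parser


def main(argv=None):
    """Parse the arguments and dispatch to the selected workflow."""
    args = build_parser().parse_args(argv)
    return WORKFLOWS[args.workflow](args)


# Example call with an explicit argument list instead of sys.argv.
print(main(["/data", "/out", "participant", "--participant_label", "001"]))
```

In a container, a script like this would typically be the entrypoint, so that everything after the image name is passed straight to the parser.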

Overall, once things are ready, putting everything in a container shouldn't be that much of a hassle as, again, there are no harsh dependencies that would need to be included in the container (e.g. FreeSurfer, etc.). However, even that should be possible without major problems IMHO. At the current stage of development, the containers would mainly target the tests and the demos, given the above-outlined advantages. As said, the demos already seem to work, and changing the test workflows to utilize containers would also be possible. The changes I have made in my fork and dedicated branch are minimal, as only the Docker and/or Singularity build recipes are needed. Besides those, I also added the neurodocker script that generates these files, so everything should be reproducible and straightforward (as mentioned above). If you want, I'm happy to push those changes. Going forward, those files can just be updated and adapted given the latest developments and changes of pyrsa.
On the other hand, if y'all want to wait till a potential release: no biggie as well, whatever works and I'm happy to help where I can.

Cheers, Peer

limbicode214 commented 3 years ago

+1 rsatoolbox BIDS app. Would be awesome