satra opened this issue 6 years ago
@satra - where are the instructions for the fsl&heudiconv container? I was thinking that building an image might be nice (and not only creating a Dockerfile), but I can "cheat" and use existing layers.
@djarecka - here you go:
# section 1
docker run --rm kaczmarj/neurodocker:master generate singularity \
--base neurodebian:latest --pkg-manager apt \
--install graphviz git wget \
--miniconda \
conda_install="python=3 pytest graphviz pip reprozip reprounzip \
requests rdflib fuzzywuzzy python-levenshtein pygithub pandas" \
pip_install="owlready2 pybids duecredit \
https://github.com/incf-nidash/PyNIDM/archive/a90b3f47dbdafb9504f13a3a8d85fdff931cc45c.zip" \
create_env="section1" \
activate=true \
--run-bash "cd /opt && \
git clone https://github.com/incf-nidash/PyNIDM.git" > Singularity
# section 2/3
docker run --rm kaczmarj/neurodocker:master generate singularity \
--base neurodebian:stretch-non-free --pkg-manager apt \
--install fsl-5.0-core fsl-mni152-templates \
--install make gcc sqlite3 libsqlite3-dev python3-dev \
libc6-dev python3-pip python3-setuptools python3-wheel \
--run "pip3 install --system reprozip reprounzip" \
--add-to-entrypoint "source /etc/fsl/5.0/fsl.sh" > Singularity
docker run --rm kaczmarj/neurodocker:master generate singularity \
--base neurodebian:latest --pkg-manager apt \
--install pigz python3-pip python3-traits python3-scipy \
python3-setuptools python3-wheel python3-networkx dcm2niix \
--install make gcc sqlite3 libsqlite3-dev python3-dev libc6-dev \
--run "pip3 install --system nipype \
https://github.com/mvdoc/dcmstack/archive/bf/importsys.zip \
https://github.com/nipy/heudiconv/archive/master.zip \
reprozip reprounzip" > Singularity
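Note that all three commands above redirect to the same filename, `> Singularity`, so each recipe should be built into an image before generating the next. A sketch of that step (the image names are my choice, and this assumes Singularity 2.x-era build syntax):

```shell
# Build the image from the recipe just generated, before the next
# "generate singularity" command overwrites ./Singularity.
sudo singularity build section1.simg Singularity
```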
I would add one Docker image to compare and use in the container lesson, e.g. the second image, so the neurodocker command is:
docker run --rm kaczmarj/neurodocker:master generate docker \
--base neurodebian:stretch-non-free --pkg-manager apt \
--install fsl-5.0-core fsl-mni152-templates \
--install make gcc sqlite3 libsqlite3-dev python3-dev \
libc6-dev python3-pip python3-setuptools python3-wheel \
--run "pip3 install --system reprozip reprounzip" \
--add-to-entrypoint "source /etc/fsl/5.0/fsl.sh" > Dockerfile
And one T1w image would be great, e.g. ds000114/sub-01/ses-test/anat/sub-01_ses-test_T1w.nii.gz, but it could really be any image; I just want to use it as an example for the bet command.
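A hypothetical follow-up (the image tag, mount point, and bet options are my assumptions, not from the thread): build the generated Dockerfile, then run FSL's bet on the suggested T1w image from the current directory.

```shell
# Build the image from the Dockerfile generated above (tag is arbitrary).
docker build -t repro-fsl .
# Run bet on the T1w image, mounting the working directory into the container.
docker run --rm -v "$PWD":/data repro-fsl \
  bet /data/sub-01_ses-test_T1w.nii.gz /data/sub-01_T1w_brain -f 0.5
```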
i will be cutting a new neurodocker release this week after i add more examples.
by the way, in general i recommend running neurodocker with docker without -i/--interactive or -t/--tty. i run it with docker run --rm kaczmarj/neurodocker:master ....
another minor point, pre-compiled reprozip wheels can be installed with pip
now (see https://github.com/ViDA-NYU/reprozip/issues/224).
@kaczmarj - it did not work the day before yesterday when i tried. pip complained about compiling, which is why a bunch of those additional dependencies were added.
@mjtravers, @yarikoptic - any chance you can take a look at this issue today? it would be good to cut a VM today or tomorrow if possible, to have people play with it before we ask students to download it.
@kaczmarj - i've updated the commands above without the -i. (@djarecka - it may be useful to cover the utility of -i and -t for docker in additional slides.)
I am running a VM build now that incorporates the section 1, 3, and 4 instructions above and from Al. Will send out a notice when it is posted for downloading and review.
I have posted an updated VM: https://training.repronim.org/repronim-training-v0.2.ova
There are 2 conda environments set up, named "section1" and "section4".
For section3, the kaczmarj/neurodocker:master image has been pulled into Docker and the following files are in the home directory:
Thanks @mjtravers. I could have misunderstood @satra, but I thought that we include the singularity/docker images inside, so people don't spend time building them.
@mjtravers - I'll wait for the next version of VM and will test the new conda environments
The next version will be out later tonight. I discovered a couple of issues after I ran the build. The build is also taking a bit longer. I will message when ready
The updated VM is now available. I believe this version has everything for sections 1, 3, and 4. Section 2 is still in the works.
Download: https://training.repronim.org/repronim-training.ova The VM size is now ~10GB
@mjtravers - that seems large for what it contains. i'll see if i can download and check it out.
for conda are you clearing out the environments post install? example: https://github.com/kaczmarj/neurodocker/blob/master/examples/nipype_tutorial/Dockerfile#L117
also for any apt-get are you using eatmydata or some such? example: https://github.com/kaczmarj/neurodocker/blob/master/examples/nipype_tutorial/Dockerfile#L26
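A sketch of those two slimming steps (assuming conda is on PATH and root access in the VM; flags follow the pattern in the linked neurodocker examples):

```shell
# Remove conda package tarballs, index caches, and unused packages.
conda clean --yes --all
# Remove downloaded .deb archives and apt package indexes.
apt-get clean
rm -rf /var/lib/apt/lists/*
```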
also is the vagrant file somewhere? it may be slightly easier for me to build it than download it on my flight :)
@satra No, I'm not doing any slimming of the file, so I am sure there are a few GBs we could shave off. I am using packer and building off a baseline Ubuntu Desktop ova. I can post the files to this git repo and the baseline ova to training.repronim.org. Let me set it up.
don't know if that helps, but just checked the size of the environments:
(section4) vagrant@nitrcce:~$ du -hs /home/vagrant/miniconda2/envs/*
361M /home/vagrant/miniconda2/envs/section1
65M /home/vagrant/miniconda2/envs/section3
1.3G /home/vagrant/miniconda2/envs/section4
so together around 1.7 GB
as soon as @jbpoline shares the notebooks, we can bring down the size of section 4. i'm sure all he needs is jupyter pandas seaborn scipy (and their dependencies) and maybe statsmodels :)
let me know when the packer file is available. i'll try to build it on our cluster remotely. it's going to take the rest of my flight to download that ova!
you can also save about 200 MB by installing jupyter-notebook as notebook instead of the entire jupyter package. and if you need jupyterlab, you can install that from conda-forge as jupyterlab. the jupyter package installs big dependencies like qt5, which probably are not necessary for the vm.
I've tested the section1 by running:
cd workspace/Indiv_Diffs_ReadingSkill/
~/PyNIDM/bin/BIDSMRI2NIDM.py -d ~/workspace/Indiv_Diffs_ReadingSkill
cd ~/nidm-training/
python rdf-age-query.py -nidm ~/workspace/Indiv_Diffs_ReadingSkill/nidm.ttl
and I got:
sub-12 - 2.096 - http://purl.org/nidash/nidm#_998c5d57-6a64-11e8-9b22-080027d6419f
sub-14 - 3.176 - http://purl.org/nidash/nidm#_998c5d5f-6a64-11e8-9b22-080027d6419f
sub-01 - 1.726 - http://purl.org/nidash/nidm#_998c5d2b-6a64-11e8-9b22-080027d6419f
sub-21 - -2.364 - http://purl.org/nidash/nidm#_998c5d7b-6a64-11e8-9b22-080027d6419f
...
So I didn't get any error, but I'm not sure if this is the proper output; it should be "subject IDs, age of each subject, and the assessment ID" (the ages look pretty low to me for reading skills, but I didn't read anything about the experiment)
For the section 4, I only tested a few things: importing pandas, numpy, opening jupyter notebook and lab. It seems to be working fine now.
I have a pull request in containing the VM build scripts.
I will take a look at those tutorials later today and see how much we can slim down the file.
There is a cleanup.sh script in place to add size-reducing code. Now that the code is out there, feel free to edit.
@djarecka: My last pull request for PyNIDM had some changes to BIDSMRI2NIDM, but I didn't explicitly copy the tool from its development location (https://github.com/incf-nidash/PyNIDM/tree/master/nidm/experiment/tools) to the bin folder, so those copies may be out of date.
I'll take a look...
@dbkeator - so the output should be different?
Hi Dorota, output from the query looks fine based on the data that is in the included participants.tsv. The students will be generating their own participants.tsv file as part of the hands-on work…
@djarecka @mjtravers Hi Folks, so I tried to replicate what Dorota did with the latest OVA:
cd workspace/Indiv_Diffs_ReadingSkill/
~/PyNIDM/bin/BIDSMRI2NIDM.py -d ~/workspace/Indiv_Diffs_ReadingSkill
cd ~/nidm-training/
python rdf-age-query.py -nidm ~/workspace/Indiv_Diffs_ReadingSkill/nidm.ttl
In the OVA, first, I received an error that PyNIDM wasn't installed. So I issued the following command: python ~/PyNIDM/setup.py install
Then I received an error that pybids wasn't installed. So, I issued the following command:
cd ~
git clone https://github.com/INCF/pybids.git
cd pybids
python setup.py install
Then, I received an error that urllib.parse doesn't have a module named quote. This was a curious error because urllib comes with python... which led me to the biggest problem: we have installed python 2.7 via miniconda. PyNIDM was written for python 3.x, hence the current problem with the urllib package and likely many other downstream problems.
So, I couldn't test the things Dorota did.
@djarecka How did you test section1 with the current OVA file given the python 2.7 install?
Thanks!
@dbkeator - I did everything using the conda environment created by Matt, so the first thing I did was source activate section1. Sorry, I should have included it in my post.
That is the environment that is purely for this part, so it should have everything, and if not, @mjtravers should know.
@jgrethe - thank you for the confirmation!
@djarecka Got it, that worked. Appears the query is also working. I didn't realize the ages were funky....
BTW, one personally annoying issue for me is that the Win+number shortcuts are configured as shortcuts to various heavy applications such as libreoffice
i'm having some trouble importing the ova on our older virtualbox on our cluster. can someone verify the md5sum below?
$ md5sum repronim-training.ova
bad86aace872ed38c46f9e30c9d86d62 repronim-training.ova
(i can't test on osx as it will take me 4 days to download under my current connection)
@djarecka verified that the above md5sum is correct.
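One convenient way to do such a verification (a sketch; the checksum-list filename is my choice) is md5sum's check mode, which reads "checksum  filename" lines and verifies each listed file:

```shell
# Write the expected checksum next to the filename, then verify.
echo "bad86aace872ed38c46f9e30c9d86d62  repronim-training.ova" > repronim-training.ova.md5
md5sum -c repronim-training.ova.md5
```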
and in case others have the same import appliance issue on some flavor of linux/virtualbox combo, this link helps solve it: http://installfights.blogspot.com/2018/05/how-to-fix-virtualbox-error-when-you.html
@satra @mjtravers @kaczmarj
yes - absolutely - there is way too much stuff there - working on this right now and should have an update on what exactly is needed (Satra's list sounds right). @cmtgreenwood has uploaded two R scripts that we should test in the VM as well
@jbpoline - i'm not really an R user, so I might be doing something wrong, but I tried to open the script multiTesting.Rmd in RStudio and run it, and it returns errors starting with "there is no package called 'knitr'". I believe @mjtravers didn't have the list of required R packages.
@djarecka you are right: I think we need the R libraries knitr, rmarkdown, mvtnorm, and ggplot2
@cmtgreenwood do you confirm? I suppose we can always extract the R code from the Rmd files, which should run on the VM - but we do need mvtnorm and ggplot2, right? @mjtravers: would it be hard to include these R libraries in the VM?
@jbpoline - my understanding was that r-studio (which is installed) can open Rmd and can run the specific script cells. This is what I tried and that's how I got the package errors.
hum - I am no R person, but it looks like we need these R "libraries" (the equivalent of python packages)
@jbpoline yes, i only tried to say that we don't need to "extract the R code from Rmd".
@djarecka @jbpoline I am able to load those R packages onto the system and have them added to the VM build process. Kinda sure I am loading them right.
I have a build going right now that will include the above R packages in r-base... plus the section 3 stuff for Yarik.
The build will be done and posted for download later this evening.
@satra .... plus I added the clean up for apt and conda referred to above. Will see if it reduces the VM size any with the in-progress build
@mjtravers - thank you - let's see what this does. it would be nice if packer did some kind of pre-post step assessment of size. may give us an indication of where things are piling up. my internal calculations indicate this VM should not exceed much more than 5-6G as an ova.
my wifi here is quite insufficient, so i'm trying to figure out how to run things remotely on our cluster.
@satra I'll give this a try: http://www.netreliant.com/news/8/17/Compacting-VirtualBox-Disk-Images-Linux-Guests.html
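The recipe from that link boils down to two steps (a sketch, with an assumed disk path; the compaction works on VDI disks, so an OVA has to be unpacked/cloned first, as noted below):

```shell
# Inside the guest: overwrite free space with zeros so VirtualBox can
# reclaim it, then remove the filler file and shut down.
dd if=/dev/zero of=/zerofill bs=1M || true   # dd "fails" once the disk is full
rm -f /zerofill
sudo shutdown -h now

# On the host: compact the zeroed virtual disk.
VBoxManage modifyhd /path/to/disk.vdi --compact
```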
@satra Success on compacting the OVA. Your calculations are correct, the size of the file is 5.25GB.
I have posted it to the training website. Note the new name (resulting from needing to clone the original OVA file as part of the compaction process):
https://training.repronim.org/reprotraining.ova
This file has the R libraries and section 2 python packages included.
This file does not have the section 3 edits posted this morning. I have pulled those changes and they'll go in the next build.
Thanks @mjtravers! This image is still not expected to have datalad inside, is that right?
@djarecka Datalad is installed in conda env section2, version 0.10.0-rc5.
@jbpoline @cmtgreenwood
I opened RStudio again and tried to run the scripts. multiTesting.Rmd returns an error since it has some path set to C:/CeliaFiles.... The type1 script didn't return any errors, but I did not even try to validate whether the output/plots are good.
@mjtravers @djarecka @satra - i released neurodocker version 0.4.0.
docker pull kaczmarj/neurodocker:0.4.0
yes, the knitr and markdown packages are needed to assemble a nice report. However, the R script parts could be run without this, so it may be much easier if I rewrite the scripts as plain text. So I will fix the path information, probably tomorrow, and upload another version. I will create plain-text versions (i.e. *.R files) at the same time that do not need knitr and markdown.
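If R is available, knitr::purl() is the canonical way to extract the code chunks from an Rmd file; when it is not (as in the knitr-less VM discussed above), a rough sed fallback can pull out just the lines between the chunk fences (a sketch; the filenames are examples):

```shell
# Print only lines inside ```{r} ... ``` chunks, dropping the fence lines.
sed -n '/^```{r/,/^```$/{/^```/d;p}' multiTesting.Rmd > multiTesting.R
```

This loses chunk options and inline R expressions, so it is only a stopgap until the proper *.R versions exist.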
@cmtgreenwood Celia: I moved your scripts into section4/section41 and my notebook into section4/section42. I think Matt has included RStudio and the libraries needed, so I'm not sure we need to extract the R code, but I haven't checked yet!
@jbpoline @cmtgreenwood Yes, the following R packages were installed on the VM:
... and R-Studio
@mjtravers - replacing issue #12 - just check off all the things you already have.
- Core VM
- general (see each section for additional local installs (listed under install))
- FAIR Data - BIDS datasets
- Computational basis
- Neuroimaging Workflows (docker pull kaczmarj/neurodocker:master) - will ask @kaczmarj to cut a new release
- Statistics for reproducibility
- Others