perhaps we can chat about this early this week, specifically with respect to the examples in the following plan:
Introduction to reproducible neuroimaging: motivations
David Kennedy, University of Massachusetts, United States
8:30-10:00
FAIR Data - BIDS datasets
Jeffrey Grethe [presenting] and Maryann Martone, UCSD, United States
talk 1: Intro to FAIR
exercise: 16 attributes of FAIR - e.g., is there a clear license? what is a PID? what is meant by metadata? …
link attributes for 2 modules below
talk 2: Standardization and BIDS
exercise: DICOM to BIDS conversion - basic conversion (tie in w/ ReproIn in next section)
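A minimal sketch of what this conversion exercise could look like, assuming heudiconv with its reproin heuristic and dcm2niix are available (paths and the subject label are placeholders):

```sh
# Convert a DICOM directory into a BIDS dataset using the ReproIn heuristic
# /data/dicoms and /data/bids are placeholder paths
heudiconv \
    --files /data/dicoms \
    -s 01 \
    -f reproin \
    -c dcm2niix \
    -b \
    -o /data/bids
```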
talk 3: FAIR Metadata: searching and using SciCrunch
exercise: BIDS metadata - participants.tsv and semantic annotation
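A possible sketch for this exercise: inspect participants.tsv, then add a participants.json sidecar that describes its columns (semantic term URLs could be layered on once we settle on vocabularies; the columns and descriptions below are illustrative only):

```sh
# Look at the phenotypic table shipped with the BIDS dataset
head /data/bids/participants.tsv

# Describe its columns in a participants.json sidecar
cat > /data/bids/participants.json <<'EOF'
{
  "age": {
    "Description": "Age of the participant at the time of scanning",
    "Units": "years"
  },
  "sex": {
    "Description": "Self-reported sex of the participant",
    "Levels": {"M": "male", "F": "female"}
  }
}
EOF
```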
talk 4: Brief Intro to NIDM
exercise: NIDM conversion tool to create sidecar file
10:00-10:15 coffee break
10:15-11:45
Computational basis
Yaroslav Halchenko, Dartmouth College, United States, and Michael Hanke, Magdeburg, Germany
talk 1: ReproIn: More on this?
Exercise:
talk 2: Git/GitAnnex/DataLad:
Exercise:
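Since this exercise is still a placeholder, one possible minimal sketch for the Git/git-annex/DataLad basics (the dataset name, file contents, and source URL are placeholders):

```sh
# Create a new DataLad dataset (a git repo with git-annex set up)
datalad create my-analysis
cd my-analysis

# Add some content and record it in the dataset history
echo "raw notes" > notes.txt
datalad save -m "Add first notes"

# Clone an existing dataset and fetch its file content on demand
datalad install -s <url-of-a-public-dataset> inputs
datalad get -r inputs
```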
talk 3: Everything Else
Exercise:
12:00-13:00 Lunch
13:00-14:30 Neuroimaging Workflows
Dorota Jarecka and Satrajit Ghosh, MIT, United States; Camille Maumet, INRIA, France
talk 1: ReproFlow: Reusable scripts and environments, PROV
Exercise: Run, rinse, and repeat
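One way to flesh out "run, rinse, and repeat" with provenance capture; the smoothing command and file names are placeholders:

```sh
# Execute a command under DataLad's provenance tracking:
# inputs/outputs are declared and the exact call is recorded in git history
datalad run \
    -m "Smooth the anatomical image" \
    --input  "sub-01/anat/sub-01_T1w.nii.gz" \
    --output "derivatives/sub-01_T1w_smoothed.nii.gz" \
    "some_smoothing_command {inputs} {outputs}"

# "Rinse and repeat": re-execute the last recorded command from its commit
datalad rerun
```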
talk 2: ReproEnv: Virtual machines/containers; ReproPaper, NIDM components
Exercise: Create different environments
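A possible sketch for building a distinct environment, assuming Docker; the base image and installed packages are placeholders, not final choices:

```sh
# Write a minimal Dockerfile (software choices are placeholders)
cat > Dockerfile <<'EOF'
FROM neurodebian:stretch
RUN apt-get update && apt-get install -y --no-install-recommends python3
EOF

# Build and tag the environment, then check what is inside it
docker build -t repro-env:0.1 .
docker run --rm repro-env:0.1 python3 --version
```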
[talk 3: ReproTest: Variability sources (analysis models, operating systems, software versions)]
Exercise: Run analysis with different environments
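And a sketch for the comparison step, assuming a second image variant (repro-env:0.2) built the same way with different software versions; my_analysis is a placeholder command:

```sh
# Run the same analysis in two environments and compare the outputs
for tag in 0.1 0.2; do
    docker run --rm -v "$PWD/data:/data" repro-env:"$tag" \
        my_analysis /data/input.nii.gz /data/output-"$tag".nii.gz
done
md5sum data/output-*.nii.gz   # differing checksums expose environment effects
```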
14:30-14:45 Break
14:45-16:00 Statistics for reproducibility
Celia Greenwood, McGill University, Canada and Jean-Baptiste Poline, McGill University, Canada
Assumes we have a CSV file with, say, 100 subjects and columns like "age, sex, pheno1, pheno2, …"
talk 1: evil p-values: what they are - and are not
Exercise: test with
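To make the p-value point concrete with the CSV above (numbers are illustrative): under H0 the p-values are uniform, so

$$\Pr(p \le \alpha \mid H_0) = \alpha \quad\Rightarrow\quad \mathrm{E}[\#\text{false positives}] = m\,\alpha ,$$

i.e. with m = 20 null phenotype columns tested at alpha = 0.05 we expect about one spurious "significant" hit by chance alone; a Bonferroni correction would test each at 0.05/20 = 0.0025.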
talk 2:
Exercise:
talk 3:
Exercise:
16:00-16:30 Conclusion & Getting Feedback
Nina Preuss, Preuss Enterprises, United States
i think we need to clarify and enhance each exercise within the next week or two, and have multiple people go through the exercises well before the session.
with respect to images, perhaps we can do either:
a. several small images for each task (the granularity of the task can be established separately)
b. one single image for everything
(a) is my current preference since it associates a small reusable component with each task and allows easier maintenance of the images as software pieces change.
@satra I was aiming to fill the void of Yarik's and my exercises first.
re images: I am going for small ones (i.e. A). I see no advantage of B.
@mih - sounds good to me. i think there is some amount of redoing across exercises. just wanted to make sure we have a coherent picture.
we will try to finish the exercises for section 3 this coming week together with the talk outlines.
I see one disadvantage of A - we might end up with people who are running multiple containers at the same time and executing things in the wrong one.
@djarecka If you take a look at the latest demo script, you can see how much container selection people would have to do in the datalad world:
https://github.com/mih/ohbm2018-training/blob/master/fsl_glm_w_amazing_datalad.sh
Pretty much none. One step, one dataset, one container.
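To illustrate (this is NOT the content of the linked script, just a generic sketch of the pattern, assuming the datalad-container extension; the container name, image URL, and script path are placeholders):

```sh
# Register a container with the dataset, then run the analysis through it;
# DataLad picks the container, the user only triggers one step
datalad containers-add mycontainer --url <container-image-url>
datalad containers-run -n mycontainer "code/fsl_glm_script.sh"
```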
@mih - ok, I didn't realize that people will run one script with docker run inside. I will read it carefully and test it this week!
https://myyoda.github.io/module-datalad/03-01-reproin/
Pretty much done.
Just FYI, stop me if this is all wrong.
Stuff is coming in via #4