Download datasets for analysis

ME-ICA / tedana-comparison

Comparison of implementations of multi-echo fMRI denoising pipelines across datasets.

GNU General Public License v2.0



Open tsalo opened 6 years ago

tsalo commented 6 years ago

Per discussion on Gitter and in the tedana testing Google Doc, we want to download a set of datasets from OpenNeuro that span a range of acquisition parameters, so we can compare results across pipeline implementations. Here is a list of datasets we can use (to be updated as new ones are added to OpenNeuro):

dataset	echoes	TR (s)	field strength (T)	voxel size (mm)	multiband (MB)	task
ds000210	3	2.00	3	3.75x3.75x3.80	N	rest/event
ds000254	4	4.00	3	3.00x3.00x3.00	4	block
ds000258	4	2.47	3	3.75x3.75x4.40	N	rest
ds001491	3	2.45	3	3.50x3.50x3.50	N	event
CamCAN	5	2.47	3	3.00x3.00x4.44	N	film
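
For scripting, the table above could be captured as a small data structure. This is only a sketch: the `Dataset` class and the `select` helper are hypothetical names, and a multiband entry of `N` is encoded as `None`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Dataset:
    name: str
    echoes: int
    tr: float  # repetition time in seconds
    tesla: int  # field strength
    voxel: Tuple[float, float, float]  # voxel size in mm
    multiband: Optional[int]  # None where the table lists "N"
    task: str


# Values copied from the table above.
DATASETS = [
    Dataset("ds000210", 3, 2.00, 3, (3.75, 3.75, 3.80), None, "rest/event"),
    Dataset("ds000254", 4, 4.00, 3, (3.00, 3.00, 3.00), 4, "block"),
    Dataset("ds000258", 4, 2.47, 3, (3.75, 3.75, 4.40), None, "rest"),
    Dataset("ds001491", 3, 2.45, 3, (3.50, 3.50, 3.50), None, "event"),
    Dataset("CamCAN", 5, 2.47, 3, (3.00, 3.00, 4.44), None, "film"),
]


def select(datasets, echoes=None, multiband=None):
    """Return the names of datasets matching the requested criteria.

    echoes: keep only datasets with exactly this many echoes.
    multiband: True keeps multiband datasets, False keeps the rest.
    """
    chosen = []
    for ds in datasets:
        if echoes is not None and ds.echoes != echoes:
            continue
        if multiband is not None and (ds.multiband is not None) != multiband:
            continue
        chosen.append(ds.name)
    return chosen


print(select(DATASETS, multiband=True))
print(select(DATASETS, echoes=3))
```

Keeping the parameters in one place like this would make it easy to check that whichever subset we run still covers the range of echoes, TRs, and task types.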

Outstanding questions:

How can we download the data programmatically?
Should we download and run a single subject/run/task or multiple from each dataset?

tsalo commented 6 years ago

It looks like we can use nistats to download data from OpenNeuro!
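
Whichever downloader we end up using, picking a single subject/run out of a dataset mostly comes down to filtering a file listing before fetching anything. Here is a rough, standard-library-only sketch of that filtering step. It does not call nistats, and the `filter_paths` helper and the example paths are hypothetical (made-up BIDS-style names, not a real listing of any dataset).

```python
from fnmatch import fnmatch


def filter_paths(paths, include=(), exclude=()):
    """Keep paths matching every include pattern and no exclude pattern."""
    kept = []
    for path in paths:
        if not all(fnmatch(path, pat) for pat in include):
            continue
        if any(fnmatch(path, pat) for pat in exclude):
            continue
        kept.append(path)
    return kept


# Hypothetical listing, for illustration only.
listing = [
    "ds000000/sub-01/func/sub-01_task-rest_echo-1_bold.nii.gz",
    "ds000000/sub-01/func/sub-01_task-rest_echo-2_bold.nii.gz",
    "ds000000/sub-01/anat/sub-01_T1w.nii.gz",
    "ds000000/sub-02/func/sub-02_task-rest_echo-1_bold.nii.gz",
]

# Keep only the functional (BOLD) files for one subject.
one_subject = filter_paths(listing, include=["*sub-01*", "*_bold.nii.gz"])
print(one_subject)
```

Filtering first means the "single subject vs. multiple" question only changes the patterns, not the download code.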

emdupre commented 5 years ago

Nice! Coming back to this, do you think we should test on ds000254 as our third test in tedana? Since we're using CircleCI workflows we can run all of our datasets in parallel, and it seems the most distinct from what we have already!

EDIT: I'm actually not sure that would be a good idea, since it's also pCASL. @handwerkerd is likely to have a MB dataset we could use, though.

handwerkerd commented 5 years ago

Something with MB would be good, but I don't think I have any multi-band multi-echo data from an actual study sitting around. I've just piloted a few sequence options with this. I plan to collect some of these data in the near-ish future, but if you want a dataset now, you'll need to get it from somewhere else.

tsalo commented 5 years ago

@emdupre I think that ds000254 would be a good choice, largely because it includes a well-characterized task. We can use motor cortex ROIs to look at betas across pipelines. On the other hand, ds001491 uses an IAPS event-related task, so we could do the same thing with a V1 seed if the simultaneous pCASL is an issue.
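
To make the ROI idea concrete: for each pipeline, average the task betas over the voxels in the ROI mask and compare the averages. This is a toy sketch, assuming the beta maps and the mask have already been flattened to equal-length lists. `roi_mean`, the pipeline labels, and the numbers are all made up for illustration.

```python
from statistics import mean


def roi_mean(betas, mask):
    """Average the beta values at voxels where the mask is truthy."""
    values = [b for b, m in zip(betas, mask) if m]
    if not values:
        raise ValueError("ROI mask selects no voxels")
    return mean(values)


# Made-up numbers, not real results.
mask = [0, 1, 1, 0, 1]
betas_by_pipeline = {
    "pipeline_a": [0.1, 1.0, 2.0, 0.3, 3.0],
    "pipeline_b": [0.2, 1.5, 2.5, 0.1, 3.5],
}

# One summary value per pipeline, then a simple pairwise difference.
roi_betas = {name: roi_mean(b, mask) for name, b in betas_by_pipeline.items()}
difference = roi_betas["pipeline_b"] - roi_betas["pipeline_a"]
print(roi_betas, difference)
```

The same summary would work with a V1 mask for the ds001491 event-related task.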

tsalo commented 5 years ago

We can also request access to the Cam-CAN film viewing dataset and treat it like resting-state data. That dataset has five echoes and (I believe) a large number of participants.

tsalo commented 4 years ago

I was given access to the Cam-CAN dataset. There are 649 participants ranging in age from 18.5 to 88.9 years, with a mean age of 54.8 years. Each has one run of five-echo film-viewing data. My understanding of the usage requirements, based on the uses I specified in my application, is that we can use it for tedana validation/comparison-type papers as long as we include the proper acknowledgements and citations.