nmind / hackathon2021

Issues and notes for the May 2021 NMinD hackathon

Organize a benchmark dataset for fMRI analysis #4

Open gkiar opened 3 years ago

gkiar commented 3 years ago

Ideally including:

gkiar commented 3 years ago

Related to https://github.com/nmind/hackathon2021/issues/3

poldrack commented 3 years ago

the NARPS dataset (https://openneuro.org/datasets/ds001734/versions/1.0.5) might be good for this...

gkiar commented 3 years ago

Thanks, @poldrack!

@TingsterX, is there a publicly available primate fMRI dataset that could be used for benchmarking tools?

When thinking about #3, which focuses specifically on benchmarks for skull extraction on structural data, does anybody know of an existing collection with ground-truth segmentations for either humans or primates? I may push this question to the Twitter-verse...

gkiar commented 3 years ago

Also, cc: @hough129

audreymhoughton commented 3 years ago

I have already started building a test dataset from the studies listed below, with a few subjects each, for a separate (but possibly related) purpose. These datasets have, or will have, BIDS inputs, processed outputs from the abcd-hcp-pipeline, and derivatives (we haven't yet decided exactly what those derivatives will include).

These currently live on Box.

ABCD (one for each scanner type - two scanner types are ready)
HBN (5 subjects - almost ready - still uploading processed)
PNC (not processed yet - need to modify pipeline)
HCP-D (ready to go - two subjects)
NKI-Rockland (have not processed yet - need BIDS inputs)

gkiar commented 3 years ago

@hough129 for NKI data: http://fcon_1000.projects.nitrc.org/indi/enhanced/access.html

arueter1 commented 3 years ago

Temporary storage location: S3 bucket. Goal for the future storage location: a LORIS study at MSI at UMN, so that the raw DICOMs, derivatives, and subject data are easier to download whenever we want to share them. This won't be ready until later this year (Oct 2021?).

Action Items: Are the datasets that Audrey listed above public? Any use considerations? QSIPrep still needs to be run on some diffusion data.

TingsterX commented 3 years ago

For NHPs, check out PRIME-DE (https://fcon_1000.projects.nitrc.org/indi/indiPRIME.html). The Oxford dataset has 20 macaques (~50 min per monkey); UC Davis has 19 macaques, with shorter fMRI scans but higher-resolution T1 and T2 images.

For the brain extraction dataset, Cameron openly released one manually edited human dataset: https://academic.oup.com/gigascience/article/5/1/s13742-016-0150-5/2737425?login=true

We recently published a tool that uses a transfer-learning framework: a U-Net model trained on the human dataset and then upgraded with the macaque data. It also works for other species, e.g. chimps, marmosets, and pigs. https://github.com/HumanBrainED/NHP-BrainExtraction

gkiar commented 3 years ago

Also adding @engfranco to the thread

engfranco commented 3 years ago

Folks, let me know if you have any questions about the NKI-Rockland or HBN datasets. If you need data from 5 good subjects in the NKI-Rockland dataset, I recommend these 5, which have low motion:

sub-A00056703/ses-BAS1
sub-A00055906/ses-BAS1
sub-A00075732/ses-BAS1
sub-A00034073/ses-BAS1
sub-A00063006/ses-BAS1

Links to the S3 bucket of the whole imaging dataset organized in BIDS can be seen here: http://fcon_1000.projects.nitrc.org/indi/enhanced/aws_links.csv

arueter1 commented 3 years ago

Thanks Alexandre. One quick thing: @engfranco I'm not sure that we should have subject IDs out on a public-facing website. Maybe we can share those internally with this team somehow (maybe via email?).

gkiar commented 3 years ago

For skull stripping, this looks awesome: http://preprocessed-connectomes-project.org/NFB_skullstripped/index.html

gkiar commented 3 years ago

Also from @TingsterX :

chimps: https://www.chimpanzeebrain.org
marmoset: https://brainatlas.brain.riken.jp/marmoset_html

engfranco commented 3 years ago

@arueter1 no need to worry. All of these subject IDs are already public-facing IDs and are available to anyone accessing the NKI-RS website. I'm not sharing anything that isn't already in the public domain. We have internal IDs for these participants as well.

audreymhoughton commented 3 years ago

@engfranco, regarding the S3 bucket linked from http://fcon_1000.projects.nitrc.org/indi/enhanced/aws_links.csv: is there anything I need to do to be able to access it?

gkiar commented 3 years ago

@hough129 I believe it is public; you should be able to access it by passing the --no-sign-request flag to aws s3.
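
For anyone scripting this, here is a minimal sketch of how the anonymous download could be set up for the five low-motion NKI-Rockland sessions recommended earlier in the thread. It only builds and prints the `aws s3 sync --no-sign-request` commands; it does not run them. `BUCKET_PREFIX` is a hypothetical placeholder, not the real bucket path (take the actual prefix from aws_links.csv), and `sync_command` and the `nki_subset` destination folder are names made up for illustration.

```python
import shlex

# ASSUMPTION: placeholder only. Replace with the real prefix from aws_links.csv.
BUCKET_PREFIX = "s3://example-bucket/path/to/bids"

# Low-motion NKI-Rockland sessions recommended earlier in this thread.
SESSIONS = [
    "sub-A00056703/ses-BAS1",
    "sub-A00055906/ses-BAS1",
    "sub-A00075732/ses-BAS1",
    "sub-A00034073/ses-BAS1",
    "sub-A00063006/ses-BAS1",
]


def sync_command(session, dest="nki_subset", prefix=BUCKET_PREFIX):
    """Return the argv list for an unsigned `aws s3 sync` of one session."""
    return [
        "aws", "s3", "sync", "--no-sign-request",
        f"{prefix}/{session}",  # remote BIDS session folder
        f"{dest}/{session}",    # local copy, keeping the sub-*/ses-* layout
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run.
    for session in SESSIONS:
        print(shlex.join(sync_command(session)))
```

Printing the commands rather than executing them keeps the sketch safe to run before the real bucket path is filled in; the printed lines can then be pasted into a shell, or each argv list passed to `subprocess.run`.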

gkiar commented 3 years ago

https://www.researchgate.net/publication/323955049_The_UNCUMN_Baby_Connectome_Project_BCP_An_overview_of_the_study_design_and_protocol_development

ltetrel commented 3 years ago

Our old document on selecting OpenNeuro datasets: https://docs.google.com/document/d/16xjAPvcbFs1dWFozvpwpoAky8JWmmfDFppDctoNIWrc/edit#heading=h.a8bx6kg8xh6y