PennLINC / fw-heudiconv

Heuristic-based Data Curation on Flywheel
BSD 3-Clause "New" or "Revised" License

Add metadata with "provenance" (version) of fw-heudiconv used #102

Open yarikoptic opened 6 months ago

yarikoptic commented 6 months ago

I tried to find a dataset among the OpenNeuro ones that was generated with fw-heudiconv. Either there is none, or fw-heudiconv leaves no annotation of any kind indicating that it was the tool used:

(git)smaug:/mnt/datasets/datalad/crawl/openneuro[master]git
$> grep -i flywheel */* 2>/dev/null
ds004413/README.md:These files were obtained from Flywheel, defaced and run through fMRIPrep. 
ds004450/README.md:These files were obtained from Flywheel, defaced and run through fMRIPrep. 

$> grep -i fw-heudiconv */* 2>/dev/null

FWIW, in heudiconv, since the recent https://github.com/nipy/heudiconv/pull/529, we have been adding metadata on the heudiconv version used:

(git)smaug:/mnt/datasets/datalad/crawl/openneuro[master]git
$> grep -i HeudiconvVer */* 2>/dev/null
ds004274/task-PSAP_bold.json:  "HeudiconvVersion": "0.11.3+d20220512",
ds004331/task-localizer_bold.json:  "HeudiconvVersion": "0.11.3+d20220512",
ds004331/task-main_bold.json:  "HeudiconvVersion": "0.11.3+d20220512",
ds004466/task-rest_bold.json:  "HeudiconvVersion": "0.11.3",
ds004693/task-localizer_bold.json:  "HeudiconvVersion": "0.13.1",
ds004693/task-main_bold.json:  "HeudiconvVersion": "0.13.1",

So we add it for each converted subject/session, and the top-level task-* files then aggregate the metadata fields common to all subjects/sessions, which is why the field shows up at the top level.
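To illustrate the aggregation described above, here is a minimal Python sketch that keeps only the key/value pairs shared by every per-subject sidecar. It is not heudiconv's actual implementation: the function name `common_fields` and the sample sidecar contents are hypothetical and chosen only for illustration.

```python
import json


def common_fields(sidecars):
    """Return the key/value pairs that are identical in every sidecar dict."""
    if not sidecars:
        return {}
    common = dict(sidecars[0])
    for sidecar in sidecars[1:]:
        # Keep a field only if this sidecar has the same key with the same value.
        common = {
            key: value
            for key, value in common.items()
            if key in sidecar and sidecar[key] == value
        }
    return common


# Hypothetical per-subject sidecars for the same task (illustrative values).
sub01 = {
    "TaskName": "rest",
    "RepetitionTime": 2.0,
    "HeudiconvVersion": "0.13.1",
    "AcquisitionTime": "10:01:02",
}
sub02 = {
    "TaskName": "rest",
    "RepetitionTime": 2.0,
    "HeudiconvVersion": "0.13.1",
    "AcquisitionTime": "11:30:00",
}

# Fields that differ between subjects (here AcquisitionTime) are dropped.
top_level = common_fields([sub01, sub02])
print(json.dumps(top_level, indent=2))
```

Because the version string is the same for every subject converted in one run, it survives the intersection and ends up in the top-level file, which is what makes the `grep` above work.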

HeudiconvVersion is not a standard BIDS term. Since then the

Apparently there is already a good number of openneuro datasets which use GeneratedBy:

```shell
$ for f in ds00*/dataset_description.json ; do grep -q GeneratedBy $f && {echo $f; jq .GeneratedBy $f}; done | tee /tmp/generated-bys
ds003653/dataset_description.json
[
  {
    "Name": "heudiconv/reproin",
    "Version": "0.5.4",
    "Container": {
      "Type": "singularity",
      "Url": "https://github.com/ReproNim/containers/blob/master/images/repronim/repronim-reproin--0.5.4.sing"
    }
  },
  {
    "Name": "dcm2niix",
    "Version": "v1.0.20190410 (JP2:OpenJPEG) GCC6.3.0"
  }
]
ds003682/dataset_description.json
[
  {
    "Name": "Custom Python scripts",
    "Description": "Custom python scripts"
  }
]
ds003929/dataset_description.json
[
  {
    "Name": "BIDScoin",
    "Version": "3.6.3",
    "CodeURL": "https://github.com/Donders-Institute/bidscoin"
  }
]
ds004161/dataset_description.json
[
  {
    "Name": "BIDScoin",
    "Version": "3.7.2",
    "CodeURL": "https://github.com/Donders-Institute/bidscoin"
  }
]
ds004401/dataset_description.json
[
  {
    "Name": "petsurfer",
    "Version": "freesurfer-linux-centos7_x86_64-7.3.2-20220804-6354275"
  }
]
ds004512/dataset_description.json
[
  {
    "Name": "Manual"
  }
]
ds004632/dataset_description.json
[
  {
    "Name": "FSL ; ANTs ",
    "Version": "v3.0.18 ; 2.1.0"
  }
]
ds004654/dataset_description.json
[
  {
    "Name": "MathWorks MATLAB niftiwrite",
    "Description": "Used to convert the raw DICOM data to NIfTI format.",
    "CodeURL": "https://www.mathworks.com/help/images/ref/niftiwrite.html"
  },
  {
    "Name": "FreeSurfer's MiDeFace",
    "Version": "7.3.2",
    "Description": "A tool for defacing MRI images in a way that is both minimally invasive and achieves goals of privacy",
    "CodeURL": "https://surfer.nmr.mgh.harvard.edu/fswiki/MiDeFace"
  }
]
ds004697/dataset_description.json
[
  {
    "Name": "dcm2bids",
    "Version": "2.1.6",
    "Description": "Used to convert and organize dicoms into BIDS"
  },
  {
    "Name": "Manual",
    "Description": "Added 'IntendedFor' and 'TaskName' keys to fmap and func .json files, respectively"
  },
  {
    "Name": "pydeface",
    "Version": "2.0.2",
    "Description": "Used to de-face MPRage scans"
  }
]
ds004717/dataset_description.json
[
  {
    "Name": "dcm2bids",
    "Version": "3.0.1",
    "Description": "Used to convert and organize dicoms into BIDS"
  },
  {
    "Name": "pydeface",
    "Version": "2.0.2",
    "Description": "Used to de-face anatomical scans"
  }
]
ds004730/dataset_description.json
[
  {
    "Name": "MathWorks MATLAB niftiwrite",
    "Description": "Used to convert the raw DICOM data to NIfTI format.",
    "CodeURL": "https://www.mathworks.com/help/images/ref/niftiwrite.html"
  },
  {
    "Name": "FreeSurfer's MiDeFace",
    "Version": "7.3.2",
    "Description": "A tool for defacing MRI images in a way that is both minimally invasive and achieves goals of privacy",
    "CodeURL": "https://surfer.nmr.mgh.harvard.edu/fswiki/MiDeFace"
  }
]
ds004731/dataset_description.json
[
  {
    "Name": "MathWorks MATLAB niftiwrite",
    "Description": "Used to convert the raw DICOM data to NIfTI format.",
    "CodeURL": "https://www.mathworks.com/help/images/ref/niftiwrite.html"
  },
  {
    "Name": "FreeSurfer's MiDeFace",
    "Version": "7.3.2",
    "Description": "A tool for defacing MRI images in a way that is both minimally invasive and achieves goals of privacy",
    "CodeURL": "https://surfer.nmr.mgh.harvard.edu/fswiki/MiDeFace"
  }
]
ds004733/dataset_description.json
[
  {
    "Name": "MathWorks MATLAB niftiwrite",
    "Description": "Used to convert the raw DICOM data to NIfTI format.",
    "CodeURL": "https://www.mathworks.com/help/images/ref/niftiwrite.html"
  },
  {
    "Name": "FreeSurfer's MiDeFace",
    "Version": "7.3.2",
    "Description": "A tool for defacing MRI images in a way that is both minimally invasive and achieves goals of privacy",
    "CodeURL": "https://surfer.nmr.mgh.harvard.edu/fswiki/MiDeFace"
  }
]
ds004866/dataset_description.json
[
  {
    "Name": "ezBIDS",
    "Version": "1.0.0",
    "Description": "ezBIDS is a web-based tool for converting neuroimaging datasets to BIDS, requiring neither coding nor knowledge of the BIDS specification",
    "CodeURL": "https://brainlife.io/ezbids/",
    "Container": {
      "Type": "docker",
      "Tag": "brainlife/ezbids-handler"
    }
  }
]
ds004884/dataset_description.json
[
  {
    "Name": "Manual"
  }
]
```
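The entries above all share a simple shape (`Name`, optionally `Version`, `Description`, `CodeURL`, `Container`), so recording provenance amounts to appending one such object to `GeneratedBy` in `dataset_description.json`. The following Python sketch shows one way a converter could do that. It is not code from fw-heudiconv or heudiconv: the helper name `add_generated_by`, the placeholder version `0.0.0`, the sample dataset description, and the use of the repository URL as `CodeURL` are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def add_generated_by(description_path, name, version, code_url=None):
    """Append a GeneratedBy entry to dataset_description.json unless it is already there."""
    path = Path(description_path)
    description = json.loads(path.read_text())

    entry = {"Name": name, "Version": version}
    if code_url:
        entry["CodeURL"] = code_url

    # GeneratedBy is a list of objects; create it if the file does not have one yet.
    generated_by = description.setdefault("GeneratedBy", [])
    if entry not in generated_by:
        generated_by.append(entry)

    path.write_text(json.dumps(description, indent=2) + "\n")
    return description


# Demo against a throwaway file; the dataset name and versions are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    desc_file = Path(tmp) / "dataset_description.json"
    desc_file.write_text(json.dumps({"Name": "Example dataset", "BIDSVersion": "1.8.0"}))

    add_generated_by(
        desc_file, "fw-heudiconv", "0.0.0", "https://github.com/PennLINC/fw-heudiconv"
    )
    # A second call with the same arguments does not add a duplicate entry.
    result = add_generated_by(
        desc_file, "fw-heudiconv", "0.0.0", "https://github.com/PennLINC/fw-heudiconv"
    )
    print(json.dumps(result["GeneratedBy"], indent=2))
```

Appending rather than overwriting keeps entries written by earlier tools, which matches the multi-tool lists seen in datasets such as ds004697.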

So apparently some tools already do it out of the box (e.g. BIDScoin), and I filed a similar issue in heudiconv to make use of GeneratedBy: