neuropoly / intranet.neuro.polymtl.ca

NeuroPoly's lab manual
https://intranet.neuro.polymtl.ca

Centralize derivatives conventions for BIDS datasets #94

Closed: valosekj closed this 1 year ago

valosekj commented 1 year ago

Purpose

This PR intends to centralize the discussion about the derivatives/label conventions.

Motivation

Currently, many projects use their own derivatives convention, usually described in README, for example:

Description

This PR proposes the usage of _label-<region>_<task>.nii.gz tag, for example:

For "tasks" such as centerline, disc, or pmj, the region is omitted; for example:

The full description is provided within this PR here.

Also, this PR proposes the usage of derivatives/manual_labels (instead of derivatives/labels). Thanks to that, we can omit -manual from the filename of each file (i.e., sub-001_T1w_label-SC_seg-manual.nii.gz --> sub-001_T1w_label-SC_seg.nii.gz).
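As a rough illustration of the naming rule described above, here is a minimal Python sketch. The function name `derivative_filename` and the constant `REGIONLESS_TASKS` are hypothetical, and the region-less form `_label-<task>.nii.gz` is an assumption modeled on the `_label-disc.nii.gz` name used elsewhere in this discussion.

```python
from typing import Optional

# Tasks that, per the description above, take no region.
# ASSUMPTION: their file names follow the `_label-disc.nii.gz` pattern.
REGIONLESS_TASKS = {"centerline", "disc", "pmj"}


def derivative_filename(source_stem: str, task: str, region: Optional[str] = None) -> str:
    """Build a derivative file name from a source stem such as 'sub-001_T1w'."""
    if task in REGIONLESS_TASKS:
        if region is not None:
            raise ValueError(f"task '{task}' does not take a region")
        return f"{source_stem}_label-{task}.nii.gz"
    if region is None:
        raise ValueError(f"task '{task}' requires a region")
    return f"{source_stem}_label-{region}_{task}.nii.gz"


print(derivative_filename("sub-001_T1w", "seg", region="SC"))  # sub-001_T1w_label-SC_seg.nii.gz
print(derivative_filename("sub-001_T1w", "disc"))              # sub-001_T1w_label-disc.nii.gz
```

Because the file lives under derivatives/manual_labels, the name carries no -manual marker.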

Pros

Cons

Previous discussions

Useful links

TODO

Questions

mariehbourget commented 1 year ago

Great initiative! I'll cross-ref some research that I did previously in ivadomed with BIDS derivatives that may be helpful for you to centralize the convention.

In particular, I asked a question on the BIDS mailing list about file naming for derivative chains (derivatives of derivatives), but unfortunately I have not received any answer so far: https://groups.google.com/g/bids-discussion/c/6UDCso4mCXc/m/VvuG0Vk3CAAJ?utm_medium=email&utm_source=footer

Hope that helps!

NadiaBlostein commented 1 year ago

Hi, to me it's still unclear whether labels or manual_labels (in derivatives/labels or derivatives/manual_labels) should be pluralized or not. Is the consensus to keep it pluralized, as can be seen in the documentation? Thank you!

jcohenadad commented 1 year ago

> Hi, to me it's still unclear whether labels or manual_labels (in derivatives/labels or derivatives/manual_labels) should be pluralized or not. Is the consensus to keep it pluralized, as can be seen in the documentation? Thank you!

Good question. I think BIDS examples tend to pluralize this. Example for "masks": https://bids-specification.readthedocs.io/en/latest/derivatives/imaging.html#masks

NadiaBlostein commented 1 year ago

@valosekj A couple additional questions:

  1. For disc level labels (what sct_label_vertebrae outputs with the suffix _seg_labeled_discs.{json, nii.gz}), should the "BIDSified" suffix be _label-disc_level.nii.gz?

  2. Is it okay to put everything in the same labels directory (as opposed to labels and manual_labels) if one changes the suffixes for the files in manual_labels from label-<region>.nii.gz to manual_label-<region>.nii.gz and the suffixes for the files in manual_labels_softseg from label-<region>.nii.gz to manual_label_softseg-<region>.nii.gz?

  3. A pedantic point about the BIDSification of these file names, but one that could be good to standardize early on: should "label" in the file name suffixes also be pluralized? Or, at this point, can people just read the documentation, which is pretty clear as is?

Cheers!

valosekj commented 1 year ago

Sorry, @NadiaBlostein. Your first message slipped through the cracks due to my holiday.

> Hi, to me it's still unclear whether labels or manual_labels (in derivatives/labels or derivatives/manual_labels) should be pluralized or not. Is the consensus to keep it pluralized, as can be seen in the documentation? Thank you!

I quickly checked several of our git-annexed datasets, and all use pluralized forms such as labels, labels_softseg, or manual_labels. So, yes, I think our consensus is to keep it pluralized, as can be seen in the documentation.

> 1. For disc level labels (what sct_label_vertebrae outputs with the suffix _seg_labeled_discs.{json, nii.gz}), should the "BIDSified" suffix be _label-disc_level.nii.gz?

I think our consensus is label-disc.nii.gz (i.e., without level).

> 2. Is it okay to put everything in the same labels directory (as opposed to labels and manual_labels) if one changes the suffixes for the files in manual_labels from label-<region>.nii.gz to manual_label-<region>.nii.gz and the suffixes for the files in manual_labels_softseg from label-<region>.nii.gz to manual_label_softseg-<region>.nii.gz?

Good question! We use -manual in file names like this:

data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_seg-manual.nii.gz

But based on the new convention, the valid approach is also:

data-multi-subject/derivatives/manual_labels/sub-amu01/anat/sub-amu01_T1w_seg.nii.gz

(i.e., using manual_labels instead of labels and omitting -manual from the file name)
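The mapping between the two layouts above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name `to_manual_labels` is hypothetical, and it assumes the old-style path contains a `derivatives/labels` folder and a file name ending in `-manual.nii.gz`. It only computes the new path string; it does not move any files.

```python
from pathlib import PurePosixPath

OLD_SUFFIX = "-manual.nii.gz"


def to_manual_labels(old_path: str) -> str:
    """Map 'derivatives/labels/..._seg-manual.nii.gz' to the manual_labels layout."""
    path = PurePosixPath(old_path)
    parts = list(path.parts)
    # Find the folder that directly follows 'derivatives' (ValueError if absent).
    idx = parts.index("derivatives") + 1
    if idx >= len(parts) - 1 or parts[idx] != "labels":
        raise ValueError(f"expected 'derivatives/labels/' in {old_path}")
    if not path.name.endswith(OLD_SUFFIX):
        raise ValueError(f"expected a file name ending in '{OLD_SUFFIX}'")
    parts[idx] = "manual_labels"
    # Drop '-manual' from the file name, keeping the '.nii.gz' extension.
    parts[-1] = path.name[: -len(OLD_SUFFIX)] + ".nii.gz"
    return str(PurePosixPath(*parts))


old = "data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_seg-manual.nii.gz"
print(to_manual_labels(old))
# data-multi-subject/derivatives/manual_labels/sub-amu01/anat/sub-amu01_T1w_seg.nii.gz
```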

> 3. A pedantic point about the BIDSification of these file names but which could be good to standardize "earlier on": should "label" in the file name suffixes also be pluralized? Or at this point, people can just read the documentation which is pretty clear as is.

label entity in the file name should be singular; see BIDS documentation here

NadiaBlostein commented 1 year ago

Thank you @valosekj !

Continuing on with point 2: how is one then to differentiate disc labels (_label-disc.nii.gz) and intervertebral level labels (output by sct_label_vertebrae)? Could one add this suffix label-disc_level.nii.gz to the documentation?

According to the new convention, shouldn't data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_seg-manual.nii.gz be data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_label-SC_seg-manual.nii.gz?

Thank you for answering my questions!!!

valosekj commented 1 year ago

> Continuing on with point 2: how is one then to differentiate disc labels (_label-disc.nii.gz) and intervertebral level labels (output by sct_label_vertebrae)? Could one add this suffix label-disc_level.nii.gz to the documentation?

Good point! We usually store only intervertebral disc labels (_label-disc.nii.gz). Then there is no need for differentiation of sct_label_vertebrae output 😅

> According to the new convention, shouldn't data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_seg-manual.nii.gz be data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_label-SC_seg-manual.nii.gz?

Yes, you are right! spine-generic/data-multi-subject uses the "old convention" because it was initially curated >2 years ago. I documented it here. Thank you for this relevant point!
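For datasets still on the "old convention", adding the missing label entity could look roughly like the Python sketch below. The helper `add_label_entity` is hypothetical, and defaulting the region to `SC` is an assumption that only fits spinal cord segmentations such as the file discussed above.

```python
import re


def add_label_entity(filename: str, region: str = "SC") -> str:
    """Insert 'label-<region>' before the 'seg' suffix unless a label entity exists."""
    if re.search(r"_label-[^_.]+", filename):
        return filename  # already carries a label entity
    # Match '_seg' only when followed by '-' (e.g. '-manual') or the extension dot.
    new_name, count = re.subn(r"_seg(?=[-.])", f"_label-{region}_seg", filename, count=1)
    if count == 0:
        raise ValueError(f"no '_seg' suffix found in {filename}")
    return new_name


print(add_label_entity("sub-amu01_T1w_seg-manual.nii.gz"))
# sub-amu01_T1w_label-SC_seg-manual.nii.gz
```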