neuropoly / data-management

Repo that deals with datalad aspects for internal use

datasets/mni-bmpd - addition of the BIDSified data #194

Open rohanbanerjee opened 1 year ago

rohanbanerjee commented 1 year ago

I have added the BIDSified data for the mni-bmpd dataset (it contains both images and segmentations). It is in the branch rb/initial-data. @mguaypaq, can you please review this?

mguaypaq commented 1 year ago

I can correctly get the files with git annex get, but bids-validator . is unhappy, with these errors/warnings:

    1: [ERR] You have to define 'RepetitionTime' for this file. (code: 10 - REPETITION_TIME_MUST_DEFINE)
        ./sub-P002/func/sub-P002_task-rest_bold.nii.gz
        ... and 95 more files having this issue (Use --verbose to see them all).

    2: [ERR] You have to define 'TaskName' for this file. (code: 50 - TASK_NAME_MUST_DEFINE)
        ./sub-P002/func/sub-P002_task-rest_bold.nii.gz
        ... and 95 more files having this issue (Use --verbose to see them all).

    3: [ERR] Bold scans must be 4 dimensional. (code: 54 - BOLD_NOT_4D)
        ./sub-P002/func/sub-P002_task-rest_bold.nii.gz
            Evidence: header field "dim" = 3,120,120,32
        ... and 95 more files having this issue (Use --verbose to see them all).

    4: [ERR] NIfTI file's header is missing time dimension information. (code: 75 - NIFTI_PIXDIM4)
        ./sub-P002/func/sub-P002_task-rest_bold.nii.gz
        ... and 95 more files having this issue (Use --verbose to see them all).

    1: [WARN] You should define 'SliceTiming' for this file. If you don't provide this information slice time correction will not be possible. 'Slice Timing' is the time at which each slice was acquired within each volume (frame) of the acquisition. Slice timing is not slice order -- rather, it is a list of times containing the time (in seconds) of each slice acquisition in relation to the beginning of volume acquisition. (code: 13 - SLICE_TIMING_NOT_DEFINED)
        ./sub-P002/func/sub-P002_task-rest_bold.nii.gz
        ... and 95 more files having this issue (Use --verbose to see them all).

I'm not sure what the errors mean exactly, but I think the BOLD contrast and RepetitionTime and TaskName are mentioned in the BIDS spec here. I suspect this is because the accompanying .json sidecar files all contain {} rather than containing any fields.
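As a rough illustration of what a non-empty sidecar could look like, here is a minimal Python sketch that adds the two required fields to an existing .json file. The helper name fill_bold_sidecar is hypothetical, the TaskName "rest" is only inferred from the task-rest part of the file name, and the RepetitionTime of 2.0 seconds is a placeholder: the real value has to come from the acquisition protocol.

```python
import json
import tempfile
from pathlib import Path


def fill_bold_sidecar(sidecar_path, repetition_time, task_name):
    """Add RepetitionTime and TaskName to a sidecar, keeping any existing fields."""
    path = Path(sidecar_path)
    # A missing, empty, or "{}" sidecar simply yields an empty dict.
    text = path.read_text() if path.exists() else ""
    fields = json.loads(text) if text.strip() else {}
    fields["RepetitionTime"] = repetition_time  # in seconds
    fields["TaskName"] = task_name
    path.write_text(json.dumps(fields, indent=4) + "\n")
    return fields


# Demo on a throwaway file. The values are placeholders, NOT the real
# acquisition parameters of this dataset.
with tempfile.TemporaryDirectory() as tmp:
    sidecar = Path(tmp) / "sub-P002_task-rest_bold.json"
    sidecar.write_text("{}")
    updated = fill_bold_sidecar(sidecar, repetition_time=2.0, task_name="rest")
    print(updated)
```

If the parameters are identical for every subject, a single top-level sidecar relying on BIDS inheritance might be simpler than editing all 96 files, but that is a design choice for the dataset maintainers.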

Let me know once you've addressed these problems (possibly by adding extra commits to your branch rb/initial-data) and I'll have another look.

rohanbanerjee commented 1 year ago

1: [ERR] You have to define 'RepetitionTime' for this file. (code: 10 - REPETITION_TIME_MUST_DEFINE) ./sub-P002/func/sub-P002_task-rest_bold.nii.gz ... and 95 more files having this issue (Use --verbose to see them all).

2: [ERR] You have to define 'TaskName' for this file. (code: 50 - TASK_NAME_MUST_DEFINE) ./sub-P002/func/sub-P002_task-rest_bold.nii.gz ... and 95 more files having this issue (Use --verbose to see them all).

I have added the needed fields in the corresponding json files of the images.

3: [ERR] Bold scans must be 4 dimensional. (code: 54 - BOLD_NOT_4D) ./sub-P002/func/sub-P002_task-rest_bold.nii.gz Evidence: header field "dim" = 3,120,120,32 ... and 95 more files having this issue (Use --verbose to see them all).

4: [ERR] NIfTI file's header is missing time dimension information. (code: 75 - NIFTI_PIXDIM4) ./sub-P002/func/sub-P002_task-rest_bold.nii.gz ... and 95 more files having this issue (Use --verbose to see them all).

The data was preprocessed before I received it, which is why these specific details are missing.

1: [WARN] You should define 'SliceTiming' for this file. If you don't provide this information slice time correction will not be possible. 'Slice Timing' is the time at which each slice was acquired within each volume (frame) of the acquisition. Slice timing is not slice order -- rather, it is a list of times containing the time (in seconds) of each slice acquisition in relation to the beginning of volume acquisition. (code: 13 - SLICE_TIMING_NOT_DEFINED) ./sub-P002/func/sub-P002_task-rest_bold.nii.gz ... and 95 more files having this issue (Use --verbose to see them all).

Added this field in the json too.

With the above changes, the latest commit (commit:c9cb4d7207dce2d6666ae9746052158f38954a5e) adds the following changes:

  1. Preprocessing by resizing the images based on the ground truth dimensions in the dataset.
  2. Manual correction of a ground truth (segmentation) - the name of the rater is included in the json files of the corresponding corrected segmentation files.

jcohenadad commented 1 year ago

@rohanbanerjee I don't see your commit:

ed6a4133efe4a5352236ba0a31d9000f0c89eaf6 (HEAD -> master, origin/master, origin/HEAD) Configure git-annex
julien-macbook:~/data.neuro/mni-bmpd $ git fetch
julien-macbook:~/data.neuro/mni-bmpd $ git checkout rb/initial-data
Branch 'rb/initial-data' set up to track remote branch 'rb/initial-data' from 'origin'.
Switched to a new branch 'rb/initial-data'
julien-macbook:~/data.neuro/mni-bmpd $ gl
007574ae9a42c0d61cabf212ce3a8dfcab57a32e (HEAD -> rb/initial-data, origin/rb/initial-data) initial data
ed6a4133efe4a5352236ba0a31d9000f0c89eaf6 (origin/master, origin/HEAD, master) Configure git-annex

jcohenadad commented 1 year ago

3: [ERR] Bold scans must be 4 dimensional. (code: 54 - BOLD_NOT_4D) ./sub-P002/func/sub-P002_task-rest_bold.nii.gz Evidence: header field "dim" = 3,120,120,32 ... and 95 more files having this issue (Use --verbose to see them all).

The data was preprocessed before I received it, which is why these specific details are missing.

No. "_bold" means this is a bold dataset, hence it should be 4D. Preprocessing is not the reason for the error.
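To see what the validator is looking at, the NIfTI-1 header can be inspected directly. The sketch below uses only the Python standard library and assumes the files are NIfTI-1 (348-byte header, with dim at byte offset 40 and pixdim at offset 76). The function names are hypothetical, and the demo builds a synthetic header with the same dim values as in the error message rather than reading the real data.

```python
import gzip
import os
import struct
import tempfile


def read_nifti1_dims(path):
    """Return (dim, pixdim) from a NIfTI-1 header, gzipped or not."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        header = f.read(348)
    if len(header) < 348:
        raise ValueError("file too short to hold a NIfTI-1 header")
    # sizeof_hdr must equal 348; use it to detect the byte order.
    for endian in ("<", ">"):
        if struct.unpack(endian + "i", header[:4])[0] == 348:
            break
    else:
        raise ValueError("not a NIfTI-1 header")
    dim = struct.unpack(endian + "8h", header[40:56])
    pixdim = struct.unpack(endian + "8f", header[76:108])
    return dim, pixdim


def looks_4d(dim):
    """dim[0] holds the number of dimensions; a BOLD series is expected to have 4."""
    return dim[0] == 4


# Synthetic header mimicking the reported evidence: dim = 3,120,120,32
header = bytearray(348)
struct.pack_into("<i", header, 0, 348)
struct.pack_into("<8h", header, 40, 3, 120, 120, 32, 1, 1, 1, 1)
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo_bold.nii.gz")
    with gzip.open(demo, "wb") as f:
        f.write(bytes(header))
    dim, pixdim = read_nifti1_dims(demo)

print(dim[:4], looks_4d(dim))
```

For a genuine 4D series, dim[0] would be 4, dim[4] would hold the number of volumes, and pixdim[4] would normally carry the repetition time.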

jcohenadad commented 1 year ago

1: [WARN] You should define 'SliceTiming' for this file. If you don't provide this information slice time correction will not be possible. 'Slice Timing' is the time at which each slice was acquired within each volume (frame) of the acquisition. Slice timing is not slice order -- rather, it is a list of times containing the time (in seconds) of each slice acquisition in relation to the beginning of volume acquisition. (code: 13 - SLICE_TIMING_NOT_DEFINED) ./sub-P002/func/sub-P002_task-rest_bold.nii.gz ... and 95 more files having this issue (Use --verbose to see them all).

Added this field in the json too.

And what value did you give?
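For context, SliceTiming is a list with one acquisition time (in seconds) per slice, not a single number. The sketch below shows what such a list would look like under the assumption of a simple sequential acquisition with evenly spaced slices. The function name, the TR of 2.0 seconds, and the acquisition order are all hypothetical; interleaved or multiband sequences give different values, so the real list has to come from the scanner protocol or the DICOM headers.

```python
def sequential_slice_timing(repetition_time, n_slices, ascending=True):
    """Slice onset times (s) for an evenly spaced sequential acquisition."""
    step = repetition_time / n_slices
    times = [round(i * step, 6) for i in range(n_slices)]
    # Descending order means the last slice index was acquired first.
    return times if ascending else times[::-1]


# 32 slices matches the third dimension reported by the validator;
# the TR value is a placeholder.
timing = sequential_slice_timing(repetition_time=2.0, n_slices=32)
print(len(timing), timing[:3])
```

Every value must be smaller than RepetitionTime, otherwise the validator reports a further error.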

jcohenadad commented 1 year ago

Preprocessing by resizing the images based on the ground truth dimensions in the dataset.

This is quite problematic because it transforms the original data: if the transformation is non-diffeomorphic, we lose information. And resampling usually implies information loss. What resizing did you do? We always need the full syntax.
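A toy example of why resampling is not reversible in general: the sketch below downsamples a 1D signal with nearest-neighbour interpolation and then upsamples it back to the original length. This is not the resizing that was applied to the dataset (which is unknown here); resample_nearest is a hypothetical helper written only for this illustration.

```python
def resample_nearest(values, new_len):
    """Nearest-neighbour resampling of a 1D list to new_len samples."""
    old_len = len(values)
    return [values[min(old_len - 1, (i * old_len) // new_len)]
            for i in range(new_len)]


original = [0, 1, 0, 1, 0, 1, 0, 1]
down = resample_nearest(original, 4)   # halve the resolution
restored = resample_nearest(down, 8)   # go back to the original size

print(original)
print(restored)
print("identical:", restored == original)
```

The alternating pattern is lost after the round trip, which is why the original images should stay untouched in the dataset and any resizing command should be recorded in full.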

mguaypaq commented 1 year ago

@rohanbanerjee, I'm just going through my old assigned issues that are still open. Is there any update on this one?