sibis-platform / ncanda-data-integration

This is the Data Integration, MRI, and Bioinformatics Component of the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA), funded by the NIAAA.
https://www.nitrc.org/projects/ncanda-datacore
BSD 3-Clause "New" or "Revised" License

Allow empty projects and multiframe scans #497

Closed annehaley closed 2 years ago

annehaley commented 2 years ago

The first commit adds an argument project_list to the command write_miqa_import_file. If data is found for any project not included in this list, an exception is raised and the file is not written. If no data is found for a project that is included in this list, that project gets an empty dataset in the import file and the file is written successfully. When this file is used in a global import in MIQA, those empty projects will have their contents cleared.
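
A minimal sketch of the project_list check described above, not the code from the commit. The function name build_import_dict and the (project, experiment) row shape are hypothetical and chosen only for illustration; the nested "projects" / "experiments" layout follows the JSON output shown further down.

```python
def build_import_dict(rows, project_list):
    """Build the top-level import structure from (project, experiment) pairs.

    `rows` and this function name are illustrative assumptions; the real
    command works on a dataframe and may differ in detail.
    """
    found = {project for project, _ in rows}
    unexpected = found - set(project_list)
    if unexpected:
        # Data for a project outside project_list: refuse to write anything.
        raise ValueError(
            f"Data found for projects not in project_list: {sorted(unexpected)}"
        )

    # Every listed project gets an entry, even if no row mentions it,
    # so a global import can clear the contents of empty projects.
    projects = {name: {"experiments": {}} for name in project_list}
    for project, experiment in rows:
        projects[project]["experiments"].setdefault(experiment, {"scans": {}})
    return {"projects": projects}


result = build_import_dict([("DUKE", "NCANDA_E11482")], ["DUKE", "NCANDA"])
print(result["projects"]["NCANDA"])  # the empty project is still present
```

Raising before anything is assembled mirrors the "file will not be written" behavior: a partial import file never reaches disk.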

The second commit changes the behavior of the convert_dataframe_to_new_format function such that scan directories with more than one image file will have each image file written as a frame within the same scan. Due to this directory-scanning behavior, frames will not be written for files that do not exist.
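
The directory-scanning behavior can be sketched roughly as follows. This is an illustration under assumptions, not the body of convert_dataframe_to_new_format: the helper name frames_for_scan, the ".nii.gz" filter, and the frame-key rule (scan id for a single file, 0-based index for several) are inferred from the example output below rather than taken from the commit.

```python
from pathlib import Path


def frames_for_scan(nifti_folder, scan_id, scan_type):
    """Return a frames dict for one scan directory (hypothetical helper)."""
    scan_dir = Path(nifti_folder) / f"{scan_id}_{scan_type}"
    if not scan_dir.is_dir():
        # Nothing on disk, so no frames are written for this scan.
        return {}

    # Only files that actually exist are listed; sorting keeps the
    # frame order deterministic across runs and filesystems.
    image_files = sorted(
        p for p in scan_dir.iterdir() if p.name.endswith(".nii.gz")
    )

    if len(image_files) == 1:
        # Assumption: single-image scans keep the scan id as the frame key.
        return {str(scan_id): {"file_location": str(image_files[0])}}

    # Assumption: multi-image scans are numbered from 0 in sorted order.
    return {
        str(index): {"file_location": str(path)}
        for index, path in enumerate(image_files)
    }
```

Because the frames come from listing the directory rather than from a fixed file name, a scan folder with eight images yields eight frames and a missing folder yields none.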


Below is an example usage:

The input csv file, using the old MIQA import format:

xnat_experiment_id,nifti_folder,scan_id,scan_type,experiment_note,decision,scan_note
NCANDA_E11482,/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti,4,ncanda-t2fse-v1,"series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ",,""
NCANDA_E11482,/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti,5,ncanda-t1spgr-v1,"series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ",,""
NCANDA_E11482,/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti,7,ncanda-dti6b500pepolar-v1,"series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ",,""
NCANDA_E11482,/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti,8,ncanda-dti60b1000-v1,"series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ",,""
NCANDA_E11482,/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti,9,ncanda-dti30b400-v1,"series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ",,""
NCANDA_E11482,/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti,13,ncanda-rsfmri-v1,"series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ",,""
NCANDA_E11482,/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti,9010,ncanda-grefieldmap-v1,"series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ",,""

The output json file:

{
    "projects": {
        "DUKE": {
            "experiments": {
                "NCANDA_E11482": {
                    "scans": {
                        "NCANDA_E11482_ncanda-dti30b400-v1": {
                            "type": "ncanda-dti30b400-v1",
                            "frames": {
                                "9": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/9_ncanda-dti30b400-v1/image.nii.gz"
                                }
                            },
                            "subject_id": "",
                            "session_id": "",
                            "scan_link": "",
                            "last_decision": null
                        },
                        "NCANDA_E11482_ncanda-dti60b1000-v1": {
                            "type": "ncanda-dti60b1000-v1",
                            "frames": {
                                "8": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/8_ncanda-dti60b1000-v1/image.nii.gz"
                                }
                            },
                            "subject_id": "",
                            "session_id": "",
                            "scan_link": "",
                            "last_decision": null
                        },
                        "NCANDA_E11482_ncanda-dti6b500pepolar-v1": {
                            "type": "ncanda-dti6b500pepolar-v1",
                            "frames": {
                                "0": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image1.nii.gz"
                                },
                                "1": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image2.nii.gz"
                                },
                                "2": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image3.nii.gz"
                                },
                                "3": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image4.nii.gz"
                                },
                                "4": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image5.nii.gz"
                                },
                                "5": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image6.nii.gz"
                                },
                                "6": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image7.nii.gz"
                                },
                                "7": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image8.nii.gz"
                                }
                            },
                            "subject_id": "",
                            "session_id": "",
                            "scan_link": "",
                            "last_decision": null
                        },
                        "NCANDA_E11482_ncanda-grefieldmap-v1": {
                            "type": "ncanda-grefieldmap-v1",
                            "frames": {
                                "9010": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/9010_ncanda-grefieldmap-v1/image.nii.gz"
                                }
                            },
                            "subject_id": "",
                            "session_id": "",
                            "scan_link": "",
                            "last_decision": null
                        },
                        "NCANDA_E11482_ncanda-rsfmri-v1": {
                            "type": "ncanda-rsfmri-v1",
                            "frames": {
                                "13": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/13_ncanda-rsfmri-v1/image.nii.gz"
                                }
                            },
                            "subject_id": "",
                            "session_id": "",
                            "scan_link": "",
                            "last_decision": null
                        },
                        "NCANDA_E11482_ncanda-t1spgr-v1": {
                            "type": "ncanda-t1spgr-v1",
                            "frames": {
                                "5": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/5_ncanda-t1spgr-v1/image.nii.gz"
                                }
                            },
                            "subject_id": "",
                            "session_id": "",
                            "scan_link": "",
                            "last_decision": null
                        },
                        "NCANDA_E11482_ncanda-t2fse-v1": {
                            "type": "ncanda-t2fse-v1",
                            "frames": {
                                "4": {
                                    "file_location": "fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/4_ncanda-t2fse-v1/image.nii.gz"
                                }
                            },
                            "subject_id": "",
                            "session_id": "",
                            "scan_link": "",
                            "last_decision": null
                        }
                    },
                    "notes": "series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. "
                }
            }
        },
        "NCANDA": {
            "experiments": {}
        }
    }
}

OR the output CSV file:

project_name,experiment_name,scan_name,scan_type,frame_number,file_location,experiment_notes,subject_id,session_id,scan_link,last_decision,last_decision_creator,last_decision_note,last_decision_created,identified_artifacts,location_of_interest
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-t2fse-v1,ncanda-t2fse-v1,4,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/4_ncanda-t2fse-v1/image.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-t1spgr-v1,ncanda-t1spgr-v1,5,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/5_ncanda-t1spgr-v1/image.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti6b500pepolar-v1,ncanda-dti6b500pepolar-v1,0,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image1.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti6b500pepolar-v1,ncanda-dti6b500pepolar-v1,1,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image2.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti6b500pepolar-v1,ncanda-dti6b500pepolar-v1,2,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image3.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti6b500pepolar-v1,ncanda-dti6b500pepolar-v1,3,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image4.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti6b500pepolar-v1,ncanda-dti6b500pepolar-v1,4,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image5.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti6b500pepolar-v1,ncanda-dti6b500pepolar-v1,5,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image6.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti6b500pepolar-v1,ncanda-dti6b500pepolar-v1,6,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image7.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti6b500pepolar-v1,ncanda-dti6b500pepolar-v1,7,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image8.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti60b1000-v1,ncanda-dti60b1000-v1,8,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/8_ncanda-dti60b1000-v1/image.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-dti30b400-v1,ncanda-dti30b400-v1,9,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/9_ncanda-dti30b400-v1/image.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-rsfmri-v1,ncanda-rsfmri-v1,13,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/13_ncanda-rsfmri-v1/image.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
DUKE,NCANDA_E11482,NCANDA_E11482_ncanda-grefieldmap-v1,ncanda-grefieldmap-v1,9010,fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/9010_ncanda-grefieldmap-v1/image.nii.gz,series 14 - 15 and 16 need to be double-checked on what they really are. Andrew will get back to you on this. ,,,,,,,,,
NCANDA,,,,,,,,,,,,,,,

kipohl commented 2 years ago

I can already see one issue:

                                "0": {
                                    "file_location": "/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image5.nii.gz"
                                },
                                "1": {
                                    "file_location": "/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image6.nii.gz"
                                },
                                "2": {
                                    "file_location": "/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image7.nii.gz"
                                },
                                "3": {
                                    "file_location": "/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image8.nii.gz"
                                },
                                "4": {
                                    "file_location": "/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image1.nii.gz"
                                },
                                "5": {
                                    "file_location": "/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image3.nii.gz"
                                },
                                "6": {
                                    "file_location": "/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image4.nii.gz"
                                },
                                "7": {
                                    "file_location": "/fs/storage/XNAT/archive/duke_incoming/arc001/C-70183-F-1-20220407/RESOURCES/nifti/7_ncanda-dti6b500pepolar-v1/image2.nii.gz"
                                }

The frames are not in the right order, e.g. frame 7 is image2 instead of image8.

kipohl commented 2 years ago

Also, what about the other sequences that have multiple images, such as ncanda-dti60b1000-v1? @annehaley

annehaley commented 2 years ago

Ok, I can make the ordering explicitly alphabetical. Good catch. Change made in 2ca9fd2.

As for the other folders that contain more than one file, they will also have multiple frames. In my example, I only copied the structure of the one directory you sent me, so only the one scan has multiple frames. The function looks at the contents of each scan folder.
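
For reference, a small sketch of what an explicitly alphabetical ordering does to file names like the ones in this thread. The natural_key helper is a hypothetical alternative and is not part of commit 2ca9fd2. With image1 through image8 both orderings agree; they only diverge once a folder holds ten or more numbered files.

```python
import re


def natural_key(name):
    """Split digits out so that 'image10' sorts after 'image2' (hypothetical)."""
    return [
        int(part) if part.isdigit() else part
        for part in re.split(r"(\d+)", name)
    ]


names = ["image5.nii.gz", "image2.nii.gz", "image10.nii.gz", "image1.nii.gz"]

# Plain alphabetical order compares character by character.
alphabetical = sorted(names)
# -> ['image1.nii.gz', 'image10.nii.gz', 'image2.nii.gz', 'image5.nii.gz']

# Numeric-aware order treats the digit runs as integers.
natural = sorted(names, key=natural_key)
# -> ['image1.nii.gz', 'image2.nii.gz', 'image5.nii.gz', 'image10.nii.gz']

print(alphabetical)
print(natural)
```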

kipohl commented 2 years ago

@annehaley OK, can you please provide me with the updated output from the latest commit?

annehaley commented 2 years ago

> @annehaley ok - can you please provide me the updated output with the latest commit

I updated the description above - it includes the updated output