ImagingDataCommons / TCIA-IDC-Coordination


[TH-46321] Collection-level metadata for the analysis results collections #17

Closed fedorov closed 3 years ago

fedorov commented 4 years ago

https://help.cancerimagingarchive.net/servicedesk/customer/portal/1/TH-46321

image

kirbyju commented 4 years ago

Agreed. I am not sure how to deal with that in the short term for things that are not DICOM data. Some Analysis Result datasets are NIfTI or some other format of segmentations/ROIs. Others are purely spreadsheets of radiomic feature output, or radiologist assessments in spreadsheet or XML format. Supporting all this non-DICOM data in an actual database you can query might be a PRISM feature request for them to address in the medium/long term. Alternatively, the IDC team could convert all of these into some form of DICOM, and we could then easily load them into NBIA.

If we went the DICOM conversion route, I guess the next question would be what study-level data do you want to pull? The NBIA database might need to be extended to support these in some cases.

Justin Kirby (contractor)
Technical Project Manager, Frederick National Laboratory for Cancer Research
Technical Director, Cancer Imaging Informatics Lab
ORCiD: https://orcid.org/0000-0003-3487-8922
240-276-6016
justin.kirby@nih.gov


[image] https://user-images.githubusercontent.com/313942/92809959-17970b80-f38b-11ea-8729-f9b902045581.png

fedorov commented 4 years ago

I would focus on those that are in DICOM. The reality is that right now analysis results that are in DICOM are already available via API, but they are not supported with all the features that are implemented for the primary collections. The idea is to implement their support to be on par with the primary collections.

fedorov commented 4 years ago

Update from @ulrikew

Hi Andrey,

The TCIA and NBIA teams have added DOIs for the collections that are not “third-party”. As before, the DOI is stored in the descriptionURI field.

The IDC team can use the API to retrieve the information. Below are the instructions provided by Scott.

Please reach out to Justin and/or Kirk for data-related questions and to Scott for software questions.

Ulli

DOIs/descriptionURIs can be retrieved directly via seriesUIDs:

curl -H "Authorization:Bearer b5642ee1-c3f7-48d9-b694-2504bbea16f1" -k "https://public.cancerimagingarchive.net/nbia-api/services/getStudyDrillDownWithSeriesIds" -d "list=1.3.6.1.4.1.14519.5.2.1.7695.4001.306204232344341694648035234440&list=1.3.6.1.4.1.14519.5.2.1.7695.4001.180700359927709468630440576839"

[
  {
    "studyId":"1.3.6.1.4.1.14519.5.2.1.7695.4001.130563880911723253267280582465",
    "date":913096800000,
    "description":"MSTEALTH",
    "id":163840,
    "study_id":null,
    "seriesList":[
      {
        "seriesNumber":"1",
        "seriesUID":"1.3.6.1.4.1.14519.5.2.1.7695.4001.180700359927709468630440576839",
        "numberImages":46,
        "modality":"MR",
        "manufacturer":"GE MEDICAL SYSTEMS",
        "annotationsFlag":false,
        "annotationsSize":0,
        "patientId":"TCGA-08-0244",
        "patientPkId":"131072",
        "studyId":"1.3.6.1.4.1.14519.5.2.1.7695.4001.130563880911723253267280582465",
        "studyPkId":163840,
        "totalSizeForAllImagesInSeries":6129768,
        "project":"TCGA-GBM",
        "description":"FMPSPGR SAG",
        "dataProvenanceSiteName":null,
        "manufacturerModelName":null,
        "softwareVersion":null,
        "maxFrameCount":"0",
        "studyDate":null,
        "studyDesc":null,
        "bodyPartExamined":"BRAIN",
        "study_id":null,
        "thirdPartyAnalysis":null,
        "descriptionURI":"https://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9",
        "seriesId":"1.3.6.1.4.1.14519.5.2.1.7695.4001.180700359927709468630440576839",
        "studyDateString":"",
        "exactSize":6129768,
        "seriesPkId":229377
      },
      {
        "seriesNumber":"2",
        "seriesUID":"1.3.6.1.4.1.14519.5.2.1.7695.4001.306204232344341694648035234440",
        "numberImages":124,
        "modality":"MR",
        "manufacturer":"GE MEDICAL SYSTEMS",
        "annotationsFlag":false,
        "annotationsSize":0,
        "patientId":"TCGA-08-0244",
        "patientPkId":"131072",
        "studyId":"1.3.6.1.4.1.14519.5.2.1.7695.4001.130563880911723253267280582465",
        "studyPkId":163840,
        "totalSizeForAllImagesInSeries":16524014,
        "project":"TCGA-GBM",
        "description":"3DSPGR AXIAL",
        "dataProvenanceSiteName":null,
        "manufacturerModelName":null,
        "softwareVersion":null,
        "maxFrameCount":"0",
        "studyDate":null,
        "studyDesc":null,
        "bodyPartExamined":"BRAIN",
        "study_id":null,
        "thirdPartyAnalysis":null,
        "descriptionURI":"https://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9",
        "seriesId":"1.3.6.1.4.1.14519.5.2.1.7695.4001.306204232344341694648035234440",
        "studyDateString":"",
        "exactSize":16524014,
        "seriesPkId":229376
      }
    ]
  }
]
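
For anyone scripting this instead of using curl, here is a minimal Python sketch of the same call and of pulling the DOI out of the response. It assumes the endpoint accepts a form-encoded POST with repeated `list` fields, as the curl command suggests. The helper names (`build_request`, `series_to_doi`) and the token placeholder are made up for illustration, and the sample payload is a trimmed copy of the response above. The request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example in this thread.
NBIA_URL = ("https://public.cancerimagingarchive.net/nbia-api/services/"
            "getStudyDrillDownWithSeriesIds")


def build_request(series_uids, token):
    """Build (but do not send) the POST request that mirrors the curl call.

    `token` is a placeholder for whatever bearer token you were issued.
    """
    # Repeated "list" fields, form-encoded: list=<uid1>&list=<uid2>
    body = urllib.parse.urlencode([("list", uid) for uid in series_uids])
    return urllib.request.Request(
        NBIA_URL,
        data=body.encode("ascii"),
        headers={"Authorization": "Bearer " + token},
    )


def series_to_doi(studies):
    """Map each seriesUID to its descriptionURI (None when the field is unset)."""
    mapping = {}
    for study in studies:
        for series in study.get("seriesList") or []:
            mapping[series["seriesUID"]] = series.get("descriptionURI")
    return mapping


# Trimmed copy of the response shown above; only the fields used here are kept.
SAMPLE_RESPONSE = """
[
  {
    "studyId": "1.3.6.1.4.1.14519.5.2.1.7695.4001.130563880911723253267280582465",
    "seriesList": [
      {
        "seriesUID": "1.3.6.1.4.1.14519.5.2.1.7695.4001.180700359927709468630440576839",
        "project": "TCGA-GBM",
        "descriptionURI": "https://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9"
      },
      {
        "seriesUID": "1.3.6.1.4.1.14519.5.2.1.7695.4001.306204232344341694648035234440",
        "project": "TCGA-GBM",
        "descriptionURI": "https://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9"
      }
    ]
  }
]
"""

mapping = series_to_doi(json.loads(SAMPLE_RESPONSE))
for series_uid, uri in mapping.items():
    print(series_uid, uri)

# To actually query the service (needs network access and a valid token):
#   with urllib.request.urlopen(build_request(list(mapping), "YOUR-TOKEN")) as resp:
#       mapping = series_to_doi(json.load(resp))
```

Since descriptionURI is reported per series, collecting the distinct values of `mapping` gives the set of DOIs covering a batch of series.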

fedorov commented 4 years ago

from @kirbyju:

Collection-level metadata should all be in DataCite and accessible via their API: https://support.datacite.org/docs/api. I know our curation folks are loading abstracts into that system now. There may be some old datasets we have to go back and modify, but I'd be willing to ask them to do that if the need comes up anywhere. We should probably think about integrating this into the API gateway or our documentation in some form. You're the first group that's actually had a use case for this, though.
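
A rough sketch of how the DataCite lookup could go from a descriptionURI, in Python. It assumes the public DataCite REST endpoint `https://api.datacite.org/dois/<doi>` and its JSON:API layout (`data.attributes.titles`, `data.attributes.descriptions` with a `descriptionType` of `Abstract`). Whether every TCIA DOI record has an abstract filled in is not something this thread confirms, so the code tolerates missing fields. All function names are made up for illustration.

```python
import json
import urllib.request
from urllib.parse import quote, urlparse

# Assumed base URL of the public DataCite REST API.
DATACITE_API = "https://api.datacite.org/dois/"


def doi_from_uri(description_uri):
    """Strip the resolver prefix: 'https://doi.org/10.x/y' -> '10.x/y'."""
    return urlparse(description_uri).path.lstrip("/")


def datacite_url(doi):
    """Build the lookup URL for a single DOI, keeping the '/' in the DOI."""
    return DATACITE_API + quote(doi, safe="/")


def summarize(record):
    """Pick the title and abstract out of a DataCite JSON:API record."""
    attrs = record["data"]["attributes"]
    titles = [t["title"] for t in attrs.get("titles") or []]
    abstracts = [
        d["description"]
        for d in attrs.get("descriptions") or []
        if d.get("descriptionType") == "Abstract"
    ]
    return {
        "doi": attrs.get("doi"),
        "title": titles[0] if titles else None,
        "abstract": abstracts[0] if abstracts else None,
    }


def fetch_collection_metadata(description_uri, timeout=30):
    """Fetch and summarize the DataCite record (needs network access)."""
    url = datacite_url(doi_from_uri(description_uri))
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return summarize(json.load(resp))


# Offline check of the URL construction, using the DOI from the NBIA response.
lookup_url = datacite_url(
    doi_from_uri("https://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9")
)
print(lookup_url)
```

Calling `fetch_collection_metadata("https://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9")` would then return a small dict with the DOI, title, and abstract, with `None` for anything the record does not carry.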