ReproNim / reproin

A setup for automatic generation of shareable, version-controlled BIDS datasets from MR scanners
MIT License

"Unassigned" output #57

Open jmatthews-rotman opened 2 years ago

jmatthews-rotman commented 2 years ago

Not sure if this is more of a reproin or heudiconv question, as I'm new to both.

At the moment, when I run heudiconv with this command:

heudiconv -f reproin --bids -o ${bidspath}/ --files ${temppath}/

the output always goes to the same "Unassigned" project folder in my specified output folder, including data from projects with different StudyID and StudyName. I assumed from the inclusion of a StudyID and StudyName in the tree structure that these would be used to sort outputs into the relevant BIDS project folders. Just checking if this is as intended, or if I'm missing something.

The two areas where I'm currently deviating from the suggested instructions are:
- Not filling out the Accession field (it was unavailable when we were on VE11E, and we haven't yet adopted it although it is available on XA30)
- Not using the datalad flag in the heudiconv command (I just haven't figured out datalad yet)

I mention these in case either could affect the sorting of output data by project. Otherwise, my scan tree and sequences are named as suggested.

Thank you.

yarikoptic commented 2 years ago

What is the value for StudyDescription in those DICOMs?

e.g. I see

❯ dcmdump proj/heudiconv/dartmouth-phantoms/bids_test4-20161014/phantom-1/anat-T1w/1.3.12.2.1107.5.2.43.66112.2016101409261110045101885.dcm| grep StudyDe
(0008,1030) LO [Halchenko_Yarik^950_bids_test4]         #  30, 1 StudyDescription

which got filled in automagically when I chose a specific STUDY as shown in https://github.com/ReproNim/reproin/blob/master/docs/walkthrough-1.md#choose-the-desired-program . Maybe someone manually entered Unassigned there?

and then that field is picked up here: https://github.com/nipy/heudiconv/blob/HEAD/heudiconv/dicoms.py#L82 . ... anyways -- please check what you have at the level of DICOM first.
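As a rough illustration of why the StudyDescription value matters, here is a minimal sketch of how a locator could be derived from it. This is an approximation written for this thread, not a copy of the heudiconv/ReproIn code: the function name `locator_from_study_description` is hypothetical, and the splitting rules (split on `^`, fall back to a space, then split the first part once on `-` or `_`, with `unknown` for an empty value) are assumptions about the heuristic's behavior that should be checked against the actual source.

```python
import re


def locator_from_study_description(study_description):
    """Approximate a ReproIn-style locator (assumed logic, not the real code)."""
    if not study_description:
        # Assumption: an empty description maps to a fallback folder name.
        return "unknown"
    # Scanner-filled values separate "PI_Student" from the study with '^';
    # manually typed values often use a space instead.
    parts = study_description.split("^", 1)
    if len(parts) == 1:
        parts = study_description.split(" ", 1)
    # Split the first part once more, since it may be PI_Student or PI-Student.
    parts = re.split("[-_]", parts[0], maxsplit=1) + parts[1:]
    return "/".join(parts)


for desc in (
    "Halchenko_Yarik^950_bids_test4",
    "Gilboa_Erik 175_Bird-Learn",
    "Unassigned",
):
    print(desc, "->", locator_from_study_description(desc))
```

Under these assumptions, a description with no separators at all (such as a bare `Unassigned`) ends up as a single top-level folder, which matches the behavior reported above.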

jmatthews-rotman commented 2 years ago

Thank you, this was the issue. The data I was testing was from a project where this name hadn't followed this convention properly. I tested another project, and the data is sorting into projects correctly.

yarikoptic commented 2 years ago

FWIW, there is -l option to override locator (grr on me for such an odd name) which is deduced from that field.
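For anyone scripting conversions, a small sketch of assembling the command with the `-l` override might look like the following. The helper name `heudiconv_command`, the example paths, and the locator value are all hypothetical; only the flags (`-f`, `--bids`, `-l`, `-o`, `--files`) are taken from this thread.

```python
import shlex


def heudiconv_command(files_dir, out_dir, locator=None):
    """Build a heudiconv argument list, optionally overriding the locator."""
    cmd = ["heudiconv", "-f", "reproin", "--bids"]
    if locator:
        # -l overrides the locator otherwise deduced from StudyDescription.
        cmd += ["-l", locator]
    cmd += ["-o", out_dir, "--files", files_dir]
    return cmd


# Hypothetical paths and locator, for illustration only.
cmd = heudiconv_command(
    "/data/dicom", "/data/bids", locator="Gilboa/Erik/175_Bird-Learn"
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list (rather than a single shell string) avoids quoting problems if a locator or path ever contains spaces.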

jmatthews-rotman commented 1 year ago

I have another dataset that is being put into an Unassigned folder, but the dicom fields look correct to me. From DicomBrowser: (0008,1030) | Study Description | Keep | Gilboa_Erik 175_Bird-Learn

I assume I could use the override flag, but this is pilot data, and I'm trying to figure out why it's not sorting automatically. I thought maybe it was the dashes in the StudyName, but the following data converted and sorted correctly: (0008,1030) | Study Description | Keep | Furlan_Mitsue 178_C-FIVE-Study

Are there other fields that could interfere with the locator?

jmatthews-rotman commented 1 year ago

This issue is persisting for this protocol. I've uploaded the scout dicom at the link below. Wondering if you have any guesses why heudiconv is assigning locator='Unassigned'.

https://drive.google.com/drive/folders/1s9P8CfZM0k8fnxo88RE3Gm9SGPmSrqcb?usp=sharing

yarikoptic commented 1 year ago

❯ dcmdump /tmp/locator-sample.dcm| grep Desc
(0008,1030) LO [Unassigned]                             #  10, 1 StudyDescription
(0008,103e) LO [anat-scout_ses-01]                      #  18, 1 SeriesDescription
(0040,0254) LO [Gilboa_Erik 175_Bird-Learn]             #  26, 1 PerformedProcedureStepDescription

❯ dcmdump /tmp/locator-sample.dcm| grep -e '\<Manuf' -e Sof
(0008,0070) LO [Siemens Healthineers]                   #  20, 1 Manufacturer
(0008,1090) LO [MAGNETOM Prisma Fit]                    #  20, 1 ManufacturerModelName
(0018,1020) LO [syngo MR XA30]                          #  14, 1 SoftwareVersions

so the StudyDescription, which we use, does contain Unassigned, and the interesting information is within PerformedProcedureStepDescription . I wonder whether this step: https://github.com/ReproNim/reproin/blob/master/docs/walkthrough-1.md#choose-the-desired-program looks the same on your scanner, or how it differs such that the value ends up in PerformedProcedureStepDescription?
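The same check can be scripted when many files need to be inspected. The sketch below parses `dcmdump` text output like the lines above and flags a mismatch between the two description fields. The regular expression and the helper name `parse_dcmdump` are assumptions tailored to the simple one-line string elements shown here; they are not a general DICOM parser.

```python
import re

# Matches simple bracketed elements as printed by dcmdump, e.g.
# (0008,1030) LO [Unassigned]    #  10, 1 StudyDescription
DCMDUMP_LINE = re.compile(
    r"^\((?P<tag>[0-9a-fA-F]{4},[0-9a-fA-F]{4})\)\s+(?P<vr>\w\w)\s+"
    r"\[(?P<value>.*?)\]\s+#\s*\d+,\s*\d+\s+(?P<name>\w+)\s*$"
)


def parse_dcmdump(text):
    """Return {attribute name: value} for simple bracketed dcmdump lines."""
    fields = {}
    for line in text.splitlines():
        match = DCMDUMP_LINE.match(line.strip())
        if match:
            fields[match.group("name")] = match.group("value")
    return fields


sample = """\
(0008,1030) LO [Unassigned]                             #  10, 1 StudyDescription
(0008,103e) LO [anat-scout_ses-01]                      #  18, 1 SeriesDescription
(0040,0254) LO [Gilboa_Erik 175_Bird-Learn]             #  26, 1 PerformedProcedureStepDescription
"""

fields = parse_dcmdump(sample)
study = fields.get("StudyDescription")
step = fields.get("PerformedProcedureStepDescription")
if study != step:
    print(f"Mismatch: StudyDescription={study!r}, "
          f"PerformedProcedureStepDescription={step!r}")
```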

jmatthews-rotman commented 1 year ago

Ah! You've accidentally put me on to the source of my issue. I've been checking dicom headers directly from our backup hard drive at the scanner, but converting data after downloading it from our XNAT server. XNAT appears to be making exactly two changes to the dicom headers: adding a (0012,0064) field related to de-identification, and modifying (0008,1030). Just my luck!
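A comparison like this can be reproduced with a few lines of Python once both headers are available as tag-to-value mappings. This is only a sketch: the helper name `changed_tags` is hypothetical, the two dictionaries are abbreviated stand-ins for the scanner and XNAT copies, and the value shown for (0012,0064) is a placeholder, not the real content of that field.

```python
def changed_tags(original, downloaded):
    """Return {tag: (before, after)} for tags added, removed, or modified."""
    changes = {}
    for tag in sorted(set(original) | set(downloaded)):
        before = original.get(tag)
        after = downloaded.get(tag)
        if before != after:
            changes[tag] = (before, after)
    return changes


# Abbreviated, illustrative headers; real files contain many more tags.
scanner_copy = {
    "(0008,1030)": "Gilboa_Erik 175_Bird-Learn",
    "(0008,103e)": "anat-scout_ses-01",
}
xnat_copy = {
    "(0008,1030)": "Unassigned",
    "(0008,103e)": "anat-scout_ses-01",
    "(0012,0064)": "<placeholder>",  # actual value not shown in this thread
}

for tag, (before, after) in changed_tags(scanner_copy, xnat_copy).items():
    print(tag, before, "->", after)
```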

For reference, Siemens seems to be consistent about the (0040,0254) field containing the same value as (0008,1030), but making heuristic changes based on that seems unwise when it is my specific XNAT implementation that is causing the issue.

I will investigate my XNAT configuration to see if I can resolve this.

Alternatively, am I correct that my next best options are to either:
- create a local heuristic for my site, branching the ReproIn heuristic and pulling the locator info from (0012,0064), or
- manually specify a locator flag?

yarikoptic commented 1 year ago

> XNAT appears to be making exactly two changes to the dicom headers:

It is likely not just any XNAT server but some specific configuration/workflow? If it is used by anyone other than you, we had better add some documentation somewhere on this aspect!

> - create a local heuristic for my site, branching the ReproIn heuristic and pulling the locator info from (0012,0064)

for that I am afraid I better finish https://github.com/nipy/heudiconv/pull/581 so you can actually get access to some arbitrary field in DICOM. Or do you see how to do that without this?

jmatthews-rotman commented 1 year ago

> > XNAT appears to be making exactly two changes to the dicom headers:
>
> it is likely not just any XNAT server but some configuration/workflow? if it is used by someone else but you - we better add some documentation somewhere on this aspect!

We isolated this to XNAT's built-in anonymization scripts and settings. We weren't purposefully implementing any anonymization on XNAT, since our procedures keep PHI out of the dicom headers in the first place, but it looks like some default settings were in place (specifically "(0008,1030) := project") that we weren't aware of and that were not working as intended. Once these scripts were removed, our Study Description field worked as intended. This appears to be something that should be resolved on the XNAT side and probably doesn't need consideration within HeuDiConv/ReproIn, but documenting it could help future XNAT users.

> > - create a local heuristic for my site, branching the ReproIn heuristic and pulling the locator info from (0012,0064)
>
> for that I am afraid I better finish nipy/heudiconv#581 so you can actually get access to some arbitrary field in DICOM. Or do you see how to do that without this?

I had not put much thought into this, but it looks like you can keep putting this off for now!

yarikoptic commented 1 year ago

> We weren't purposefully implementing any anonymization on XNAT since our procedures keep PHI out of the dicom headers in the first place, but it looks like there were some default settings in place (specifically "(0008,1030) := project") that we weren't aware of and were not working as intended. Once these scripts were removed, our Study Description field is working as intended. This appears to be something that should be resolved on XNAT

I am not yet fully following what XNAT component is to blame here to provide specific guidance. Could you please elaborate in here https://github.com/ReproNim/reproin/blob/master/README.md#xnat and close this issue with that commit/PR (add Closes #57 in PR description)?