khanlab / cfmm2tar

Tools for dicom-server retrieval, tarballing and conversion to bids
GNU General Public License v3.0

XML file attributes #12

Open kaitj opened 7 months ago

kaitj commented 7 months ago

Noticed that DICOM querying and cfmm2tar were no longer working on the autobids portal. It seems that some of the queried attributes are no longer found in the XML file that findscu retrieves from the DICOM server. These are all of the attributes being queried:

ATTRIBUTES_QUERIED = [
    "0020000D",  # StudyInstanceUID
    "00100010",  # PatientName
    "0008103E",  # SeriesDescription
    "00200011",  # SeriesNumber
    "00200010",  # StudyID
    "00100020",  # PatientID
    "00100040",  # PatientSex
]

@isolovey - would you know if anything may have changed?

isolovey commented 7 months ago

@kaitj Can you paste a specific call to findscu which resulted in an error, or the StudyInstanceUID of the study in question, so I can reproduce it? Everything should be working as before on the DICOM server end.

kaitj commented 7 months ago

I can grab the specific call later today (no access to the autobids vm from current ip), but the example StudyInstanceUID is 1.3.12.2.1107.5.2.43.67007.3000002312141325180190000000, though it was happening for any study I had tried.

isolovey commented 7 months ago

I cannot find a study with StudyInstanceUID=1.3.12.2.1107.5.2.43.67007.3000002312141325180190000000 in our database. The ID indicates that it's from December 14, 2023, acquired around 1:25 pm ("2312141325"). All the scans scheduled for that day are accounted for on the DICOM server (with different Study Instance UIDs), however. Can you give me more details about this study, or any other that fails? Date and study description (i.e. principal/project), in addition to Study Instance UID, should be sufficient.
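
For reference, the date reading above can be reproduced with a short sketch. It assumes that the last UID component starts with a six-character prefix ("300000" in this example) followed by ten digits in YYMMDDHHMM order. That layout is inferred from this single UID and is not a documented guarantee, and the helper name `acquisition_hint_from_uid` is made up for illustration.

```python
from datetime import datetime


def acquisition_hint_from_uid(uid: str) -> datetime:
    """Decode the YYMMDDHHMM digits embedded in the last UID component.

    Assumption: the last component is a 6-character prefix followed by a
    10-digit timestamp. This is inferred from one example UID only.
    """
    last_component = uid.rsplit(".", 1)[-1]
    stamp = last_component[6:16]  # skip the assumed "300000" prefix
    return datetime.strptime(stamp, "%y%m%d%H%M")


uid = "1.3.12.2.1107.5.2.43.67007.3000002312141325180190000000"
print(acquisition_hint_from_uid(uid))  # 2023-12-14 13:25:00
```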

kaitj commented 7 months ago

Believe for this one it was Hayden^Covid-FollowUp on 20231214, but it's been happening for all studies (as far as I am aware) that the bidsdump account has access to. I'll take a look for more specific examples once I have access to the VM for the other examples I've looked at and reply to this thread (along with the findscu call).

I can see the scans if I log onto the dicom browser as bidsdump, just seems to be something with the XML that gets grabbed with findscu, which autobids is using to filter the available data before running any cfmm2tar processing.

kaitj commented 7 months ago

@isolovey - this was the command that was used. It's the same one that gets called by autobids, just with a different --out-dir and the username/password scrubbed.

findscu --bind DEFAULT --connect CFMM@dicom.cfmm.uwo.ca:11112 --accept-timeout 10000 --tls-aes --user username --user-pass password -m StudyDescription=Hayden^COVID-FollowUp --out-dir /tmp/test --out-file 000.xml -X

isolovey commented 7 months ago

The findscu call works for me with the bidsdump user. It produces one XML file per study with whatever attributes from ATTRIBUTES_QUERIED make sense at study level (implicit -L STUDY), i.e. StudyDescription, StudyInstanceUID, StudyID, PatientName, PatientID, PatientSex.

SeriesDescription and SeriesNumber are empty at this level because it's not a series-level query.
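
A rough way to check which of the queried attributes actually came back is to parse one of the XML files and list the tags that are missing or empty. The sketch below assumes the file follows the DICOM Native Model layout (`NativeDicomModel` / `DicomAttribute` / `Value`, with person names nested under `PersonName` / `Alphabetic`). The sample document, its UID and patient name, and the helper name `read_attributes` are invented for illustration and are not real server output.

```python
import xml.etree.ElementTree as ET

ATTRIBUTES_QUERIED = [
    "0020000D",  # StudyInstanceUID
    "00100010",  # PatientName
    "0008103E",  # SeriesDescription
    "00200011",  # SeriesNumber
    "00200010",  # StudyID
    "00100020",  # PatientID
    "00100040",  # PatientSex
]

# Made-up study-level response; real files will carry more attributes.
SAMPLE_XML = """<NativeDicomModel>
  <DicomAttribute tag="0020000D" vr="UI" keyword="StudyInstanceUID">
    <Value number="1">1.2.3</Value>
  </DicomAttribute>
  <DicomAttribute tag="00100010" vr="PN" keyword="PatientName">
    <PersonName number="1">
      <Alphabetic><FamilyName>Doe</FamilyName><GivenName>Jane</GivenName></Alphabetic>
    </PersonName>
  </DicomAttribute>
  <DicomAttribute tag="0008103E" vr="LO" keyword="SeriesDescription"/>
</NativeDicomModel>"""


def read_attributes(xml_text, tags):
    """Map each tag to its list of values, or None if missing or empty."""
    root = ET.fromstring(xml_text)
    found = {}
    for tag in tags:
        elem = root.find(f"./DicomAttribute[@tag='{tag}']")
        if elem is None:
            found[tag] = None
            continue
        # Plain values sit in <Value>; person names are split into components.
        values = [v.text for v in elem.iter("Value") if v.text]
        names = ["^".join(part.text for part in alpha if part.text)
                 for alpha in elem.iter("Alphabetic")]
        found[tag] = (values + names) or None
    return found


attrs = read_attributes(SAMPLE_XML, ATTRIBUTES_QUERIED)
missing = [tag for tag, value in attrs.items() if value is None]
print("missing or empty:", missing)
```

In this sample, SeriesDescription is present but empty and SeriesNumber is absent, so both end up in `missing`, which matches what a study-level response would look like.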

If you modify the query to make it series-level (add -L SERIES to the findscu command), it will complain that you're not providing a StudyInstanceUID. This is because by default a series-level C-FIND is a look-up of series of a particular study (which you identify unambiguously by its StudyInstanceUID).

You can request a query to be "relational" (i.e. more akin to a DB query) by adding it to the association Extended Negotiation, which is the --relational option in findscu. If you do that (i.e. append -L SERIES --relational to your query), you'll get an XML file for every study and series matching the query, i.e. 803 files for -m "StudyDescription=Hayden^COVID-FollowUp".
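
As a minimal sketch of that change, the argument list could be assembled in Python as below. It only uses the findscu options that already appear in this thread; the function name `build_findscu_args` and its parameters are hypothetical, and the credentials are placeholders.

```python
import shlex


def build_findscu_args(match, out_dir, level=None, relational=False,
                       user="username", password="password"):
    """Assemble a findscu argument list (hypothetical helper, not autobids code)."""
    args = [
        "findscu",
        "--bind", "DEFAULT",
        "--connect", "CFMM@dicom.cfmm.uwo.ca:11112",
        "--accept-timeout", "10000",
        "--tls-aes",
        "--user", user,
        "--user-pass", password,
    ]
    if level:
        args += ["-L", level]  # query level, e.g. STUDY or SERIES
    if relational:
        args.append("--relational")  # negotiate relational queries
    for key, value in match.items():
        args += ["-m", f"{key}={value}"]
    args += ["--out-dir", out_dir, "--out-file", "000.xml", "-X"]
    return args


series_args = build_findscu_args(
    {"StudyDescription": "Hayden^COVID-FollowUp"},
    out_dir="/tmp/test",
    level="SERIES",
    relational=True,
)
print(shlex.join(series_args))
```

Passing the list directly to `subprocess.run` would avoid shell quoting issues with characters such as `^` and `*`.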

I've searched for where ATTRIBUTES_QUERIED is used and found it in autobids-newstudy-app. Both uses of it are for series-level queries, but one of those doesn't seem to include StudyInstanceUID in the query, and does not negotiate for a relational query. So if that one fails, that makes sense. It's possible it worked before due to a bug in the DICOM server software.

isolovey commented 7 months ago

It looks like autobids-newstudy-app uses a wildcard query, i.e. -m StudyInstanceUID=*. This has been disabled by default by a recent change to our DICOM server application, dcm4che/dcm4chee-arc-light#4252. It can be re-enabled server-wide by a config setting, but it's much better to either not rely on such a wildcard query, or to explicitly specify --relational in the findscu call.

The way I'd do it if I had to get e.g. all Series Descriptions of all studies in Hayden^COVID-FollowUp, is I'd query at the STUDY level first, get a list of StudyInstanceUIDs and then query at the SERIES level. That's assuming you actually need to collect series-level attributes, which is not evident from how that function is being used. If instead you only need STUDY-level attributes, then the level needs to be changed to STUDY. I may be misunderstanding how autobids-newstudy-app is looking up studies though.
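
The two-stage approach could look roughly like this. The sketch leaves the actual findscu invocation and XML parsing behind a `run_query` callable, which is a hypothetical interface and not something autobids-newstudy-app provides; the fake runner and its UIDs exist only so that the example runs on its own.

```python
from typing import Callable, Dict, List

# run_query(level, match) returns one dict of attributes per matching result.
Query = Callable[[str, Dict[str, str]], List[Dict[str, str]]]


def series_descriptions_by_study(study_description: str,
                                 run_query: Query) -> Dict[str, List[str]]:
    """Stage 1: find studies. Stage 2: look up the series of each study."""
    studies = run_query("STUDY", {"StudyDescription": study_description})
    result: Dict[str, List[str]] = {}
    for study in studies:
        uid = study["StudyInstanceUID"]
        # A series-level query keyed by one StudyInstanceUID needs neither a
        # wildcard nor relational negotiation.
        series = run_query("SERIES", {"StudyInstanceUID": uid})
        result[uid] = [s["SeriesDescription"] for s in series]
    return result


# Stand-in for the DICOM server, with made-up UIDs and descriptions.
FAKE_SERIES = {"1.2.3": ["localizer", "T1w"], "1.2.4": ["bold"]}


def fake_query(level: str, match: Dict[str, str]) -> List[Dict[str, str]]:
    if level == "STUDY":
        return [{"StudyInstanceUID": uid} for uid in FAKE_SERIES]
    return [{"SeriesDescription": d}
            for d in FAKE_SERIES[match["StudyInstanceUID"]]]


by_study = series_descriptions_by_study("Hayden^COVID-FollowUp", fake_query)
print(by_study)
```

The cost is one extra query per study, in exchange for not depending on server-side wildcard or relational settings.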

kaitj commented 7 months ago

Thanks! The recent change plus a possible bug would explain why this suddenly stopped working. Re: wildcard query, we do provide users with the option to set that query in the config, but I find that it is rarely used. For now I've added --relational to the findscu call and that seems to have gotten it working again.

Will have to look into the two-stage querying like you've suggested!