Sydney-Informatics-Hub / xnat-uploader

Command-line tool for batch uploading of DICOMs to XNAT

Digest mismatches in new branch with scan IDs #98

Open spikelynch opened 1 year ago

spikelynch commented 1 year ago

These mismatches come from how I set up the tests: I grabbed a bunch of random anonymised abdomen x-rays and partitioned them into "neck" and "head" scans for one of the fake patients.

This worked OK when I was separating them out by scan type ("Neck CT" / "Head CT"). But now that I'm using SeriesNumber as the scan ID, which is the correct thing to do in order to get deidentification to work, all of the scans ("neck" and "head") have the same series number, so they're being uploaded to the same scan.

But as two pairs of scans have the same filenames, the second set of scans is overwriting two of the first, so the digest check is failing.
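To make the clobbering concrete, here is a minimal sketch of how such collisions could be spotted before upload. It is not the uploader's own code: the function name `find_filename_collisions`, the tuple layout, and the sample paths are all hypothetical, and it assumes the upload target is determined only by series number and file name.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_filename_collisions(files):
    """Group files by (series_number, file name) and return the groups
    with more than one member.

    `files` is an iterable of (series_number, scan_type, path) tuples.
    This layout is an assumption made for the example only.
    """
    seen = defaultdict(list)
    for series_number, scan_type, path in files:
        filename = PurePosixPath(path).name
        seen[(series_number, filename)].append((scan_type, path))
    # Keep only keys that more than one source file maps onto.
    return {key: hits for key, hits in seen.items() if len(hits) > 1}


# Made-up stand-in for the test data: "head" and "neck" files that
# share a series number, with one overlapping file name.
sample = [
    (1, "Head CT", "head/image001.dcm"),
    (1, "Head CT", "head/image002.dcm"),
    (1, "Neck CT", "neck/image001.dcm"),
    (1, "Neck CT", "neck/image003.dcm"),
]
collisions = find_filename_collisions(sample)
for (series_number, filename), hits in collisions.items():
    print(series_number, filename, [path for _, path in hits])
```

Any key reported here is a file that would be overwritten on upload, and so would fail the digest check.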

This isn't a bug in the code; it's caused by bogus test data. The fix is either to hack the DCMs to have distinct series number values, or to put them all in the same series.

spikelynch commented 1 year ago

Note: I also need to check the sample data for cases where this might happen.

If a patient has two sets of scans with overlapping file names which happen to have the same series number (with everything else, such as the date, being equal), then they'll also clobber one another.

The simplest fix would seem to be making sure that cases like this get two session labels, and thus two datasets: that is, incorporate the scan type into the session label along with the series number.
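A rough sketch of what such a label could look like follows. The function name `session_label`, the base label format, and the decision to reduce the scan type to letters, digits and underscores are all assumptions for illustration, not the tool's actual behaviour.

```python
import re


def session_label(base_label, scan_type, series_number):
    """Build a session label that includes both scan type and series number.

    Hypothetical helper: runs of characters other than letters and digits
    in the scan type are collapsed to a single underscore.
    """
    slug = re.sub(r"[^A-Za-z0-9]+", "_", scan_type).strip("_")
    return f"{base_label}_{slug}_{series_number}"


# Two scan sets sharing a series number now get distinct labels.
head = session_label("PAT001_20200101", "Head CT", 1)
neck = session_label("PAT001_20200101", "Neck CT", 1)
print(head)
print(neck)
```

With labels built this way, the "head" and "neck" sets land in separate sessions even when their series numbers and file names coincide.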