Closed richard-viney closed 2 years ago
I haven't looked at DIMSE in a long time, but I would think that you would have a VM
of greater than one if you have multiple UIDs in a query, and that we should keep the UI
VR
strictly matching the standard. That is, I think the backslash would be parsed at a different level.
Yep the backslash separator isn't part of the DICOM standard itself, but dcmjs already supports it on other VR types, e.g. IS
(integer string) and DS
(decimal string), which is quite handy, so UID seems to be the odd one out here in not supporting it (there may also be others, I haven't done an exhaustive search).
I didn't write this part of the code originally, so I'm reverse engineering here too, but at least for the decimal string (DS
) case it appears that rather than treating the backslash as part of the token it splits the string and returns a list of numbers. Wouldn't you want the same behavior for UID
?
The splitting on backslash is already being done here:
So simply letting the backslash through the UniqueIdentifier
parsing like this PR does is sufficient. For the decimal and integer string cases it's different because they need to convert to an actual number, or array of numbers, so the splitting is done in the ValueRepresentation
subclass.
I'm still not convinced we should treat a string with backslashes in it as a valid UID VR. Can you provide more information and a test or example use case in dicom-dimse where this is required?
Yep, certainly. I've compared to pydicom and it works in a similar way, i.e. it treats backslashes as UID delimiters in any DICOM content it parses. This isn't strictly correct, because it should really check against the VM (value multiplicity) of the tag it's parsing. In a standard DICOM file UID fields generally have a multiplicity of 1 (though a private field could have any multiplicity), and in the case of DIMSE the UID fields allow any amount of multiplicity. Doing this additional check against the tag's multiplicity would be a much larger change to dcmjs, and the only benefit is preventing multiplicity in those contexts where it isn't technically meant to be allowed. pydicom doesn't seem to have bothered with this, I think justifiably so, and just allows multiplicity anywhere, which is the approach dcmjs has also taken on DS and IS VRs, it just wasn't implemented for UID VRs.
The relevant part of the specs are:
https://dicom.nema.org/medical/dicom/current/output/chtml/part04/sect_C.2.2.2.2.html https://dicom.nema.org/medical/dicom/current/output/chtml/part05/sect_6.4.html
I've updated the PR with these links to the spec, and added a test for reading a DICOM that has multiple values for UID and DS tags.
Thoughts?
Sure thing, the binary file has been removed and put here https://drive.google.com/file/d/180YkJ6cqHKm0RphxMGCeUR0aFsUNDL1u/view?usp=sharing
When using this library with C-FIND requests via
dcmjs-dimse
, it is necessary to allow backslashes in UIDs because the spec says it is possible to query a list of UIDs by using a backslash as a delimiter.Without this change UID list queries aren't possible in dcmjs-dimse because the backslashes get stripped out.