pydicom / dicom-validator

Simple DICOM validator based on DocBook DICOM specs
MIT License
25 stars 11 forks source link

Condition parsing issue on 0.3.4 #20

Closed naterichman closed 1 year ago

naterichman commented 1 year ago

First of all, thanks for all the work you've put in here as well as over at main pydicom package.

I'm, using dicom-validator to both report on validation, but also given an SOPClassUID and dataset, determine if any required tags are missing by building a list of required tags.

I noticed this traceback:

  File "/usr/local/lib/python3.9/site-packages/dicom_validator/validator/iod_validator.py", line 157, in _object_is_required_or_allowed
    required = self._composite_object_is_required(condition)
  File "/usr/local/lib/python3.9/site-packages/dicom_validator/validator/iod_validator.py", line 172, in _composite_object_is_required
    required = self._object_is_required(condition)
  File "/usr/local/lib/python3.9/site-packages/dicom_validator/validator/iod_validator.py", line 197, in _object_is_required
    return self._tag_matches(tag_value, operator, condition['values'])
  File "/usr/local/lib/python3.9/site-packages/dicom_validator/validator/iod_validator.py", line 219, in _tag_matches
    values = [type(tag_value)(value) for value in values]
  File "/usr/local/lib/python3.9/site-packages/dicom_validator/validator/iod_validator.py", line 219, in <listcomp>
    values = [type(tag_value)(value) for value in values]
  File "/usr/local/lib/python3.9/site-packages/pydicom/valuerep.py", line 1125, in __new__
    newval = super().__new__(cls, float(val))
ValueError: could not convert string to float: 'overriding (specializing) the Type 1 requirement on this Attribute in the'

And when I look at the relevant section in the mod_info.json I see this condition seems to be getting parsed incorrectly

  "C.8.6.3": {
    ...
    "(0028,0009)": {
      "cond": {
        "index": 0,
        "op": ">",
        "tag": "(0028,0008)",
        "type": "MN",
        "values": [
          "1",
          "overriding (specializing) the Type 1 requirement on this Attribute in the"
        ]
      },
      "name": "Frame Increment Pointer",
      "type": "1C"
    },

Looks like this causes the problem later when trying to convert "overiding..." to the VR of (0028, 0008) (Number of Frames) to see if the criteria is matched.

I briefly looked to see if I could figure out where this parsing is happening in the Condition but there is a lot of logic in there that is a bit overwhelming, perhaps in the _valid_or_expressions_after_or_expression method here

The condition is coming from a section in part03 that looks like below, so it looks like it is getting confused on the comma perhaps?

                                    <para xml:id="para_0fa4ba5b-22d4-429b-b795-0bfd277fe866">Shall be present if Number of Frames is greater than 1, overriding (specializing) the Type 1 requirement on this Attribute in the <xref linkend="sect_C.7.6.6" xrefstyle="select: title"/>.</para>

Any help would be greatly appreciated!

mrbean-bremen commented 1 year ago

Thanks for the report! I have to admit that I have neglected this project for quite some time, but I will be happy to get back to it if somebody is actually using it. I'll probably have a closer look over the weekend.

naterichman commented 1 year ago

Sounds good and no rush, AFAIK this is the only package that actually parses the dicom standard and applies validation so it's awesome to have! Thanks again, and happy to contribute if you could point me to where the issue might be

mrbean-bremen commented 1 year ago

AFAIK this is the only package that actually parses the dicom standard and applies validatio

There is actually DVTk, which I found after writing this project. It does the validation based on definition files that are created from the DICOM standard, though I'm not sure if that is done automatically. I assume that they have some automatism to create an inital version, and adapt it manually, but that is only a guess (that part is not open source). They provide the definition files from 2010 in the repo, and I think you can buy versions with access to newer versions.

As this is backed by a company and not just a free time pet project like the DICOM validator, it is certainly in a better state.

mrbean-bremen commented 1 year ago

@naterichman - I just searched for that string ("overriding (specializing) ...") and found that it does not appear if using the main branch for parsing. Could you please check if the problem is fixed in the main branch (e.g. using pip install git+https://github.com/pydicom/dicom-validator)? If that is the case, I would make a new release, otherwise I will have a closer look later...

naterichman commented 1 year ago

@mrbean-bremen We tried updating to main and that does seem to fix the parsing of the original issue, however, I am getting another issue here when doing validation. The issue is that the rendered spec for C.7.6.17, DimensionIndexSequence has no values field:

  "C.7.6.17": {
    ...
    "(0020,9222)": {
      "cond": {
        "index": 0,
        "op": "-",
        "tag": "(0020,9311)",
        "type": "MU"
      },
      "items": {
        ...
        "(0020,9165)": {
          "name": "Dimension Index Pointer",
          "type": "1"
        },
        "(0020,9167)": {
          "cond": {
            "index": 0,
            "op": "=",
            "tag": "(0020,9165)",
            "type": "MN"
          },
          "name": "Functional Group Pointer",
          "type": "1C"
        }
        ...
      },
      "name": "Dimension Index Sequence",
      "type": "1C"
    }
    ...
  }

I tried myself and just changing to a default of an empty list seems to fix the issue

return self._tag_matches(tag_value, operator, condition.get('values', []))

However, I'm not sure that is the right solution, any ideas?

mrbean-bremen commented 1 year ago

Thank you - I will have a closer time later (probably tomorrow evening - I will need a bit of time to get back to the code), but your fix sounds reasonable. Also looks like something I should have seen... probably need to add some tests.

mrbean-bremen commented 1 year ago

Ok, the root cause seems to be that due to some parsing error some conditions that should have a value (conditions that check equality) do not have one. This concerns 40 conditions out of 567 for the current standard.

I have to either fix the conditions or remove them, if they are not parseable.

In your concrete case the condition for Functional Group Pointer reads:

Required if the value of Dimension Index Pointer (0020,9165) is the Data Element Tag of an Attribute that is contained within a Functional Group Sequence.

Obviously, this condition is not correctly parsed (this is a kind of condition that I currently don't support). It is incorrectly seen as a comparison for equality, and there is a check missing that a value is actually found.

In the Readme is already mentioned that condition evaluation may not work correctly, and some of the cases may fall in this category, but it is clearly a bug that this is not handled correctly.

I will see if I can fix some of the failing conditions, and ignore them in case they cannot be correctly parsed. This will take a bit of time, and I probably won't get to it before the weekend.

mrbean-bremen commented 1 year ago

@naterichman - can you please check with the main branch? If it works for you, I will make a new release.

naterichman commented 1 year ago

That looks like it is working thank you so much!

mrbean-bremen commented 1 year ago

As promised, I made a new patch release. And I'll probably get back to this project now - maybe I can fix a few more things...