imi-bigpicture / wsidicom

Python package for reading DICOM WSI file sets.
Apache License 2.0

Catch and fix Google Healthcare API errors #149

Closed psavery closed 5 months ago

psavery commented 5 months ago

This change adds support for Google Healthcare API DICOMweb servers, such as the NCI's Imaging Data Commons.

The problem: the Google Healthcare API raises an error if AvailableTransferSyntaxUID is requested as a field, or if SOPClassUID is used as a search filter.

SOPClassUID should definitely be allowed as an instance-level search filter, as documented in Table 10.6.1-5. Required Matching Attributes. However, this has apparently been a known problem for nearly four years (see here), so it may not be fixed anytime soon. Even if it is fixed, the Imaging Data Commons may not update their software right away. It would be highly advantageous to support such a large DICOMweb repository by working around the issue.

The fix in this PR is as follows:

  1. The two search_for_instances() calls are still performed exactly as before, as long as there are no HTTP errors.
  2. If there is an HTTP error with a 400 status_code, and its message matches one of the errors from the Google Healthcare API, then the search_for_instances() arguments are patched to work with the Google Healthcare API, as follows:
     a) AvailableTransferSyntaxUID is simply removed, if present.
     b) SOPClassUID is manually filtered, if present (meaning it is not supplied in the search_filters, but only instances with a matching SOPClassUID are returned).
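
The two steps above can be sketched roughly as follows. This is only an illustration of the retry-and-patch idea, not the code in this PR: `HttpError`, `is_known_server_error()`, `sop_class_uid()`, and `search_with_fallback()` are hypothetical names, the `search` callable stands in for the client's search_for_instances(), and the marker strings used to recognize the server's error messages are assumptions. It also assumes the search results are DICOM JSON dictionaries that include the SOPClassUID tag (00080016).

```python
from typing import Any, Callable, Dict, List, Optional

# DICOM JSON key for the SOPClassUID attribute (0008,0016).
SOP_CLASS_UID_TAG = "00080016"

Instance = Dict[str, Any]


class HttpError(Exception):
    """Hypothetical stand-in for the HTTP client's error type."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(message)
        self.status_code = status_code
        self.message = message


def is_known_server_error(error: HttpError) -> bool:
    # Assumption: the server's 400 message names the offending attribute.
    markers = ("AvailableTransferSyntaxUID", "SOPClassUID")
    return error.status_code == 400 and any(m in error.message for m in markers)


def sop_class_uid(instance: Instance) -> Optional[str]:
    # Read the first value of the SOPClassUID tag, or None if it is absent.
    values = instance.get(SOP_CLASS_UID_TAG, {}).get("Value") or [None]
    return values[0]


def search_with_fallback(
    search: Callable[..., List[Instance]],
    fields: List[str],
    search_filters: Dict[str, str],
) -> List[Instance]:
    try:
        # Step 1: run the unmodified query, exactly as before.
        return search(fields=fields, search_filters=search_filters)
    except HttpError as error:
        if not is_known_server_error(error):
            raise  # unrelated failures still propagate

    # Step 2a: drop the field the server rejects.
    patched_fields = [f for f in fields if f != "AvailableTransferSyntaxUID"]
    # Step 2b: take SOPClassUID out of the query and filter locally instead.
    patched_filters = dict(search_filters)
    wanted = patched_filters.pop("SOPClassUID", None)

    results = search(fields=patched_fields, search_filters=patched_filters)
    if wanted is None:
        return results
    return [r for r in results if sop_class_uid(r) == wanted]
```

Because the patched query is only issued after a recognized 400 error, servers that already accept the original query see exactly one request and no change in behavior.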

These changes should have no effect except when a Google Healthcare API server returns one of these errors. In that case, the function calls are patched and then work properly.

The following example works after this fix:

from wsidicom import WsiDicom, WsiDicomWebClient

url = 'https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb'
study_uid = '2.25.227261840503961430496812955999336758586'
series_uid = '1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0'

client = WsiDicomWebClient.create_client(url)

slide = WsiDicom.open_web(client, study_uid, series_uid)

Fixes: #141

erikogabrielsson commented 5 months ago

Hi @psavery

Thanks for your pull request. When parsing DICOM files we have tried to handle as many implementation errors as possible, so of course we should strive to do the same when reading from DICOMweb.

Your approach inspired me to make some changes that would enable us to (hopefully) re-use the _search_for_instances() method for other implementation errors as well. Is it OK if I push to your branch?

psavery commented 5 months ago

@erikogabrielsson Sure, feel free to push to it! :slightly_smiling_face:

We would love to have these changes in soon, so that we can start utilizing wsidicom with that large database!

psavery commented 5 months ago

Another possibility would be to merge these changes, and add new features as a separate PR.

psavery commented 5 months ago

Thanks so much, @erikogabrielsson! Can we get a new release so we can start using these features?

erikogabrielsson commented 5 months ago

> Thanks so much, @erikogabrielsson! Can we get a new release so we can start using these features?

Released in 0.19.0