gacou54 / pyorthanc

Python library that wraps the Orthanc REST API and facilitates the manipulation of data in Orthanc
MIT License

Processing "large" DICOM datasets - ReadTimeout: timed out error #32

Closed rjkowalski closed 1 year ago

rjkowalski commented 1 year ago

I am running the latest version of Orthanc (1.12.1) along with pyorthanc 1.11.5 on macOS Ventura 13.5.1 (Apple M2 Pro). I have loaded an fMRI DICOM dataset that contains 73365 DICOM instances and is ~900MB in size. When I try to do a resource modification (anonymization) via pyorthanc, the operation keeps failing with a ReadTimeout: timed out error. I have tried setting orthanc.timeout = 20000, but the anonymization still takes a very long time and times out. This is what I am using to anonymize:

anonymized_patient = Patient(id, orthanc).anonymize(keep=['PatientName'], replace={'PatientID': 'TheNewPatientID'}, remove=['ReferringPhysicianName'], force=True)

I would expect anonymization to take some time for this size dataset but I'm wondering if there is anything else I can do to have the process complete successfully (i.e. should I use the Async client in this case?). Any help would be greatly appreciated.

gacou54 commented 1 year ago

You raise a very good point! By default, pyorthanc waits for the job to be completed before returning the anonymized patient (which can cause a timeout, especially at the patient level).

The Orthanc client accepts more options in the request body, notably 'Asynchronous': True, which makes the call return a job rather than the new patient.

Right now I think you should be able to 1) start an anonymization job, 2) follow the job state, 3) make a new patient when the job is done.

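import time

from pyorthanc import Patient

# Start the anonymization as a job instead of waiting for the HTTP response.
job_id = orthanc.post_patients_id_anonymize(patient.id_, {'Asynchronous': True})['ID']
while True:
    job_info = orthanc.get_job_id(job_id)
    # State is one of 'Pending', 'Running', 'Success', 'Failure', 'Paused' or 'Retry'.
    if job_info['State'] == 'Success':
        # The job content holds the ID of the newly created anonymized patient.
        anonymous_patient = Patient(job_info['Content']['ID'], orthanc)
        break
    if job_info['State'] == 'Failure':
        # Stop polling instead of looping forever on a failed job.
        raise RuntimeError(f'Anonymization job {job_id} failed')
    time.sleep(2)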
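
For a dataset this large it may also help to bound the total wait, so the script cannot poll forever if the job stalls. Below is a minimal sketch of the same polling loop wrapped in a helper. The name wait_for_job and its poll_interval and max_wait parameters are made up for illustration and are not part of pyorthanc. It takes any callable that returns the job info dictionary (for example orthanc.get_job_id), and it assumes the job info may carry an 'ErrorDescription' field on failure.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(get_job: Callable[[str], Dict[str, Any]],
                 job_id: str,
                 poll_interval: float = 2.0,
                 max_wait: float = 3600.0) -> Dict[str, Any]:
    """Poll a job until it succeeds; raise on failure or once max_wait has elapsed.

    Hypothetical helper, not a pyorthanc function.
    """
    deadline = time.monotonic() + max_wait
    while True:
        job_info = get_job(job_id)
        state = job_info['State']
        if state == 'Success':
            return job_info
        if state == 'Failure':
            # 'ErrorDescription' is read defensively; it is assumed, not guaranteed.
            raise RuntimeError(f'Job {job_id} failed: {job_info.get("ErrorDescription")}')
        if time.monotonic() >= deadline:
            raise TimeoutError(f'Job {job_id} is still {state} after {max_wait} s')
        time.sleep(poll_interval)


# Possible usage with the client from this thread (not executed here):
# job_info = wait_for_job(orthanc.get_job_id, job_id)
# anonymous_patient = Patient(job_info['Content']['ID'], orthanc)
```

Passing the lookup function in as an argument keeps the helper independent of the client, so it can be tried out with a stub before pointing it at a real server.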

I will change Patient.anonymize() for the next release to make this process easier in the future! I think we should have a Job object.

rjkowalski commented 1 year ago

Thanks for your feedback, @gacou54. I'll give that a try. It would be great to be able to pass "KeepPrivateTags": true to the anonymize method as well.
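
Until the anonymize method supports it, one possible workaround is to put the option directly in the request body passed to the low-level client call, since that body is forwarded to Orthanc's anonymize route. This is a rough sketch: build_anonymize_payload is a hypothetical helper, not a pyorthanc function, and the keys are the ones I understand the Orthanc REST API to accept for anonymization ('Keep', 'Replace', 'Remove', 'Force', 'KeepPrivateTags', 'Asynchronous').

```python
from typing import Any, Dict, List, Optional


def build_anonymize_payload(keep: Optional[List[str]] = None,
                            replace: Optional[Dict[str, str]] = None,
                            remove: Optional[List[str]] = None,
                            force: bool = False,
                            keep_private_tags: bool = False,
                            asynchronous: bool = True) -> Dict[str, Any]:
    """Build the JSON body for an Orthanc anonymize request (hypothetical helper)."""
    payload: Dict[str, Any] = {
        'Asynchronous': asynchronous,
        'Force': force,
        'KeepPrivateTags': keep_private_tags,
    }
    # Only include the tag lists that were actually provided.
    if keep:
        payload['Keep'] = list(keep)
    if replace:
        payload['Replace'] = dict(replace)
    if remove:
        payload['Remove'] = list(remove)
    return payload


payload = build_anonymize_payload(
    keep=['PatientName'],
    replace={'PatientID': 'TheNewPatientID'},
    remove=['ReferringPhysicianName'],
    force=True,
    keep_private_tags=True,
)

# Possible usage with the client from this thread (not executed here):
# job_id = orthanc.post_patients_id_anonymize(patient.id_, payload)['ID']
```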

gacou54 commented 1 year ago

@rjkowalski absolutely. Actually, the functionality is implemented in a PR https://github.com/gacou54/pyorthanc/pull/33/files#diff-440ee36434c062ba3c5ac50be35d925f3f0d029fd6a7b94e6b5798ffef0d4a3d.

I hope to release the next pyorthanc version very soon.

gacou54 commented 1 year ago

Now added in main as .anonymize_as_job(); it will be in the next pyorthanc release.