GoogleCloudPlatform / healthcare-dicom-dicomweb-adapter

Adapter which transforms DIMSE requests to DICOMweb requests
Apache License 2.0

Issue with query/retrieve status #54

Closed cdoshi closed 4 years ago

cdoshi commented 5 years ago

Hi,

I am running the following command from the terminal: movescu -v -k 0008,0052=STUDY -k AccessionNumber=xxx -aet IMPORTADAPTER -aec SCP xxx.xxx.xx.xx 104

The first time I run it, it fails with the error "Move response with error status (Refused: OutOfResourcesSubOperations)", but when I check the data store, the images have been moved and are available. If I run the same command again, it succeeds.

Now if I delete the study from the store and run the command again, I again get the OutOfResources error. I see the same behavior with pynetdicom send_c_move as well.

Does anyone know why this is happening?

red1408 commented 5 years ago

Hi, need a few more details.

Is the error status 0xA700 (OutOfResources) or 0xA702 (UnableToPerformSubOperations)? "OutOfResourcesSubOperations" falls somewhere between the dcm4che terms I'm more familiar with.

What do you run the adapter against (healthcare api or something else)?

cdoshi commented 5 years ago

Hi @red1408, I am getting a 0xA702 error. I am currently using the terminal to perform cmove between the PACS and the adapter via the load balancer.

red1408 commented 5 years ago

0xA702 happens when all C-STORE sub-operations within a C-MOVE fail. Could you check the adapter's log output for error messages starting with "Failed CStore within CMove"?
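
For reference, the C-MOVE status codes discussed here can be kept in a small lookup table. This is only an illustrative sketch: the `describe_move_status` helper is a made-up name, the description wording is paraphrased, and the table lists just a few codes from the DICOM C-MOVE service definition rather than the full set.

```python
# Illustrative lookup of a few C-MOVE response status codes (DICOM PS3.4).
# The helper name and the description wording are assumptions for this
# sketch; they are not part of the adapter or of any DICOM toolkit.
MOVE_STATUS = {
    0x0000: "Success: sub-operations complete, no failures",
    0xFF00: "Pending: sub-operations are continuing",
    0xB000: "Warning: sub-operations complete, one or more failures",
    0xA701: "Refused: out of resources, unable to calculate number of matches",
    0xA702: "Refused: out of resources, unable to perform sub-operations",
    0xA801: "Refused: move destination unknown",
    0xFE00: "Cancel: sub-operations terminated due to Cancel indication",
}


def describe_move_status(status: int) -> str:
    """Return a readable description for a C-MOVE status code."""
    return MOVE_STATUS.get(status, f"Unknown or unlisted status 0x{status:04X}")


print(describe_move_status(0xA702))
```

Note that the text for 0xA702 mentions "out of resources" even though, as described above, the adapter returns it whenever every C-STORE sub-operation fails, whatever the underlying cause.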

cdoshi commented 5 years ago

Hi @red1408, I am trying to enable Stackdriver monitoring, but I am unable to do so. Here is my dicom_adapter.yaml file. Any ideas? I have deployed the adapter on Google Kubernetes Engine. Where exactly are the events logged?

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: xxx-dicom-adapter
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: xxx-dicom-adapter
    spec:
      containers:
        - name: xxx-import-adapter
          image: gcr.io/cloud-healthcare-containers/healthcare-api-dicom-dicomweb-adapter-import:0.1.12
          env:
          - name: ENV_POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: ENV_POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: ENV_CONTAINER_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          ports:
            - containerPort: xxxx
              protocol: TCP
              name: "port"
          command:
            - "/import/bin/import"
            - "--dimse_aet=IMPORTADAPTER"
            - "--dimse_port=2575"
            - "--dicomweb_address=xxxx"
            - "--dimse_cmove_aet=xx"
            - "--monitoring_project_id=xxxx"

red1408 commented 5 years ago

Stackdriver Monitoring collects only aggregate events (bytes moved by c-store and c-move, number of calls to services, number of errors, etc.). The YAML looks good to me. You can also check in the logs whether monitoring initializes correctly (the adapter logs this info at startup).

The actual logs go to Stackdriver Logging. To view them, open 'container logs' from your Kubernetes workload details in the Cloud Console.

cdoshi commented 5 years ago

Hey @red1408, I checked the container logs, but I cannot find anything related to cmove, cstore, etc. I do see some error messages, though: "limits.cpu needs updating. Is: '1', want: '1000m'." and "Resources are not within the expected limits".

I updated my dicom_adapter.yaml to add the following

resources:
  requests:
    cpu: "500m"
  limits:
    cpu: "500m"

but it is still not working.

red1408 commented 5 years ago

Stackdriver Logging does contain some extra logs that come from VM/GKE rather than adapter itself. These are mostly grouped before adapter's start.

The 'resources' section is not adapter-specific and is also optional. Setting requests/limits to "500m" should be valid, but that's about all I can say.
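
On the `Is: '1', want: '1000m'` message: in Kubernetes, one CPU equals 1000 millicores, so the two values are different spellings of the same amount. A minimal sketch of that conversion follows; `cpu_to_millicores` is a hypothetical helper written for illustration, and it only handles plain numbers and the `m` suffix.

```python
from fractions import Fraction


def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "1", "0.5" or "500m" to millicores.

    Hypothetical helper for illustration; only plain numbers and the
    "m" (milli) suffix are handled.
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # Fraction avoids floating-point rounding for values like "0.1".
    return int(Fraction(quantity) * 1000)


print(cpu_to_millicores("1") == cpu_to_millicores("1000m"))
print(cpu_to_millicores("500m"))
```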

cdoshi commented 5 years ago

I will take another look at the logs. In the meantime: I created the cluster based on the documentation at https://cloud.google.com/healthcare/docs/how-tos/mllp-adapter#deploying_the_mllp_adapter_to_google_kubernetes_engine, which was meant for HL7. Do you have the command you used to create the cluster that runs the dicom-adapter? Maybe there is some setting I am missing when creating the cluster itself.

cdoshi commented 5 years ago

Hey @red1408, I upgraded my cluster from n1-standard-1 to n1-standard-2 machines, which essentially doubled my CPU and memory, but I still get the 0xA702. I have followed all the steps provided for mllp_adapter, just replacing mllp_adapter with dicom_adapter. Is there anything else that I might have missed, or anything specific to setting up the DICOM adapter?

The weird thing is that the images are indeed transferred, but the status seems out of sync with that. It shows the correct status the second time I execute the same command. Basically, as long as the images are already present in the DICOM store, the right status is returned. Sorry for the trouble, but I am really stuck here.

red1408 commented 5 years ago

I don't think 0xA702 is a matter of CPU/memory resources, despite how 'OutOfResourcesSubOperations' may sound. 0xA702 just means that all c-store sub-operations within the c-move failed, which may have happened for a multitude of reasons. The DIMSE protocol does not describe a way to pass more detailed error info to the calling client, so you have to look in the adapter logs for details.

The only thing I can think of is that the c-store somehow fails after all the data was sent (e.g. the HTTP/2 data stream not closing properly).

cdoshi commented 5 years ago

Hey @red1408, thanks for pointing out that the error might not be what it says. Although I cannot point to the exact log, in Stackdriver I do see a custom.googleapis.com/dicomadapter/import/cstore_errors metric under Stackdriver -> Resources -> Metrics Explorer. Does that help?

red1408 commented 5 years ago

Well, this does mean that both standalone c-store (this event) and c-store within c-move (which has a separate event) experience errors. But the details are only present in the adapter log.

cdoshi commented 5 years ago

Hi @red1408, I visited the container logs of the workload but I do not see anything even remotely connected to cmove and cstore. Do you have an example of what you see at your end? Did you configure anything differently to make it work?

cdoshi commented 5 years ago

Hi @red1408, I finally managed to print out the adapter logs by adding a verbose flag to the dicom_adapter.yaml. I see an "Explicit VR Little Endian failed with http status code 409" error at com.google.cloud.healthcare.imaging.dicomadapter.CStoreService.store(CStoreService.java:84)

red1408 commented 5 years ago

Right, I always use the verbose flag, so I forgot it's optional. A 409 can happen when the c-move/c-store destination already has an instance with the same study/series/instance IDs, but some tags differ. The adapter adds a few tags during upload (ImplementationClassUID and ImplementationVersionName, if I remember right). So, for example, uploading the same original file through the adapter and directly via a Postman request to the Healthcare API (or any other means) can cause this issue.

cdoshi commented 5 years ago

@red1408 But the behavior seems to be the opposite: if my Google data store does not have that study/image, it throws a 409, but if the store already has that study/image, it works.

red1408 commented 5 years ago

As I remember, 409s do contain quite a bit of extra information in the response JSON, but only part of it gets passed up the stack (whatever fits in the ErrorComment tag). I'll add full logging for error responses to make such cases easier to debug.
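
The idea of logging the full response while still returning a short comment can be sketched as follows. This is not the adapter's code (the adapter is written in Java); it is a Python illustration, and the function and logger names are hypothetical. The 64-character limit comes from the LO value representation used by Error Comment (0000,0902).

```python
import logging

logger = logging.getLogger("dicom_adapter_sketch")  # hypothetical logger name

# Error Comment (0000,0902) has VR LO, which allows at most 64 characters.
ERROR_COMMENT_MAX_CHARS = 64


def report_http_error(status_code: int, response_body: str) -> str:
    """Log the complete response body and return a short Error Comment text.

    Hypothetical helper: the full body goes to the log, where it is not
    limited in length, and only a truncated summary is returned for the
    DIMSE response.
    """
    logger.error("DICOMweb request failed with HTTP %d: %s", status_code, response_body)
    summary = f"Http_{status_code}: {response_body}"
    return summary[:ERROR_COMMENT_MAX_CHARS]
```

With this split, the calling DIMSE client still receives only a brief comment, while the full explanation is available to whoever reads the server-side log.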

cdoshi commented 5 years ago

@red1408 This is the stacktrace for the error

textPayload: "org.dcm4che3.net.service.DicomServiceException: Http_409( status = 272 )
    at com.google.cloud.healthcare.imaging.dicomadapter.CStoreService.store(CStoreService.java:84)
    at org.dcm4che3.net.service.BasicCStoreSCP.onDimseRQ(BasicCStoreSCP.java:72)
    at org.dcm4che3.net.service.DicomServiceRegistry.onDimseRQ(DicomServiceRegistry.java:86)
    at org.dcm4che3.net.ApplicationEntity.onDimseRQ(ApplicationEntity.java:485)
    at org.dcm4che3.net.Association.onDimseRQ(Association.java:650)
    at org.dcm4che3.net.PDUDecoder.decodeDIMSE(PDUDecoder.java:459)
    at org.dcm4che3.net.Association.handlePDataTF(Association.java:634)
    at org.dcm4che3.net.State$4.onPDataTF(State.java:103)
    at org.dcm4che3.net.Association.onPDataTF(Association.java:630)
    at org.dcm4che3.net.PDUDecoder.nextPDU(PDUDecoder.java:177)
    at org.dcm4che3.net.Association$2.run(Association.java:478)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: com.google.cloud.healthcare.IDicomWebClient$DicomWebException: Http_409
    at com.google.cloud.healthcare.DicomWebClientJetty.stowRs(DicomWebClientJetty.java:145)
    at com.google.cloud.healthcare.imaging.dicomadapter.CStoreService.store(CStoreService.java:75)
    ... 13 more

red1408 commented 5 years ago

Yes, there is actually more info in raw 409 response, but it's not currently logged (which I plan to amend later). The only way you could check it right now is by debugging the adapter locally.

cdoshi commented 5 years ago

Hey @red1408, thank you so much for helping me out. I logged the detailed 409 response, and the error message explained the problem very well: "Value": ["Pubsub topic (projects/xxx/topics/xxx-dicom-uploaded) is not found in datastore"]. I had specified a Pub/Sub topic when creating the DICOM store but had forgotten to actually create that topic. Thank you again!!
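
For anyone hitting the same error, a small sketch of how the topic path from the message could be turned into `gcloud` commands to check for the topic and create it. The `topic_commands` helper is hypothetical, and it assumes the path has the `projects/<project>/topics/<topic>` form shown in the error message; it only builds the command strings and does not run them.

```python
import re
from typing import List

# Matches a full Pub/Sub topic path such as "projects/my-project/topics/my-topic".
TOPIC_PATTERN = re.compile(r"^projects/(?P<project>[^/]+)/topics/(?P<topic>[^/]+)$")


def topic_commands(topic_path: str) -> List[str]:
    """Build gcloud commands to describe and create the given topic.

    Hypothetical helper for illustration; values are not shell-quoted,
    so it is only suitable for simple project and topic names.
    """
    match = TOPIC_PATTERN.match(topic_path)
    if match is None:
        raise ValueError(f"Not a full topic path: {topic_path!r}")
    project, topic = match["project"], match["topic"]
    return [
        f"gcloud pubsub topics describe {topic} --project={project}",
        f"gcloud pubsub topics create {topic} --project={project}",
    ]


for command in topic_commands("projects/xxx/topics/xxx-dicom-uploaded"):
    print(command)
```

If the `describe` command reports that the topic is not found, running the `create` command should resolve the 409 described above, assuming the topic name matches the one configured on the DICOM store.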