hapifhir / hapi-fhir-jpaserver-starter

Apache License 2.0

Highly-parallel conditional creates cause duplicate resources to be created #459

Open chgl opened 1 year ago

chgl commented 1 year ago

If multiple clients send conditional-create (or update) transactions in parallel, all of them succeed instead of just one, which causes duplicate resources to be created.

I was initially investigating the multi-replica behavior of the HAPI FHIR server (once more, see https://github.com/hapifhir/hapi-fhir-jpaserver-starter/issues/48) but noticed this happening on single-instance setups as well. To reproduce in a local Kubernetes cluster:

  1. Set up the HAPI FHIR server, running as a single instance with an included PostgreSQL database:
kind create cluster # https://kind.sigs.k8s.io/docs/user/quick-start/

helm repo add hapifhir https://hapifhir.github.io/hapi-fhir-jpaserver-starter/

helm upgrade --install -n hapi --create-namespace --wait --set postgresql.auth.postgresPassword=fhir --set replicaCount=1 hapi-fhir-jpaserver hapifhir/hapi-fhir-jpaserver
  2. Launch three pods, all of which try to send the same FHIR transaction bundle at the same time. Save the following as deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fhir-sender
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: fhir-sender
  template:
    metadata:
      labels:
        app.kubernetes.io/name: fhir-sender
    spec:
      containers:
        - name: fhir-sender
          image: docker.io/curlimages/curl:7.86.0@sha256:cfdeba7f88bb85f6c87f2ec9135115b523a1c24943976a61fbf59c4f2eafd78e
          command: ["/bin/sh", "-c"]
          args:
            - |
              curl --fail-with-body \
                   --retry 5 \
                   --retry-connrefused \
                   --request POST \
                   --header 'Content-Type: application/fhir+json' \
                   --data "@/tmp/bundle.json" "${FHIR_SERVER_BASE_URL}"
          env:
            - name: FHIR_SERVER_BASE_URL
              value: "http://hapi-fhir-jpaserver:8080/fhir"
          volumeMounts:
            - name: fhir-bundle
              mountPath: /tmp
              readOnly: true
          resources:
            requests:
              cpu: 1000m
              memory: 64Mi
            limits:
              memory: 64Mi
              cpu: 1000m
      volumes:
        - name: fhir-bundle
          configMap:
            name: fhir-bundle
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fhir-bundle
data:
  bundle.json: |
    {
      "resourceType": "Bundle",
      "type": "transaction",
      "entry": [
        {
          "fullUrl": "urn:uuid:9d30422c-dfa2-4f43-bf38-d01ad5a5ded8",
          "resource": {
            "resourceType": "Patient",
            "identifier": [
              {
                "system": "https://fhir.example.com/identifier/patient-id",
                "value": "3"
              }
            ],
            "name": [
              {
                "family": "Jones311",
                "given": ["Bruce168"]
              }
            ],
            "gender": "male",
            "birthDate": "1987-04-25"
          },
          "request": {
            "method": "POST",
            "url": "Patient",
            "ifNoneExist": "Patient?identifier=https%3A%2F%2Ffhir.example.com%2Fidentifier%2Fpatient-id%7C3"
          }
        }
      ]
    }

And finally run:

kubectl apply -f deployment.yaml -n hapi

Once each pod has run at least once, the issue should surface. To confirm, access the FHIR server by port-forwarding:

kubectl port-forward -n hapi deployment/hapi-fhir-jpaserver 8080:8080

and check the number of Patient resources:

curl 'http://localhost:8080/fhir/Patient?_summary=count'
{
  "resourceType": "Bundle",
  "id": "f956c1e0-56e8-4521-9347-ae1a73769511",
  "meta": {
    "lastUpdated": "2022-11-22T20:39:56.619+00:00",
    "tag": [ {
      "system": "http://terminology.hl7.org/CodeSystem/v3-ObservationValue",
      "code": "SUBSETTED",
      "display": "Resource encoded in summary mode"
    } ]
  },
  "type": "searchset",
  "total": 3
}

The total is 3, but with conditional creates I would have expected it to be just 1.
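
One plausible explanation for a total of 3 is a check-then-create race: each request searches for a match, finds none, and then creates the resource. This is an assumption and has not been confirmed against the HAPI FHIR code. The sketch below models only that pattern in plain Python. The in-memory `store` and the function `run_unsynchronised` are hypothetical stand-ins, and the barrier deliberately forces the worst-case interleaving.

```python
import threading


def run_unsynchronised(n_clients=3):
    """Simulate n clients doing a conditional create with no synchronisation."""
    store = []  # hypothetical stand-in for the Patient table
    barrier = threading.Barrier(n_clients)
    identifier = "https://fhir.example.com/identifier/patient-id|3"

    def conditional_create():
        # Step 1: search for an existing match (the ifNoneExist check).
        exists = any(r["identifier"] == identifier for r in store)
        # Force the worst case: every client finishes its search
        # before any client writes.
        barrier.wait()
        # Step 2: create only if the earlier search found nothing.
        if not exists:
            store.append({"resourceType": "Patient", "identifier": identifier})

    threads = [threading.Thread(target=conditional_create) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(store)


if __name__ == "__main__":
    print(run_unsynchronised())  # prints 3
```

With three clients, every search completes before the first write, so all three creates go through. That matches the observed count but does not prove the server behaves this way.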

Note that URL-decoding the ifNoneExist URL doesn't seem to make a difference.
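
For reference, the encoded and decoded forms of the ifNoneExist value describe the same search. The sketch below uses only Python's standard `urllib.parse` to show the round trip. It says nothing about how the server itself parses the value.

```python
from urllib.parse import quote, unquote

# Identifier system and value taken from the bundle above.
identifier = "https://fhir.example.com/identifier/patient-id|3"

# Percent-encode every reserved character, including ':', '/' and '|'.
encoded = "Patient?identifier=" + quote(identifier, safe="")

# Decoding restores the plain "system|value" token.
decoded = unquote(encoded)

if __name__ == "__main__":
    print(encoded)
    print(decoded)
```

The encoded string is the one used in the bundle. The decoded string is the variant that made no difference.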

Interestingly, I can't seem to reproduce the same behavior on HAPI FHIR v5.4.1.

XcrigX commented 1 year ago

I was thinking of something like this to solve a problem with concurrent PUTs of the same resource: https://github.com/hapifhir/hapi-fhir/issues/4690

A similar scheme could be used for conditional creates by locking on the ifNoneExist string rather than the resource ID. It would not work in all cases, but it would when clients send identical ifNoneExist strings for the same resource.
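
A minimal sketch of that idea, assuming one in-process lock per ifNoneExist string. The class `KeyedLocks`, the function `run_with_keyed_lock`, and the in-memory `store` are hypothetical names for illustration. This is not the design from the linked issue and not HAPI FHIR code.

```python
import threading
from collections import defaultdict


class KeyedLocks:
    """Hands out one lock per key. The lock map is never pruned in this sketch."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock map itself
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, key):
        with self._guard:
            return self._locks[key]


def run_with_keyed_lock(n_clients=3):
    """Simulate n clients whose search and create run under a per-key lock."""
    locks = KeyedLocks()
    store = {}  # hypothetical store: ifNoneExist string -> created resources
    key = "Patient?identifier=https://fhir.example.com/identifier/patient-id|3"

    def conditional_create():
        # Search and create happen under the same lock, so no other
        # client with the same key can interleave between the two steps.
        with locks.lock_for(key):
            matches = store.setdefault(key, [])
            if not matches:
                matches.append({"resourceType": "Patient"})

    threads = [threading.Thread(target=conditional_create) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(store[key])


if __name__ == "__main__":
    print(run_with_keyed_lock())  # prints 1
```

Clients that express the same condition with different strings would get different locks, which is the case where this scheme does not help.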

chgl commented 1 year ago

Thanks for sharing! I believe that would only address a single HAPI FHIR process with multiple threads, so unfortunately there would be no synchronisation across network boundaries.
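
One possible direction for cross-process locking, offered only as an assumption, is to take the lock in the shared database rather than in the JVM. PostgreSQL provides `pg_advisory_xact_lock(bigint)`, which holds a lock until the current transaction ends. The sketch below shows only how a stable 64-bit key could be derived from the ifNoneExist string. The function `advisory_lock_key` is hypothetical, and nothing here is something HAPI FHIR is known to do.

```python
import hashlib
import struct


def advisory_lock_key(if_none_exist: str) -> int:
    """Map an ifNoneExist string to a signed 64-bit integer (PostgreSQL bigint)."""
    digest = hashlib.sha256(if_none_exist.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a big-endian signed 64-bit integer.
    (key,) = struct.unpack(">q", digest[:8])
    return key


# Statement a client could run inside the transaction that performs the
# search and the create; the parameter is the key computed above.
LOCK_SQL = "SELECT pg_advisory_xact_lock(%s)"

if __name__ == "__main__":
    query = "Patient?identifier=https://fhir.example.com/identifier/patient-id|3"
    print(LOCK_SQL, advisory_lock_key(query))
```

Every replica would compute the same key for the same string, so the database would serialise them. As with the in-process scheme, differently written but equivalent conditions would map to different keys.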