NIH-NCPI / ncpi-model-forge

🔥 The Project Forge FHIR model
Apache License 2.0

Supporting Bulk Export Operation #51

liberaliscomputing closed this issue 3 years ago

liberaliscomputing commented 3 years ago

The FHIR Bulk Data Access protocol describes a workflow for requesting bulk records in an interoperable format. Research/enable this functionality in the NCPI servers and document the querying process in detail.

liberaliscomputing commented 3 years ago

1. Supported Requests

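The FHIR Bulk Data Access specification states that bulk export should be available at the following endpoints:

[fhir base]/$export (system level)
[fhir base]/Patient/$export (patient level)
[fhir base]/Group/[id]/$export (group level)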

However, as of the latest release (2020.11.R01), the Smile CDR server supports only system-level bulk export. I enabled this functionality on the NCPI controlled and development servers.

2. Requesting Bulk Export

A bulk data extract can be requested via either a GET or a POST request. It is a multi-step procedure that follows the FHIR Asynchronous Request Pattern: at least three consecutive requests are needed to retrieve the records (one to kick off the export, one or more to poll the job status, and one per output file to download it). The user flow below uses GET requests.

2.1 Initiating a Bulk Extract

To kick off the extract with a GET request, set the request headers (Accept and Prefer) and the query parameters (_outputFormat and _type) as shown in the example.

Then send the GET request to [fhir base]/$export as follows:

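# -G appends the --data-urlencode values to the URL as a query string.
# The URL is single-quoted so that the shell does not expand $export.
curl -G \
    -H "Accept: application/fhir+json" \
    -H "Prefer: respond-async" \
    -H "Cookie: AWSELBAuthSessionCookie-0=<your-cookie>" \
    --data-urlencode "_outputFormat=application/ndjson" \
    --data-urlencode "_type=Patient" \
    'https://ncpi-api-fhir-service-dev.kidsfirstdrc.org/$export'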

The above request initiates bulk extraction of Patient resources.
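For scripted use, the same kick-off request can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive client: the function name build_kickoff_request and its parameters are hypothetical, and it assumes the cookie-based authentication shown in the curl example. It only builds the request; nothing is sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_kickoff_request(fhir_base, resource_types, cookie):
    """Build (but do not send) the GET request that starts a system-level export.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # _type takes a comma-delimited list of resource types.
    query = urlencode({
        "_outputFormat": "application/ndjson",
        "_type": ",".join(resource_types),
    })
    url = f"{fhir_base.rstrip('/')}/$export?{query}"
    headers = {
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",  # ask the server to run the export asynchronously
        "Cookie": f"AWSELBAuthSessionCookie-0={cookie}",
    }
    return Request(url, headers=headers, method="GET")
```

Passing the returned object to urllib.request.urlopen would send the request. Building it separately keeps the URL and headers easy to inspect before anything goes over the network.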

2.2 Retrieving a Job Status

Once the bulk export request is successfully submitted, the server will return a response similar to the following:

HTTP/1.1 202 Accepted
Content-Location: https://ncpi-api-fhir-service-dev.kidsfirstdrc.org/$export-poll-status?_jobId=0000000-1111111-2222222

The 202 status indicates that the bulk operation has been kicked off as a background job that will assemble the Bulk Export payload. The Content-Location header gives the URL to poll for the status of that job. Copy this URL and send a follow-up GET request with your cookie as follows:

# The URL is single-quoted so that the shell does not interpret $ or ?.
curl -X GET \
    -H "Accept: application/fhir+json" \
    -H "Cookie: AWSELBAuthSessionCookie-0=<your-cookie>" \
    'https://ncpi-api-fhir-service-dev.kidsfirstdrc.org/$export-poll-status?_jobId=0000000-1111111-2222222'
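In a script, the polling URL does not need to be copied by hand; it can be read from the Content-Location header of the kick-off response. The following is a minimal Python sketch under that assumption. The helper name build_poll_request is hypothetical, and the response headers are passed in as a plain dictionary so the example runs without a server.

```python
from urllib.request import Request


def build_poll_request(kickoff_headers, cookie):
    """Build the status-polling GET request from the kick-off response headers.

    Hypothetical helper: kickoff_headers is any mapping of header name to value.
    """
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in kickoff_headers.items()}
    poll_url = normalized.get("content-location")
    if not poll_url:
        raise ValueError("kick-off response has no Content-Location header")
    return Request(
        poll_url,
        headers={
            "Accept": "application/fhir+json",
            "Cookie": f"AWSELBAuthSessionCookie-0={cookie}",
        },
        method="GET",
    )
```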

While the job is still in progress, the server will return a response similar to the following:

HTTP/1.1 202 Accepted
X-Progress: Build in progress - Status set to BUILDING at 2020-12-02T20:04:06.664+00:00
Retry-After: 120
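A polling loop has to decide, from each response, whether to wait and try again or to stop. The sketch below shows one way to make that decision in Python. The function name poll_delay and the 60-second fallback are assumptions for illustration; it treats 202 as "still building" and 200 as "finished", and it only handles the seconds form of Retry-After.

```python
def poll_delay(status, headers, default_delay=60):
    """Return the seconds to wait before polling again, or None when the export is done.

    Hypothetical helper: default_delay is an arbitrary fallback, not a server value.
    """
    if status == 200:
        return None  # export finished; the response body holds the file list
    if status != 202:
        raise RuntimeError(f"unexpected polling status: {status}")
    normalized = {name.lower(): value for name, value in headers.items()}
    retry_after = normalized.get("retry-after", "").strip()
    # Retry-After may also be an HTTP date; this sketch falls back to the default then.
    return int(retry_after) if retry_after.isdigit() else default_delay
```

With the in-progress response shown above, poll_delay(202, {"Retry-After": "120"}) evaluates to 120, so a caller would sleep for two minutes before sending the polling request again.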

Once the server has completed the extract, the same polling request returns a 200 OK response similar to the following, whose payload lists the files assembled by the export job as Binary resources:

{
  "transactionTime": "2020-12-02T20:04:06.664+00:00",
  "request": "/$export?_outputFormat=application%2Ffhir%2Bndjson&_type=Patient",
  "output": [
    {
      "type": "Patient",
      "url": "https://ncpi-api-fhir-service-dev.kidsfirstdrc.org/Binary/516453"
    },
    {
      "type": "Patient",
      "url": "https://ncpi-api-fhir-service-dev.kidsfirstdrc.org/Binary/516454"
    },
    {
      "type": "Patient",
      "url": "https://ncpi-api-fhir-service-dev.kidsfirstdrc.org/Binary/516455"
    },
    {
      "type": "Patient",
      "url": "https://ncpi-api-fhir-service-dev.kidsfirstdrc.org/Binary/516456"
    },
    {
      "type": "Patient",
      "url": "https://ncpi-api-fhir-service-dev.kidsfirstdrc.org/Binary/516457"
    }
  ]
}
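The download URLs can be pulled out of this payload programmatically. The following Python sketch assumes the payload has the shape shown above (an "output" array of objects with "type" and "url" keys); the function name output_urls is hypothetical.

```python
import json


def output_urls(manifest_text, resource_type=None):
    """Return the file URLs listed in a completed export response.

    Hypothetical helper: optionally keeps only entries of one resource type.
    """
    manifest = json.loads(manifest_text)
    return [
        entry["url"]
        for entry in manifest.get("output", [])
        if resource_type is None or entry.get("type") == resource_type
    ]
```

Applied to the response above, output_urls(body) would return the five Binary URLs in order.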

2.3 Retrieving Records

Given the above response body, send a GET request to each URL in the output array, again with your cookie, to retrieve the bulk records.
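Each downloaded file should contain one resource per line, since the export was requested with _outputFormat=application/ndjson. Assuming the Binary URLs return that newline-delimited JSON as text, a minimal Python sketch for turning a downloaded file into resources looks like this (the function name parse_ndjson is hypothetical):

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON into a list of resources (one dict per line).

    Hypothetical helper: blank lines, such as a trailing newline, are skipped.
    """
    return [json.loads(line) for line in text.split("\n") if line.strip()]
```

Parsing line by line means a large file can also be processed as a stream instead of being loaded in one piece, which is the main reason bulk export uses NDJSON rather than a single JSON document.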