LinuxForHealth / FHIR

The LinuxForHealth FHIR® Server and related projects
https://linuxforhealth.github.io/FHIR
Apache License 2.0

Resources Device, Organization, Practitioner not included in BulkDataExport (Group Export) #3247

Open sagarsarvankar opened 2 years ago

sagarsarvankar commented 2 years ago

Is your feature request related to a problem? Please describe: When doing a Group export, resources that are not in the Patient compartment are not exported, for example Device, Organization, and Practitioner.

We would like to have these resources exported, because they are referenced by other resources.

Describe the solution you'd like: The default behavior could stay as is, but anyone who needs resource types beyond the defaults should be able to configure additional types so that they become part of the Group export.


Additional context: Some discussion happened at https://chat.fhir.org/#narrow/stream/212434-ibm/topic/Bulk.20Export

prb112 commented 2 years ago

Thanks @sagarsarvankar we'll look into it.

sagarsarvankar commented 2 years ago

Hello @prb112, do you think this will be considered soon?

lmsurpre commented 2 years ago

I've taken a closer look at this one and unfortunately it's quite tricky. The current Patient/Group export works something like this:

  1. fhir-operation-bulkdata takes the incoming request, computes the set of resource types to include for the export, and submits an export job
  2. fhir-bulkdata-webapp receives the job and the Liberty batch framework calls PatientExportPartitionMapper
  3. for each resource type in the job that has at least 1 resource instance, this mapper will create a separate "partition"
  4. each job "partition" gathers the list of patient resource ids (via 'search' for Patient/$export, or by iterating over the group members for Group export), then performs a 'compartment search' for resources of its type that are associated with at least one of those ids.

Because each partition is associated with a single resource type, it has just a single output stream to a target/sink for its little slice of the export. While it would be possible to add "_include" parameters to these searches and pull back the corresponding resource types, it would require the implementation to then branch on the result entries and write each one to a resource-type-specific target/sink.

If lots of resource types reference the same target resources, the disparate partitions would need to somehow coordinate with one another to avoid writing duplicates and to minimize the overall number of export files (so you don't end up with a bunch of export files with just a few resource instance rows).
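To make the two problems above concrete (routing included resources to type-specific sinks, and avoiding duplicates across partitions), here is a minimal sketch. It is not the server's code: `route_entries`, the in-memory sinks, and the sample resources are hypothetical, and the real partitions run independently in the batch framework rather than sharing one Python process.

```python
from collections import defaultdict


def route_entries(entries, sinks, seen=None):
    """Write each resource to the sink for its own resource type.

    `seen` is an optional shared set of (type, id) pairs. Without it,
    nothing stops two partitions from writing the same resource twice.
    """
    for res in entries:
        key = (res["resourceType"], res["id"])
        if seen is not None:
            if key in seen:
                continue  # another partition already exported this one
            seen.add(key)
        sinks[res["resourceType"]].append(res)


org = {"resourceType": "Organization", "id": "org1"}

# Hypothetical search results for two partitions; each one pulled back
# the same Organization as an "_include"-style extra entry.
observation_partition = [{"resourceType": "Observation", "id": "obs1"}, org]
encounter_partition = [{"resourceType": "Encounter", "id": "enc1"}, org]

# No coordination: the Organization is written once per partition.
uncoordinated = defaultdict(list)
for partition in (observation_partition, encounter_partition):
    route_entries(partition, uncoordinated)

# Shared "seen" set: the Organization is written only once.
coordinated = defaultdict(list)
shared_seen = set()
for partition in (observation_partition, encounter_partition):
    route_entries(partition, coordinated, shared_seen)
```

In a single process the shared set is trivial; the hard part described above is providing the equivalent of `shared_seen` (and shared sinks) across separately running partitions.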

Net: to do this properly, it would be a huge undertaking and likely a redesign of the feature. Not something we can sign up for right now. If you can think of a better way to do it, do make a proposal.

lmsurpre commented 2 years ago

As a workaround, you can use our system-level export to export all instances of a given resource type and then perform client-side processing.
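A rough sketch of what that client-side processing could look like, assuming both exports have been downloaded as NDJSON text and that references are relative literals such as `Organization/org1` (absolute URLs, versioned references, and identifier-only references are not handled). The function names and sample data are hypothetical.

```python
import json


def collect_references(node, found=None):
    """Recursively gather every `reference` string inside a resource."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "reference" and isinstance(value, str):
                found.add(value)
            else:
                collect_references(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_references(item, found)
    return found


def filter_referenced(group_ndjson, system_ndjson):
    """Keep only system-export resources referenced by the group export."""
    wanted = set()
    for line in group_ndjson.splitlines():
        if line.strip():
            wanted |= collect_references(json.loads(line))
    kept = []
    for line in system_ndjson.splitlines():
        if not line.strip():
            continue
        res = json.loads(line)
        if f"{res['resourceType']}/{res['id']}" in wanted:
            kept.append(res)
    return kept


# Sample data (made up): one Observation from a Group export, and two
# Organizations from a system-level export of that resource type.
group_export = json.dumps({
    "resourceType": "Observation",
    "id": "obs1",
    "subject": {"reference": "Patient/p1"},
    "performer": [{"reference": "Organization/org1"}],
})
system_export = "\n".join([
    json.dumps({"resourceType": "Organization", "id": "org1"}),
    json.dumps({"resourceType": "Organization", "id": "org2"}),
])

referenced_orgs = filter_referenced(group_export, system_export)
```

Only `org1` survives the filter here, because it is the only Organization that a resource in the group export points at.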

sagarsarvankar commented 2 years ago

Hello @lmsurpre ,

Thank you for the detailed explanation. Yes, it does sound like a redesign of the feature.

While trying out something that could work, we made the code changes highlighted in the screenshot below.

[image]

But this alone does not help. The logs showed that the query returned zero records, as seen in the screenshot below, so no records were exported for this resource type.

[image]

After carefully reviewing the query, we inserted the required row into the 'Location_RESOURCE_TOKEN_REFS' table for the patient. After doing this, we ran group/$export and it exported the expected records.

I have a question here: why was the record not inserted into 'Location_RESOURCE_TOKEN_REFS' and similar tables (for Device, Practitioner, Organization, Medication) during data creation, whereas rows were inserted into all other Patient compartment tables (*_RESOURCE_TOKEN_REFS)?

Please let me know your thoughts.

Thank you, Sagar

lmsurpre commented 2 years ago

"why was the record not inserted into 'Location_RESOURCE_TOKEN_REFS' and similar tables (for Device, Practitioner, Organization, Medication) during data creation, whereas rows were inserted into all other Patient compartment tables (*_RESOURCE_TOKEN_REFS)?"

Because Location, Device, Practitioner, Organization, and Medication aren't members of the patient compartment. We do support customizing this compartment definition in the registry. But, unfortunately, there's no way to "reverse include" resource instances at this time (i.e. logic like "Location is in the patient compartment if it's referenced from some other resource type in the compartment"). That would be similar to #2371 but even more complicated because the references are going the "wrong way". If you could ensure that the Location resource always exists when the resource that references it is created, then you could possibly add that row when that other resource is indexed. Feel free to play with it.
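To illustrate the difference between normal compartment membership and the "reverse include" idea, here is a simplified sketch. It is not how the server indexes resources: the element table is a small illustrative subset keyed by element name rather than by search parameter, only top-level references are inspected, and all names are hypothetical.

```python
# Illustrative subset only: resource type -> top-level elements that may
# point at the patient. Types absent from this table (Organization,
# Location, ...) have no forward rule that makes them members.
PATIENT_COMPARTMENT_ELEMENTS = {
    "Observation": ["subject", "performer"],
    "Encounter": ["subject"],
}


def top_level_references(resource):
    """Collect reference strings found directly under top-level elements."""
    refs = set()
    for value in resource.values():
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict) and isinstance(item.get("reference"), str):
                refs.add(item["reference"])
    return refs


def in_patient_compartment(resource, patient_id):
    """Forward rule: the resource itself references the patient."""
    target = f"Patient/{patient_id}"
    elements = PATIENT_COMPARTMENT_ELEMENTS.get(resource["resourceType"], [])
    for element in elements:
        value = resource.get(element)
        items = value if isinstance(value, list) else [value]
        if any(isinstance(i, dict) and i.get("reference") == target for i in items):
            return True
    return False


def reverse_included(candidate, members):
    """Hypothetical reverse rule: some compartment member references the candidate."""
    key = f"{candidate['resourceType']}/{candidate['id']}"
    return any(key in top_level_references(m) for m in members)


observation = {
    "resourceType": "Observation",
    "id": "obs1",
    "subject": {"reference": "Patient/p1"},
    "performer": [{"reference": "Organization/org1"}],
}
organization = {"resourceType": "Organization", "id": "org1"}
```

The forward rule can be decided by looking at one resource in isolation, which is why it can be evaluated when that resource is indexed. The reverse rule needs the set of resources already in the compartment, so the answer for `organization` can change whenever some other resource is created or updated.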

sagarsarvankar commented 2 years ago

Ok @lmsurpre , we will go through the source some more and will get back to you if we have any questions.