google / fhir-data-pipes

A collection of tools for extracting FHIR resources and analytics services on top of that data.
https://google.github.io/fhir-data-pipes/
Apache License 2.0

Keep the resource IDs consistent when the objects are converted from HAPI->Avro->HAPI. #1003

Open chandrashekar-s opened 3 months ago

chandrashekar-s commented 3 months ago

To keep resource IDs consistent across systems, only the ID part of each resource ID is extracted, i.e., for id = http://localhost:9021/openmrs/ws/fhir2/R3/Person/bee471c4-7e08-4a31-b9d8-a0c0bd2ab103 only the ID part, id = bee471c4-7e08-4a31-b9d8-a0c0bd2ab103, is kept. This change was made earlier to fix this issue.
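A minimal sketch of that extraction, in plain Java with no HAPI dependency. The class and method names (`FhirIdParts`, `idPart`) are hypothetical and not taken from the pipeline code; the sketch assumes a version suffix, when present, has the FHIR REST form `/_history/<version>`. In HAPI itself, `IdType.getIdPart()` serves the same purpose.

```java
final class FhirIdParts {
  private FhirIdParts() {}

  /** Returns the logical ID part of a full, relative, or bare FHIR resource ID. */
  static String idPart(String fullId) {
    String s = fullId;
    // Drop a version suffix such as "/_history/3", if present (assumed format).
    int hist = s.indexOf("/_history/");
    if (hist >= 0) {
      s = s.substring(0, hist);
    }
    // The logical ID is whatever follows the last remaining slash.
    int slash = s.lastIndexOf('/');
    return slash >= 0 ? s.substring(slash + 1) : s;
  }

  public static void main(String[] args) {
    System.out.println(
        idPart("http://localhost:9021/openmrs/ws/fhir2/R3/Person/"
            + "bee471c4-7e08-4a31-b9d8-a0c0bd2ab103"));
    System.out.println(idPart("Person/bee471c4-7e08-4a31-b9d8-a0c0bd2ab103"));
  }
}
```

Both calls in `main` pass the same logical ID in different forms, so a full URL and a `<ResourceType>/<ID>` form are reduced to the same value.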

However, when resources are loaded from JSON into HAPI objects using the IParser class, the IDs are usually created with the pattern <ResourceType>/<ID>. When such a HAPI object is converted to an Avro record and back to a HAPI object, only the ID part is retained in the final reconverted HAPI object. The two forms should be made consistent.

bashir2 commented 2 months ago

An update on this issue: In PR #1026 I tried to address this by adding a fullId option to keep the full ID when creating Avro records. This turned out to be a bad idea because, for example, /history/ is also part of the "full ID". Including base URLs is not a good idea either, because depending on how we read the input resources they may or may not be present (e.g., reading from the Search API vs. JDBC). For these reasons I have reverted the fullId change.

One way to address this particular issue is to add only the resource type when converting Avro records back to HAPI (and exclude everything else, e.g., base URL and history). We should also note that per the FHIR documentation the id element's regex is [A-Za-z0-9\-\.]{1,64}, so a pure id cannot contain a /.
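As a rough illustration of this option, the sketch below validates a pure id against the regex quoted above and then prepends only the resource type. It is plain Java; `RelativeIds`, `isPureId`, and `withResourceType` are hypothetical names, not existing functions in this repository or in HAPI.

```java
import java.util.regex.Pattern;

final class RelativeIds {
  private RelativeIds() {}

  // Regex for the FHIR id element, as quoted in the comment above.
  private static final Pattern FHIR_ID = Pattern.compile("[A-Za-z0-9\\-\\.]{1,64}");

  /** True if the string is a pure FHIR id (no slash, no base URL, no history). */
  static boolean isPureId(String id) {
    return FHIR_ID.matcher(id).matches();
  }

  /** Builds "<ResourceType>/<ID>" from a pure id; rejects anything else. */
  static String withResourceType(String resourceType, String id) {
    if (!isPureId(id)) {
      throw new IllegalArgumentException("Not a pure FHIR id: " + id);
    }
    return resourceType + "/" + id;
  }

  public static void main(String[] args) {
    System.out.println(
        withResourceType("Person", "bee471c4-7e08-4a31-b9d8-a0c0bd2ab103"));
  }
}
```

Because the Avro record would hold only the pure id, the resource type has to be supplied by the caller at conversion time; base URL and history never enter the result.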

Whatever we choose as the solution for this issue, we should make sure that JOIN queries on references work without any string manipulation, e.g., Observation.subject.patientId matches Patient.id.
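To make the JOIN requirement concrete, here is a small in-memory sketch in plain Java. The maps stand in for the Patient and Observation tables and `ReferenceJoin` is a hypothetical name; the point is only that the join key is compared with exact equality, so a reference stored as `Patient/123` would not match a `Patient.id` of `123`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ReferenceJoin {
  private ReferenceJoin() {}

  /**
   * Joins observations to patients by exact string equality on the patient ID.
   *
   * @param patientNameById stand-in for the Patient table (Patient.id -> name)
   * @param patientIdByObservationId stand-in for Observation.subject.patientId
   * @return observation ID -> patient name, for rows whose IDs match exactly
   */
  static Map<String, String> join(
      Map<String, String> patientNameById,
      Map<String, String> patientIdByObservationId) {
    Map<String, String> joined = new LinkedHashMap<>();
    for (Map.Entry<String, String> obs : patientIdByObservationId.entrySet()) {
      // No string manipulation: the stored reference must equal Patient.id.
      String name = patientNameById.get(obs.getValue());
      if (name != null) {
        joined.put(obs.getKey(), name);
      }
    }
    return joined;
  }
}
```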