cerner / bunsen

Explore, transform, and analyze FHIR data with Apache Spark
https://engineering.cerner.com/bunsen
Apache License 2.0

Support FHIR profiles to use only used elements #18

Open boristyukin opened 6 years ago

boristyukin commented 6 years ago

Right now Bunsen creates an encoder and schema for the entire set of resource elements. Realistically, though, FHIR servers return a limited set of vendor-specific elements, and those are commonly described in a FHIR profile. I think supporting profiles would greatly simplify the schemas in Spark and reduce confusion for users, since dataframes would contain only the elements a vendor actually supports.

rbrush commented 6 years ago

I think we can do this pretty easily but will need to take a closer look at how we can pull in profiles. All of our schemas and encoders are generated from the RuntimeResourceDefinition in the HAPI library. [1] So, if there is a way to get a RuntimeResourceDefinition for a given profile, we should be able to drop it right in and only the appropriate fields would be generated. I haven't worked enough with HAPI and specific profiles to have a good idea of how to put that together, but I'd presume there is a good way to do so.

[1] https://github.com/jamesagnew/hapi-fhir/blob/master/hapi-fhir-base/src/main/java/ca/uhn/fhir/context/RuntimeResourceDefinition.java
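
To make the request concrete, here is a minimal sketch of the pruning idea in plain Python. It does not use Bunsen or HAPI: the toy schema, the profile element list, and the `prune_schema` helper are all hypothetical names made up for illustration. It only assumes a profile can be reduced to a set of supported element paths.

```python
# Hypothetical illustration only: none of these names come from Bunsen or HAPI.

# A toy "full" Patient schema: element name -> type (nested dicts stand in for structs).
FULL_PATIENT_SCHEMA = {
    "id": "string",
    "active": "boolean",
    "birthDate": "date",
    "gender": "string",
    "name": {"family": "string", "given": "array<string>", "suffix": "array<string>"},
    "maritalStatus": {"text": "string"},
}

# Element paths that a made-up vendor profile declares as supported.
PROFILE_ELEMENTS = {"id", "birthDate", "gender", "name.family", "name.given"}


def prune_schema(schema, supported, prefix=""):
    """Return a new dict holding only the elements whose paths are in `supported`."""
    pruned = {}
    for field, field_type in schema.items():
        path = prefix + field
        if isinstance(field_type, dict):
            if path in supported:
                # The whole struct is listed in the profile, so keep it whole.
                pruned[field] = field_type
            else:
                # Otherwise keep the struct only if some nested element survives.
                nested = prune_schema(field_type, supported, path + ".")
                if nested:
                    pruned[field] = nested
        elif path in supported:
            pruned[field] = field_type
    return pruned


profile_schema = prune_schema(FULL_PATIENT_SCHEMA, PROFILE_ELEMENTS)
print(profile_schema)
```

In this sketch `active`, `maritalStatus`, and `name.suffix` drop out because the made-up profile does not list them. In Bunsen itself the equivalent filtering would presumably happen where the schema is derived from the resource definition, as described above, rather than as a post-processing step like this.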