w3c / hcls-fhir-rdf

Sketching out an RDF representation for FHIR

Add an explicit relationship between FHIR properties and their extensions #84

Closed dbooth-boston closed 1 year ago

dbooth-boston commented 4 years ago

A FHIR JSON extension like this:

 "active": true,
 "_active": {
   "extension" : [ {
      "url" : "http://example.org/fhir/boolean/Certainty",
      "valueDecimal" : 0.75
   }]
 }

will come through as RDF like this:

[] fhir:active   true ;
   fhir:_active  [
     fhir:extension [
       fhir:index 0 ;       # To retain ordering of multiple extensions
       fhir:url  "http://example.org/fhir/boolean/Certainty" ;
         fhir:valueDecimal  0.75
     ]
   ] .

For RDF processing it would be helpful if an explicit relationship were expressed between fhir:active and fhir:_active triples, so that applications do not have to parse apart URIs looking for similar URIs that differ only by a preceding underscore in the last part. It should be easy in SPARQL (for example) to look for any extensions to any properties.

Should our JSON pre-processor add such a relationship? If so, what should it be? One obvious possibility might be to declare:

fhir:active fhir:extensionProperty fhir:_active .

Using that, a SPARQL query to find all extensions on property ?p might be something like this:

SELECT ?node ?p ?pExtension WHERE {
  ?p fhir:extensionProperty ?pExtension .
  ?node ?p ?value ;
        ?pExtension ?extensionValue .
}

I don't know how efficient this will be to process though, because ?pExtension is a variable, which may require table scans instead of indexing. In the past I have used nested SELECT statements to force variables to be bound, to help performance.

SELECT ?node ?p ?pExtension WHERE {
  { SELECT ?p ?pExtension WHERE {
      ?p fhir:extensionProperty ?pExtension . }}
  ?node ?p ?value ;
        ?pExtension ?extensionValue .
}

(I have not tested this, so I am merely speculating.)

Is a better approach possible, for explicitly connecting fhir:active properties to their corresponding fhir:_active properties?
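
As a rough illustration of the pre-processor step asked about above, the following Python sketch scans one FHIR JSON object for underscore-prefixed keys and emits the proposed linking triples as Turtle lines. The function name extension_property_triples is hypothetical, fhir:extensionProperty is only the predicate name floated in this comment, and prefix declarations are omitted.

```python
import json


def extension_property_triples(resource):
    """Yield one Turtle line per '_'-prefixed key found in the object.

    Hypothetical helper: it only inspects the top level of a single JSON object.
    """
    for key in sorted(resource):
        # FHIR JSON carries extensions on a primitive under the same name with a
        # leading underscore; the plain value itself may be absent.
        if key.startswith("_") and len(key) > 1:
            yield f"fhir:{key[1:]} fhir:extensionProperty fhir:{key} ."


fhir_json = json.loads("""
{
  "active": true,
  "_active": {
    "extension": [
      {"url": "http://example.org/fhir/boolean/Certainty", "valueDecimal": 0.75}
    ]
  }
}
""")

triples = list(extension_property_triples(fhir_json))
for line in triples:
    print(line)
```

Because the relationship holds between properties rather than between individual nodes, the same lines could equally be generated once from the FHIR definitions instead of per instance.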

hsolbrig commented 4 years ago

Could we get rid of the lexical connection altogether? And combine basic data and extensions? Using N3 we could say:

_:A1 fhir:active true.
_:A1 fhir:active _:B1.
_:B1 a fhir:Extension .
_:B1 <http://example.org/fhir/boolean/Certainty> "0.75"^^xsd:decimal .

The turtle equivalent would be:

[] fhir:active   true,
   [
      a fhir:Extension ;
      <http://example.org/fhir/boolean/Certainty> "0.75"^^xsd:decimal
   ] .

for "complex data types" (e.g. inner elements):

_:A2 fhir:Observation.effectivePeriod _:B2 .
_:B2 fhir:Period.start "2013-04-02T09:30:10+01:00"^^xsd:dateTime .
_:B2 fhir:Period.end "2019-04-02T09:30:10+17:00"^^xsd:dateTime .
_:A2 fhir:Observation.effectivePeriod _:C2 .
_:C2 a fhir:Extension .
_:C2 <http://example.org/fhir/Observation/fancyExtension> _:D2 .
_:D2  :p1 :o1 .
_:D2  :p2 :o2 .
_:C2 <http://example.org/fhir/Observation/notSofancyExtension> 42 .

The Turtle equivalent would be akin to

fhir:Observation.effectivePeriod [
     fhir:Period.start "2013-04-02T09:30:10+01:00"^^xsd:dateTime;
     fhir:Period.end "2019-04-02T09:30:10+17:00"^^xsd:dateTime;
     fhir:extension
     [
              a fhir:Extension;
              <http://example.org/fhir/Observation/fancyExtension> [
                      ...
              ];
              <http://example.org/fhir/Observation/notSofancyExtension> 42
      ]
  ];

hsolbrig commented 4 years ago

The general idea of the above approach is that the difference between an extension and a FHIR built-in type would be the "a fhir:Extension" type arc. FWIW, we could allow type arcs on non-extensions as well:

fhir:Observation.effectivePeriod [
     a fhir:Period ;
     fhir:Period.start "2013-04-02T09:30:10+01:00"^^xsd:dateTime;
     fhir:Period.end "2019-04-02T09:30:10+17:00"^^xsd:dateTime;
]

or even

fhir:Observation.effectivePeriod [
     a fhir:Period ;
     fhir:start "2013-04-02T09:30:10+01:00"^^xsd:dateTime;
     fhir:end "2019-04-02T09:30:10+17:00"^^xsd:dateTime;
]

(I don't think the latter approach would work, however, as there could be a lot of "start" tags...)

dbooth-boston commented 4 years ago

We could certainly take that approach, by using the JSON preprocessor/tweaker to attach fhir:active extensions directly to fhir:active properties. But I was hoping that our design would allow a naive JSON-LD 1.1 processor -- i.e., without using the preprocessor/tweaker -- to produce valid, usable FHIR/RDF, even if it would lack some of the information available if the preprocessor/tweaker were used. But if the monotonicity objective causes too much difficulty then it may not be worth it.

hsolbrig commented 4 years ago

What is the "monotonicity objective" you are referring to in this context? Where do you see this approach violating "[producing] usable FHIR/RDF, even if it would lack some of the information available if the preprocessor/tweaker were used"?

dbooth-boston commented 4 years ago

The monotonicity objective is that a naive JSON-LD 1.1 processor -- i.e., operating without our proposed preprocessor/tweaker -- would produce usable, valid FHIR/RDF ("minimal R5"); but if the preprocessor/tweaker were used, then the JSON-LD 1.1 processor would produce valid FHIR/RDF that is a superset/supergraph of the minimal R5 RDF. I.e., the minimal R5 RDF would be monotonically improved by using the preprocessor/tweaker.

I may be using the term "monotonicity" a bit loosely here, but I hope that clarifies a bit. And I may also be taking liberties with the term "superset/supergraph", given that blank nodes are involved.

Basically, my thinking is that it would be nice if users could get useful valid FHIR/RDF even if they don't have an appropriate preprocessor readily available, or they don't want the extra step of running it. If adherence to this objective would make the resulting FHIR/RDF too hard to process, then it may not be worth the cost. But if it is achievable, then it would be nice.

If you look at the example in issue w3c/FHIRCat#3, a naive JSON-LD 1.1 processor, when faced with a FHIR extension on fhir:active, will produce RDF that has a node with both a fhir:active property and a fhir:_active property (with a leading underscore), with the latter holding the extra bnode and extension data. So if the same FHIR/JSON data is later processed WITH the preprocessor/tweaker that instead puts the extra bnode and extension data on the fhir:active property, then the extra bnode and extension data will appear both on the fhir:active property and on the fhir:_active property. But I guess that's another viable design option also, since the extra fhir:_active properties could be ignored.
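
The two designs discussed in this thread can be sketched at the JSON level. The Python below is a hypothetical version of the preprocessor/tweaker (the name attach_extensions and the keep_underscore flag are assumptions made for illustration, not part of any existing tool): it attaches the content of each underscore-prefixed key to its base property as an extra value, and optionally keeps the underscore key so that the naive output remains part of the result.

```python
import copy


def attach_extensions(resource, keep_underscore=True):
    """Return a copy in which each '_x' value is also attached as a value of 'x'.

    Hypothetical sketch: handles only top-level, single-valued properties.
    """
    out = copy.deepcopy(resource)
    for key in list(resource):
        if not key.startswith("_"):
            continue
        base = key[1:]
        extra = copy.deepcopy(resource[key])
        # Keep the original primitive value (if any) and add the extension node beside it.
        values = [out[base]] if base in out else []
        out[base] = values + [extra]
        if not keep_underscore:
            del out[key]
    return out


naive = {
    "active": True,
    "_active": {
        "extension": [
            {"url": "http://example.org/fhir/boolean/Certainty", "valueDecimal": 0.75}
        ]
    },
}

tweaked = attach_extensions(naive)
print(tweaked["active"])      # original value followed by the extension node
print("_active" in tweaked)   # the naive key is still present
```

With keep_underscore=True the extension data appears under both keys, which corresponds to the duplication described above. Whether the array under "active" turns into repeated fhir:active triples depends on the JSON-LD context, which is assumed here rather than shown.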

dbooth-boston commented 1 year ago

@balhoff or Daniel Stone, was this implemented in the generated OWL ontology?

balhoff commented 1 year ago

@dbooth-boston there are triples in the OWL ontology such as:

fhir:sourceScope fhir:modifierExtensionProperty fhir:_sourceScope .

and

fhir:Patient fhir:modifierExtensionClass fhir:_Patient .

dbooth-boston commented 1 year ago

Done. It is in the ontology. https://build.fhir.org/fhir.rdf.ttl.zip