cmungall opened this issue 2 years ago
@cmungall Thanks for the suggestions. Actually, the state of this repo and the org is still a bit confusing. I've just updated the README to clarify things.
@ShahimEssaid Probably what we should do is rename this repository from `hapi-fhir-jpaserver-starter` to `hapi-projects`, and force push the current codebase here. That way our code and issues are all in the same repository.
@cmungall I edited your OP to add numbers to the questions for easy reference. Taking a stab now at answering some of them. (@chrisroederucdenver @ShahimEssaid we can add answers to these in some docs / the README at some point perhaps).
3. The idea is: (`.owl` file) -> `fhir-owl` tool converts to a CodeSystem JSON -> POST || PUT to server.
5. The `fhir-owl` tool is not created yet. We may want to continue the discussion there when I create that repo. My general idea, though, is to simply allow for as much as possible. Anything not recognized will be added as FHIR extension elements. It's very lenient: it basically accepts whatever you give it, with no prior data modeling needed. We can move towards more formal definitions and constraints in the future as time allows.
5.i. Most likely I will map `rdfs:label` directly to concept names.
5.ii. Going to try to add as much of this as possible, using FHIR extensions as needed.
5.iii. That's a good question. I haven't looked closely at the Obographs output yet, but my hope is that I can add the information contained within these axioms to the concepts; a lot of it will be FHIR extension elements.
5.iv. FHIR natively supports multi-hierarchy using "properties". It is easy to add your own properties to a code system and have concepts use them. "Properties" is a bit of a misnomer, IMO; it refers both to concept properties and to relationship types.

Thanks, this is useful to me; hopefully it is useful to you to express these things (I don't want to create busy work or distractions).
We're conforming to FHIR R4, and hopefully soon R5 as well. We do have extensions, e.g. SSSOM fields on ConceptMap, but we don't have a formal data model written down.
Got it, thanks
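Pulling 5.i and 5.iv together, here is a minimal sketch of what the planned `fhir-owl` conversion might look like. It assumes rdflib; since the tool doesn't exist yet, all URLs, file names, and property codes here are illustrative, not the actual design.

```python
# Hedged sketch of the planned (not yet written) fhir-owl conversion,
# combining 5.i (rdfs:label -> concept name) and 5.iv (multi-hierarchy
# via "properties"). All URLs, file names, and property codes are illustrative.
from rdflib import Graph, RDFS
from rdflib.namespace import OWL, RDF

g = Graph()
g.parse("ont.owl", format="xml")  # hypothetical input ontology (RDF/XML)

concepts = []
for cls in g.subjects(RDF.type, OWL.Class):
    label = g.value(cls, RDFS.label)
    if label is None:
        continue  # unlabeled classes are skipped in this sketch
    concepts.append({
        "code": str(cls),
        "display": str(label),  # 5.i: rdfs:label becomes the concept name
    })

code_system = {
    "resourceType": "CodeSystem",
    "url": "http://example.org/fhir/CodeSystem/ont-demo",  # hypothetical
    "status": "draft",
    "content": "complete",
    # 5.iv: "property" declares concept properties and relationship types
    # alike: the standard "parent" plus a custom partonomy edge.
    "property": [
        {"code": "parent", "type": "code"},
        {"code": "part-of", "uri": "http://example.org/props/part-of", "type": "code"},
    ],
    "concept": concepts,
}
```

Anything not recognized would land in FHIR extension elements, per 5.ii and 5.iii.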
A single Ubuntu server.
sorry I meant architecture not infrastructure!
A little bit about it here
I'll check this out, but it's a bit hard to grok without insider knowledge. I'm sure that if I check the HAPI docs I will learn more, e.g. is there a relational database, a triplestore, Mongo, etc. as a back end?
The idea is: (.owl file) -> fhir-owl tool converts to a CodeSystem JSON -> POST || PUT to server
Maybe I am just too old school, but I am wary of service-based solutions where file-based ingest would work; there are issues with timeouts on large ontologies and with authentication. But you can likely ignore my concerns here if the broader HAPI infrastructure works happily.
This is under the purview of a new fhir-owl tool not created yet
Got it. This is one area where I could help, at least with sanity checks; I have a lot of experience with wild-west ontologies that do surprising things with metadata modeling.
- Not sure what you mean
Some ontology services like OLS will run an OWL reasoner to do a classification step in advance, unless the ontology is configured otherwise. I think that is a bad idea, but this is the subject of ongoing discussion between myself, Nico, David OS, Jim, and others. If you don't do a classification step, you are bound to run into some ontology that releases a version that hasn't been classified in advance. This is nuts IMO, but if you don't have a strategy in place, then the ontology will look fragmented and flat. The best thing to do is to say: this is not the wild west, there are some standards your OWL must adhere to before we include it.
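For what it's worth, the pre-classification step itself can be scripted. A minimal sketch using owlready2 (one option among several; ROBOT + ELK is the more usual choice in the OBO world, and the path here is hypothetical):

```python
# Classify an ontology ahead of ingest so the server never sees a
# fragmented, flat hierarchy. Assumes owlready2 (which shells out to a
# bundled HermiT, so Java is required) and a hypothetical local file.
from owlready2 import get_ontology, sync_reasoner

onto = get_ontology("file:///data/ont.owl").load()
with onto:
    sync_reasoner()  # materialize the inferred class hierarchy into onto
onto.save(file="ont-classified.owl", format="rdfxml")
```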
The other reasoning use case is closures. If you want to include anatomy ontologies like Uberon that don't use SNOMED SEP hacks, then you likely want to support operations like "include all parts of the brain". You can approximate this by doing naive graph walking over a set of predicates, but it's better to run relation-graph ahead of time to precompute the closure.
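To make the contrast concrete, the naive graph-walking approximation looks roughly like this (a sketch over hypothetical edge data; relation-graph's precomputed closure is the better option at scale):

```python
# Naive transitive closure over a chosen set of predicates: answers
# "include all parts of the brain" by walking edges upward at query time.
from collections import defaultdict

# Hypothetical (child, predicate, parent) edges extracted from an ontology.
edges = [
    ("cerebellum", "part-of", "hindbrain"),
    ("hindbrain", "part-of", "brain"),
    ("purkinje-cell-layer", "part-of", "cerebellum"),
]

parents = defaultdict(set)
for child, pred, parent in edges:
    if pred in {"part-of", "is-a"}:
        parents[child].add(parent)

def ancestors(term):
    """All terms reachable by walking part-of/is-a edges upward."""
    seen, stack = set(), [term]
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

# "all parts of the brain" = every term whose ancestors include "brain"
print({child for child, _, _ in edges if "brain" in ancestors(child)})
```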
@cmungall re: 4 "what is the plans for a registry and how is that coordinated with other registries" Are you thinking of registries, like a longitudinal clinical study? If not please clarify, maybe provide some links.
@cmungall 2 "what is the infrastructure. E.g. are all ontologies loaded into main memory and served that way? is there a database?" "A single ubuntu server." "sorry I meant architecture not infrastructure!"
HAPI-FHIR is an implementation of the FHIR spec: an API meant to serve a combination of clinical and terminological data. "jpaserver" refers to a back end for that API that uses a database like Postgres to serve both. I saw a question elsewhere about concerns I understand as trying to empty the pool through a straw. That is, if you're trying to get a whole ontology (or even a large part of one), an API meant for interaction with single concepts will indeed have performance issues. But that's not the FHIR use case as I understand it. Consider that the terminology server and the clinical server are often one and the same: when data is entered or modified on the server, it can use the local terminologies to validate those changes. User interactions like looking up a term by id or text are also reasonably served with such an architecture.
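Concretely, the single-concept interactions that this architecture serves well are the standard FHIR terminology operations, e.g. a `$lookup` (a hedged sketch against a hypothetical local server; the system URI and code are illustrative):

```python
# Single-concept lookup: the access pattern a JPA-backed FHIR server
# handles well. Server URL, system URI, and code are illustrative.
import requests

resp = requests.get(
    "http://localhost:8080/fhir/CodeSystem/$lookup",
    params={"system": "http://purl.obolibrary.org/obo/hp.owl", "code": "HP:0001250"},
)
resp.raise_for_status()
print(resp.json())  # a Parameters resource: display, designations, properties
```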
As far as emptying the pool, and doing so in a way flexible to different schemas, this is the reason I'm interested in LinkML, OAK, etc. When the goal is to become the Red Hat of vocabulary distributions, you need to be able to deal with concepts en masse, not individual concepts. So there are at least two very different use cases in play.
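In FHIR terms, "emptying the pool through a straw" means paging through search bundles link by link, roughly like this (a hedged sketch; the local URL is illustrative):

```python
# Paging through a FHIR search bundle link by link: fine for a handful of
# resources, painfully slow as a way to export whole ontologies.
import requests

url = "http://localhost:8080/fhir/CodeSystem"  # hypothetical local server
while url:
    bundle = requests.get(url).json()
    for entry in bundle.get("entry", []):
        print(entry["resource"]["id"])
    # follow the "next" link until the server stops supplying one
    url = next(
        (link["url"] for link in bundle.get("link", []) if link["relation"] == "next"),
        None,
    )
```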
@cmungall Here's the pool-straw issue, question 3:
"what is the overall data flow for how external ontologies get loaded? via the OWLAPI?" "The idea is: (.owl file) -> fhir-owl tool converts to a CodeSystem JSON -> POST || PUT to server." "Maybe I am just too old school but I am wary of services based solutions where file-based ingest would work, there are issues with timeouts on large ontologies and authentication, but you can likely ignore my concerns here if the broader hapi infrastructure works happily here"
When the use-case involves whole ontologies, yeah, emptying the pool through a straw is worth careful consideration.
(edit) slide 7 in an OAK slide deck has more of ChrisM's thinking.
6 is gold. Computing closures has come up: "what is the strategy for reasoning 6.i is there an assumption ontologies are pre-classified (this is a good assumption IMO)"
I'll probably create a separate issue ticket for this but regardless, ChrisM has my attention.
Registries: I meant of ontologies.
E.g. let's say you are standing up a service that provides access to 50 vocabularies. There is presumably some kind of ETL process to bring those in. Maybe it's trivial, e.g. if all vocabularies are in OWL and available from a public URL (unlikely for closed clinical terminologies, I know). But more often than not, what happens is you end up spinning up a system with metadata on each of your sources, with all the bespoke configuration each one needs (e.g. to load SKOS vocabulary X, we need to map foaf:name to rdfs:label; see the sketch below).
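Something like this hypothetical per-source registry is what I mean; every name, URL, and predicate here is illustrative:

```python
# Hypothetical per-source registry: each entry carries the bespoke ETL
# configuration that source needs, e.g. a label-predicate remapping.
from rdflib import Graph
from rdflib.namespace import FOAF, RDFS

REGISTRY = {
    "vocab-x": {
        "url": "https://example.org/vocab-x.ttl",  # illustrative source
        "format": "turtle",
        "label_predicate": FOAF.name,  # this source uses foaf:name, not rdfs:label
    },
}

def load_source(name):
    cfg = REGISTRY[name]
    g = Graph()
    g.parse(cfg["url"], format=cfg["format"])
    # normalize: copy the source's label predicate onto rdfs:label
    for s, _, o in g.triples((None, cfg["label_predicate"], None)):
        g.add((s, RDFS.label, o))
    return g
```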
Regarding (3), I agree it is better to have a way to load without the need for HTTP. Here's how we're loading them now (example). The server URL is defined by the env variable `HAPI_R4`, which isn't defined in that script, but I believe `localhost` will work fine, so at least there's that.
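In script form, that pattern amounts to roughly the following (a hedged sketch, not the actual linked script; the file name and resource id are illustrative):

```python
# Read the server base URL from HAPI_R4, falling back to localhost.
import json
import os

import requests

base_url = os.environ.get("HAPI_R4", "http://localhost:8080/fhir")

with open("hp.codesystem.json") as f:  # hypothetical converter output
    code_system = json.load(f)

# PUT is create-or-update at a known id, so re-running the load is idempotent.
resp = requests.put(
    f"{base_url}/CodeSystem/hp",
    json=code_system,
    headers={"Content-Type": "application/fhir+json"},
)
resp.raise_for_status()
```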
Regarding (6), I agree w/ Chris R that this is a really good point you bring up. I added a step to the existing OWL/OBO issue to go over our results (after we've converted and uploaded to the server) and learn more about what we should do for reasoning/classification.
edit: Just copy/pasting here the update to `README.md` that I made recently:
TIMS (Terminology Infrastructure Management Systems), AKA HOT (Health Open Terminology) ecosystem, is developing a FHIR server: http://20.119.216.32:8000/r4/swagger-ui/
This repository is not the actual codebase being deployed, but is simply a holding place for issues.
The current codebase is here.
The README is useful to understand how to set this up, but for an outsider like me it's difficult to tell what the overall objectives are, and how this relates to https://github.com/hapifhir/hapi-fhir
In particular, it would be useful to know
Prioritize this issue accordingly. Maybe I am missing some documentation elsewhere. If all devs know what they are doing then you can close this. But I think I could be of more use if some basic assumptions were stated somewhere. It may be more useful for you all too.