Open rjalexa opened 2 years ago
Oh yes, I think I see that. You've chosen to only import the Full files, but the transitive closure routine works with the Snapshot file. I've always imported 'A' for all files, and then the Snapshot file will be found. The reason that the xsct variant is being complained about is because it checks for sct2 first, and then xsct2 so the error message is being a bit lazy there.
I should add in a check that refuses to generate the transitive closure if the Snapshot files are not extracted, but in the meantime, please either select Snapshot or All if you want a transitive closure generated.
Bear with my ignorance please. I can only download the file mentioned in the first post following the SNOMED page links (https://uts.nlm.nih.gov/uts/). So what is the meaning of your suggestion on A or Snapshot? Not sure I know how to get the latter. Thanks a lot for your help.
Oh sorry, I said select 'A' but I see that this code wants you to write out 'ALL' for your command line argument, instead of 'FULL' as you've done above here. Alternatively SNAPSHOT will be quicker if you don't need to know about recent or historical changes.
The zip file should contain 3 sections, one for Delta (which is just the changes since the last release), the Snapshot (which is the current state of the ontology) and the Full files (which contain the complete history of the product dating back to 2002). Unless you're doing something very specific with historical records, or tracking changes to the product over time, you're more likely to want the Snapshot. That's SNOMED CT as it exists today.
"Full" is perhaps a misleading term for what those files are - they're not just everything that is, they're also everything that was.
Very clear. Thanks. I therefore invoked the script as follows:
./load_release.sh ../../../SNOMEDCT_INT_202210/SnomedCT_InternationalRF2_PRODUCTION_20221031T120000Z.zip snomedct20221031 SNAP
and works perfectly.
Also thanks to your indication I now have well understood I really want the snapshot since I'm not interested in the historical dimension.
Thanks a lot !!!!
PS Oh how I wish SNOMED was available in RDF/OWL !!!! :)
Oh but it is! Or can be, with a transformation. See https://github.com/IHTSDO/snomed-owl-toolkit
Enjoy!
One last question if I may. The loading went smoothly and have SNOMED's tables in the database. Using DBVisualizer I tried generating the ER view which shows the tables but no relationships between them. Maybe the load process should generate foreign keys and doesn't?
Yes that would be a nice feature. We use Elasticsearch internally, so such improvements wouldn't have a direct business benefit for us I'm afraid. Hopefully someone will contribute such a thing as a pull request.
Here's a rough ER diagram showing the links between the core component types : https://confluence.ihtsdotools.org/display/DOCSTART/5.+SNOMED+CT+Logical+Model but also various refset members link to referenced concepts. If you have any specific questions I can answer those, or if it's just a more general "what's going on", then we have a ton of free education material available.
The OWL transformation tool worked flawlessly and personally loading it into Protegè to browse it gives me su much more insight to the monumental work than perusing the relational tables which at this point I'll probably drop. I wonder how come RDF is not THE main format you use to disseminate this. So much more expressive than a relational schema IMHO. Thanks a lot for your help. PS I kind of sense the presence of the ghost :) of Barry Smith the ontologist behind BFO ... am I right? :) PPS Stumbling on the concept of "Hang Glider" made my day. SNOMED almost describes all possible reality :) :)
Hi Robert. Yes there's no doubt that OWL is much more expressive (especially in terms of nesting), and a number of compromises were made in order to fit the simplified distribution standard of RF2. However, as I said in my email, RF2 does also give us a number of advantages around tracking changes over time and information that cannot be expressed in OWL and file size efficiency was more of a concern back in the day. That said, we do now use OWL (albeit chopped up into a Reference Set) for our Stated View.
We have had meetings with the BFO crowd and put substantial thought into aligning SNOMED CT with BFO. I can point you in the direction of this discussion if you're interested in reading further. But no, Barry Smith was not involved directly with the design of SNOMED CT as far as I'm aware.
Yes I often post a "Concept of the Week" on a Friday for my colleagues. My pick for Halloween was 111946000 |Burial alive (finding)|
So SNOMED aims to exist under a closed world assumption: what is not there does not exist :) :) (joking of course) Thanks a lot.
No, it's an important point. It's one of the main distinctions with a classification like ICD where everything must exist in exactly one bucket, and all the buckets add up to 100%.
script invoked as:
./load_release.sh ../../../SNOMEDCT_INT_202210/SnomedCT_InternationalRF2_PRODUCTION_20221031T120000Z.zip snomedct20221031 FULL
If I reply Yes to the
Calculate and store inferred transitive closure? [Y/N]:
I see the following happening: