djbeaumont closed this issue 2 years ago
Looks as if those files are not meeting the file naming standard.
If you rename them appropriately, Hermes will find them.
To list the files that can be processed in a given directory, you can use:
clj -M:run list xxxx
See naming conventions https://confluence.ihtsdotools.org/plugins/servlet/mobile?contentId=56330817#content/view/56330817
Rename to match and Hermes will find them.
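As a rough illustration of what "matching the naming convention" means, here is a minimal Python sketch. It assumes the standard RF2 pattern of five underscore-separated elements (file type, content type, content subtype, country or namespace, version date). The helper name split_rf2_filename and the first example file name are illustrative only, and this is not how Hermes itself parses names.

```python
from pathlib import PurePath

# The five elements of a standard RF2 file name, in order.
RF2_ELEMENTS = ("file_type", "content_type", "content_subtype",
                "country_namespace", "version_date")


def split_rf2_filename(filename):
    """Split a release file name on underscores.

    Returns a dict of the five standard elements, or None when the name
    does not have exactly five elements (hypothetical helper, not Hermes code).
    """
    stem = PurePath(filename).stem  # drop any directory and the ".txt" suffix
    parts = stem.split("_")
    if len(parts) != len(RF2_ELEMENTS):
        return None
    return dict(zip(RF2_ELEMENTS, parts))


# An illustrative name that follows the five-element pattern.
standard = split_rf2_filename("der2_cRefset_AssociationSnapshot_INT_20210731.txt")
# A name from the Spanish drug extension: "es-ES_ES" adds a sixth element.
spanish = split_rf2_filename("der2_scRefset_VMPSpainDrugSnapshot_es-ES_ES_20211001.txt")

print(standard)
print(spanish)
```

Under this strict reading, the Spanish name is rejected because of the extra underscore-separated element. A real parser may well be more lenient than this.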
It's likely those are just simple refsets, so you don't actually need them for your autocompletion. You'd use ^ to constrain to a refset, not <<, of course.
I had a quick look at the files. SNOMED supports arbitrary refsets using a dynamic system called refset descriptors, which encode the type of each column within a refset. It's designed to permit this kind of extra customisation at a distribution level. Currently hermes does not support arbitrary refset items, although it could conceivably support them with a fixed internal registry for known refset types and a dynamic registry based on refset descriptor items, serialising arbitrary data found in those files (i.e. acting as a dumb key-value store for those items).
Another option would be to write something that converts those files into something like a simple refset, so you can just go ^
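A minimal sketch of that conversion in Python, assuming the files are tab-separated and that the first six columns are the standard refset columns (id, effectiveTime, active, moduleId, refsetId, referencedComponentId). The function name to_simple_refset is hypothetical, and the sample row is shaped like the VMP refset rows quoted elsewhere in this thread.

```python
import io

# Every RF2 refset file starts with these six columns; a simple refset has no others.
STANDARD_COLUMNS = ["id", "effectiveTime", "active", "moduleId",
                    "refsetId", "referencedComponentId"]


def to_simple_refset(src, dst):
    """Copy tab-separated lines from src to dst, dropping any extra columns."""
    for line in src:
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # keep whatever line ending the file used
        fields = body.split("\t")
        dst.write("\t".join(fields[:len(STANDARD_COLUMNS)]) + ending)


# Tiny in-memory example: a header plus one row with two extra columns.
extended = io.StringIO(
    "\t".join(STANDARD_COLUMNS + ["term", "linkedToId"]) + "\r\n"
    + "\t".join(["69a4f503-bdfb-4d2e-c323-59e3e8041020", "20170901", "1",
                 "90000011000140108", "90000061000140106", "324881002",
                 "Abacavir 20 mg/ml solución/suspensión oral", "116084008"]) + "\r\n"
)
simple = io.StringIO()
to_simple_refset(extended, simple)
print(simple.getvalue())
```

The output would presumably also need a file name that matches the simple refset pattern before an importer would treat it as one. That renaming step is not shown here.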
Closed in favour of #30
Looks like the same issue as #31.
Hi @djbeaumont, I think this is now sort of fixed in v0.8.3.
There are some issues with the Spanish data. They don't use a decent naming system. As you'll see, they should use a prefix for refsets so we know how to encode the extra columns. To me, it looks as if they are using extra columns when they should be using a proper language reference set. Anyway, it should pick up more refsets now, so you will at least be able to check refset membership. You might need to change the file names and use the 'list' command to see what it makes of the files.
It also now fails fast when it finds broken data, such as the incorrect dates or concept identifiers we spotted previously!
Let me know how you get on.
PS. It does support storing extra columns of data in a refset, but I haven't yet exposed those data via the public API. Conceivably I could write out the data with the correct property names, as the refset descriptor for that refset should have that information for each column.
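To make the refset prefix idea concrete, here is a small Python sketch. It assumes the RF2 convention in which the letters before "Refset" in the content type declare the extra columns: c for a component identifier, i for an integer, s for a string. The function name extra_column_types is hypothetical and this is not Hermes code.

```python
# RF2 attribute-type letters used in refset content-type prefixes.
PATTERN_TYPES = {"c": "component", "i": "integer", "s": "string"}


def extra_column_types(content_type):
    """Return the declared types of the extra columns for e.g. "scRefset"."""
    if not content_type.endswith("Refset"):
        raise ValueError(f"not a refset content type: {content_type!r}")
    pattern = content_type[:-len("Refset")]
    unknown = [letter for letter in pattern if letter not in PATTERN_TYPES]
    if unknown:
        raise ValueError(f"unknown pattern letters {unknown} in {content_type!r}")
    return [PATTERN_TYPES[letter] for letter in pattern]


# Content types of the kind seen in the Spanish drug extension file names.
for content_type in ("Refset", "sRefset", "scRefset", "scicRefset"):
    print(content_type, extra_column_types(content_type))
```

With a prefix like this, an importer can tell how many extra columns to expect and how to parse each one without needing the refset descriptor first.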
Thanks Mark, I'll give this a shot 👍
Hermes@v0.8.3 does now find the extra files:
ᐅ clj -M:run list ../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/
====================================================================================================================================
| Distribution files in ../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/:14 |
====================================================================================================================================
| :filename | :component | :version-date | :format | :content-subtype | :content-type |
|-------------------------------------------------------------------------------------+----------------------+---------------+---------+-------------------------------------------------------+---------------|
| der2_scicRefset_VMPPSpainDrugSnapshot_es-ES_ES_20211001.txt | ExtendedRefset | 2021-10-01 | 2 | VMPPSpainDrugSnapshot | scicRefset |
| der2_sRefset_VTMSpainDrugSnapshot_es-ES_ES_20211001.txt | ExtendedRefset | 2021-10-01 | 2 | VTMSpainDrugSnapshot | sRefset |
| der2_cRefset_AssociationSpainDrugExtensionSnapshot_es-ES_ES_20211001.txt | AssociationRefset | 2021-10-01 | 2 | AssociationSpainDrugExtensionSnapshot | cRefset |
| der2_cRefset_FMSpainDrugSnapshot_es-ES_ES_20211001.txt | ExtendedRefset | 2021-10-01 | 2 | FMSpainDrugSnapshot | cRefset |
| der2_cRefset_FMFormatoSpainDrugSnapshot_es-ES_ES_20211001.txt | ExtendedRefset | 2021-10-01 | 2 | FMFormatoSpainDrugSnapshot | cRefset |
| der2_scRefset_VMPSpainDrugSnapshot_es-ES_ES_20211001.txt | ExtendedRefset | 2021-10-01 | 2 | VMPSpainDrugSnapshot | scRefset |
| der2_cRefset_VMPPCNSpainDrugMapSnapshot_es-ES_es_20211001.txt | ExtendedRefset | 2021-10-01 | 2 | VMPPCNSpainDrugMapSnapshot | cRefset |
| der2_cRefset_AttributeValueSpainDrugExtensionSnapshot_es-ES_ES_20211001.txt | AttributeValueRefset | 2021-10-01 | 2 | AttributeValueSpainDrugExtensionSnapshot | cRefset |
| der2_cRefset_LanguageSpainDrugExtensionSnapshot_es-ES_ES_20211001.txt | LanguageRefset | 2021-10-01 | 2 | LanguageSpainDrugExtensionSnapshot | cRefset |
| der2_cRefset_LanguageSpainDrugExtensionSnapshot_en_20211001.txt | LanguageRefset | 2021-10-01 | 2 | LanguageSpainDrugExtensionSnapshot | cRefset |
| der2_ciRefset_RefsetDescriptionTypeSpainDrugExtensionSnapshot-es-ES_ES_20211001.txt | ExtendedRefset | 2021-10-01 | 2 | RefsetDescriptionTypeSpainDrugExtensionSnapshot-es-ES | ciRefset |
| sct2_Description_SpainDrugExtensionSnapshot_es-ES_ES_20211001.txt | Description | 2021-10-01 | 2 | SpainDrugExtensionSnapshot | Description |
| sct2_Concept_SpainDrugExtensionSnapshot_es-ES_ES_20211001.txt | Concept | 2021-10-01 | 2 | SpainDrugExtensionSnapshot | Concept |
| sct2_Relationship_SpainDrugExtensionSnapshot_es-ES_ES_20211001.txt | Relationship | 2021-10-01 | 2 | SpainDrugExtensionSnapshot | Relationship |
And I can import with:
ᐅ clj -M:run --db snomed.db import ../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release
2021-11-05 13:22:41,562 [main] INFO com.eldrix.hermes.importer - importing files from "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release"
2021-11-05 13:22:41,578 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_scicRefset_VMPPSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 13:22:41,638 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_sRefset_VTMSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 13:22:41,646 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_AssociationSpainDrugExtensionSnapshot_es-ES_ES_20211001.txt" type: "AssociationRefset"
2021-11-05 13:22:41,647 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_FMSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 13:22:41,648 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_FMFormatoSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 13:22:41,649 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_scRefset_VMPSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 13:22:41,685 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_VMPPCNSpainDrugMapSnapshot_es-ES_es_20211001.txt" type: "ExtendedRefset"
2021-11-05 13:22:41,850 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_AttributeValueSpainDrugExtensionSnapshot_es-ES_ES_20211001.txt" type: "AttributeValueRefset"
2021-11-05 13:22:41,851 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Language/der2_cRefset_LanguageSpainDrugExtensionSnapshot_es-ES_ES_20211001.txt" type: "LanguageRefset"
2021-11-05 13:22:41,937 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Language/der2_cRefset_LanguageSpainDrugExtensionSnapshot_en_20211001.txt" type: "LanguageRefset"
2021-11-05 13:22:41,949 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Metadata/der2_ciRefset_RefsetDescriptionTypeSpainDrugExtensionSnapshot-es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 13:22:41,952 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Terminology/sct2_Description_SpainDrugExtensionSnapshot_es-ES_ES_20211001.txt" type: "Description"
2021-11-05 13:22:42,203 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Terminology/sct2_Concept_SpainDrugExtensionSnapshot_es-ES_ES_20211001.txt" type: "Concept"
2021-11-05 13:22:42,251 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Terminology/sct2_Relationship_SpainDrugExtensionSnapshot_es-ES_ES_20211001.txt" type: "Relationship"
(exits without error)
I'd now expect to be able to list all VMPs based on the refset:
ᐅ head ../../Downloads/SNOMED_CT_SPANISH/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_scRefset_VMPSpainDrugSnapshot_es-ES_ES_20211001.txt
id effectiveTime active moduleId refsetId referencedComponentId term linkedToId
69a4f503-bdfb-4d2e-c323-59e3e8041020 20170901 1 90000011000140108 90000061000140106 324881002 Abacavir 20 mg/ml solución/suspensión oral 116084008
1323d796-cdac-4ba3-fa9e-802f28ac5607 20170901 1 90000011000140108 90000061000140106 324880001 Abacavir 300 mg comprimido 116084008
ac0411af-a5f9-44f6-9d37-1254f02a1776 20170901 1 90000011000140108 90000061000140106 413382007 Abacavir/Lamivudina 600 mg/300 mg comprimido 413381000
28be96cc-245d-4163-cf70-c4bcffa0e799 20170901 1 90000011000140108 90000061000140106 377159003 Abacavir/Lamivudina/Zidovudina 300 mg/150 mg/300 mg comprimido 134571004
c1cf2946-cba9-48c9-977b-91c3b18cf274 20170901 1 90000011000140108 90000061000140106 130241000140101 Abatacept 125 mg inyectable 1 ml jeringa precargada 421412005
518ae81e-99b1-4356-eedf-588aa7fd25bd 20170901 1 90000011000140108 90000061000140106 160121000140108 Abatacept 125 mg inyectable 1 ml pluma precargada 421412005
829852d0-e2bc-42c6-fe42-b5c274bc7dc1 20170901 1 90000011000140108 90000061000140106 421333000 Abatacept 250 mg inyectable perfusión 421412005
3a77e008-83f8-4458-cbec-48e8133ab8ab 20170901 1 90000011000140108 90000061000140106 319794009 Abciximab 2 mg/ml inyectable perfusión 5 ml 108974006
2d9b5291-c1e9-4a07-e88c-8326411b051c 20190401 1 90000011000140108 90000061000140106 246841000140102 Abemaciclib 100 mg comprimido 246981000140101
...
So assuming the refset ID is 90000061000140106 for Spanish VMPs, I thought a query to hades like the following would work:
ᐅ curl "http://localhost:8080/fhir/ValueSet/\$expand?url=http://snomed.info/sct?fhir_vs=refset/90000061000140106"
{"resourceType":"ValueSet","expansion":{"total":0}}%
So I'm a bit stumped. Is there a way to list refsets that have been loaded?
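One thing that can be ruled out on the request side is query-string encoding, since the implicit value set URI itself contains "?" and "=". The Python sketch below only builds the request URL with the url parameter percent-encoded. The function name fhir_expand_url is hypothetical, the base URL is the one from the curl command above, and nothing here establishes that encoding is the cause of the empty expansion.

```python
from urllib.parse import urlencode


def fhir_expand_url(base, refset_id):
    """Build a ValueSet/$expand URL for a SNOMED refset implicit value set."""
    implicit_value_set = f"http://snomed.info/sct?fhir_vs=refset/{refset_id}"
    # urlencode percent-encodes ":", "/", "?" and "=" inside the parameter value.
    return f"{base}/fhir/ValueSet/$expand?" + urlencode({"url": implicit_value_set})


url = fhir_expand_url("http://localhost:8080", "90000061000140106")
print(url)
```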
Possibly a separate issue: all the above was done on my Mac. On a Linux VM in a GitHub Action I get the following errors on import:
Step 20/30 : RUN clj -M:run --db snomed.db import ../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/
---> Running in a75435da8958
2021-11-05 14:00:27,579 [main] INFO com.eldrix.hermes.importer - importing files from "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/"
2021-11-05 14:00:27,605 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_AttributeValueSpainDrugExtensionSnapshot_es-ES_ES_20211001.txt" type: "AttributeValueRefset"
2021-11-05 14:00:27,610 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_sRefset_VTMSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 14:00:27,665 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_AssociationSpainDrugExtensionSnapshot_es-ES_ES_20211001.txt" type: "AssociationRefset"
2021-11-05 14:00:27,667 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_VMPPCNSpainDrugMapSnapshot_es-ES_es_20211001.txt" type: "ExtendedRefset"
2021-11-05 14:00:28,039 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_scRefset_VMPSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 14:00:28,134 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_FMFormatoSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 14:00:28,138 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_cRefset_FMSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 14:00:28,142 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Content/der2_scicRefset_VMPPSpainDrugSnapshot_es-ES_ES_20211001.txt" type: "ExtendedRefset"
2021-11-05 14:00:28,403 [async-thread-macro-5] INFO com.eldrix.hermes.importer - Processing: "../content/snomed-es-drugs/SnomedCT_SpainDrugExtension-ES_PRODUCTION_20211001T120000/RF2Release/Snapshot/Refset/Language/der2_cRefset_LanguageSpainDrugExtensionSnapshot_es-ES_ES_20211001.txt" type: "LanguageRefset"
2021-11-05 14:00:29,652 [pool-1-thread-2] ERROR com.eldrix.hermes.impl.store - import error: failed to import data: {:type :info.snomed/ExtendedRefset, :parser #object[clojure.core$partial$fn__5857 0x1f307421 "clojure.core$partial$fn__5857@1f307421"], :headings ["id" "effectiveTime" "active" "moduleId" "refsetId" "referencedComponentId" "term"], :data [#com.eldrix.hermes.snomed.ExtendedRefsetItem{:id #uuid "c6bc05db-e55f-4134-93cf-93a6fb7b7a67", :effectiveTime #object[java.time.LocalDate 0x2219bd26 "2019-07-01"], :active true, :moduleId 90000011000194102, :refsetId 90000031000194107, :referencedComponentId 2021000194105, :fields ()}]}
2021-11-05 14:00:30,991 [pool-1-thread-1] ERROR com.eldrix.hermes.impl.store - import error: failed to import data: {:type :info.snomed/ExtendedRefset, :parser #object[clojure.core$partial$fn__5857 0x43f2933f "clojure.core$partial$fn__5857@43f2933f"], :headings ["id" "effectiveTime" "active" "moduleId" "refsetId" "referencedComponentId" "term"], :data [#com.eldrix.hermes.snomed.ExtendedRefsetItem{:id #uuid "82449a2a-c955-4cbf-b0b3-5764bc2249ef", :effectiveTime #object[java.time.LocalDate 0x246fccbb "2019-07-01"], :active true, :moduleId 90000011000194102, :refsetId 90000021000194105, :referencedComponentId 1941000194109, :fields ()}]}
Execution error (ExceptionInfo) at com.eldrix.hermes.core/do-import-snomed (core.clj:346).
Error during import: Import error
So it looks like parsing the extra columns into the fields map has failed, but otherwise it's not clear to me what's going on here or why it differs between platforms.
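One way to narrow this down independently of the importer is to look for rows whose number of tab-separated fields differs from the header. The Python sketch below is a diagnostic only, with a hypothetical function name and invented sample rows. It assumes a row with a missing column is a plausible trigger, which the logs above do not confirm.

```python
def rows_with_wrong_width(lines):
    """Yield (line_number, field_count) for rows whose width differs from the header."""
    header = None
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if header is None:
            header = fields  # first line is the column header
            continue
        if len(fields) != len(header):
            yield number, len(fields)


# Invented rows for illustration: the header declares one extra "term" column.
sample = [
    "id\teffectiveTime\tactive\tmoduleId\trefsetId\treferencedComponentId\tterm\r\n",
    "row-1\t20190701\t1\t1\t2\t3\tsome term\r\n",  # seven fields, matches the header
    "row-2\t20190701\t1\t1\t2\t3\r\n",             # six fields, no term column
]
mismatches = list(rows_with_wrong_width(sample))
print(mismatches)
```

Running the same check over the real file on both machines would show whether the input differs between the platforms or only the import behaviour does.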
Hi @djbeaumont - you can get information about installed reference sets using the 'status' command.
e.g.
clj -M:run status --db my-snomed.db
I'd stick to using hermes only until we know it is importing. Use ECL to expand a value set, so
http '127.0.0.1:8080/v1/snomed/expand?ecl=^90000061000140106'
should work to give you the components, or preferably, use search?s=xxx&ecl=^90000061000140106 to simply search within components of that refset. That would be faster.
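For scripted use, the two requests above can be built with the ECL expression percent-encoded. This Python sketch only constructs the URLs. The function names are hypothetical, and placing search under the same /v1/snomed path as expand is an assumption based on the relative form given above.

```python
from urllib.parse import urlencode

# Base path taken from the expand example; assumed to apply to search as well.
BASE = "http://127.0.0.1:8080/v1/snomed"


def hermes_expand_url(refset_id):
    """URL that expands all members of a refset using the ECL member-of operator."""
    return f"{BASE}/expand?" + urlencode({"ecl": f"^{refset_id}"})


def hermes_search_url(text, refset_id):
    """URL that searches for text, constrained to members of a refset."""
    return f"{BASE}/search?" + urlencode({"s": text, "ecl": f"^{refset_id}"})


print(hermes_expand_url("90000061000140106"))
print(hermes_search_url("abacavir", "90000061000140106"))
```

The search form suits a typeahead picker better, since each keystroke returns only the matching members rather than the whole refset.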
At the moment, it won't return the extra fields, but that would be easy to fix. I don't understand the issue on Linux; I doubt it is platform-specific. There should be more detailed error information written to a temporary file at the end of the import, which will tell us more.
I have rewritten the handling of extensions for reference set items in 9b7e672eee50275e05c58bea9316d0b3c9b03b48 as I was not satisfied with the previous approach. As such, all extended fields should now be imported properly. Next, I will make the additional properties available using the metadata in the reference set descriptor. In the meantime, the extension data will be returned in the API, and the property names can be looked up manually in the corresponding reference set descriptor. Re-opening this issue.
Fixed in 7309357e8e6c1025db6ea0454f5da9c9e2758a51, including being made available in the HTTP API. Attributes, including arbitrary extended attributes, are included in the results.
Hi Mark, sorry, me again.
My next issue with using hermes to host a Spanish SNOMED terminology service is that it doesn't seem to be importing all refsets from the release files. Below are the logs from my build script and as you can see it's importing four snapshot refsets:
In the Refset/Content directory there are more files that I think are important to my use-case, including:
I couldn't spot anything in the documentation for importing refsets as it looks like it tries to do everything automagically. My aim is to produce typeahead drug pickers, hence the need for those refsets. In the UK it's not really necessary since all the medication products have relationships meaning you can do ECL queries of the form < VMP. The Spanish version seems a lot less fleshed out, so I'm hoping the refsets will provide what I need.
All build output for this step: