OpenConceptLab / ocl_issues

Issues for all OCL repos. NOTE: Install ZenHub Browser Extension and request access to the OCL Roadmap board to view all issues and to contribute

Snomed and LOINC - Systematic content validation #843

Closed jamlung-ri closed 3 years ago

jamlung-ri commented 3 years ago

Joe to work with Jon to write up the basic requirements for scripts to validate the LOINC and SNOMED content that has been test-loaded into OCL Staging.

jamlung-ri commented 3 years ago

LOINC was validated automatically, but the SNOMED export is still processing. We need to postpone this until that export is available.

jamlung-ri commented 3 years ago

I have been using Burke's validator script to automatically validate SNOMED and determine whether all expected concepts and mappings are present. I have also been browsing the SNOMED hierarchy for unexpected results.

Here are my findings so far:

@paynejd This seems pretty significant. I will put this on the Arch agenda for this week so we can come up with a game plan.

jamlung-ri commented 3 years ago

Sunny, Jon, and I agreed that this many issues warrant the full deletion of SNOMED from OCL Staging. It will then be reimported and checked again for correctness.

jamlung-ri commented 3 years ago

LOINC Validation output:

{ "missing_concepts":[], "extra_concepts": [], "Concepts compared": 152184, "missing_mappings": [ {"type": "Mapping", "map_type": "Associated Observations", "to_concept_url": "/orgs/Regenstrief-Institute/sources/LOINC-3/concepts/81323-8/", "from_concept_url": "/orgs/Regenstrief-Institute/sources/LOINC-3/concepts/81323-8/", "source": "LOINC-3", "owner_type": "Organization", "owner": "Regenstrief-Institute"} ], "Mappings compared": 261092, "extra_mappings": [] }

Note that the missing mapping has since been added.
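For anyone reading the report shape above, here is a minimal sketch of how a report like this could be produced. It is not Burke's validator script. The function names (`compare`, `mapping_key`, `build_report`) are hypothetical, and it assumes that concepts are matched by `id`, that mappings are matched by map type plus from/to URLs, and that the "compared" counts refer to the expected records.

```python
def mapping_key(mapping):
    """Assumed identity of a mapping: map type plus both endpoint URLs."""
    return (
        mapping["map_type"],
        mapping["from_concept_url"],
        mapping["to_concept_url"],
    )


def compare(expected, actual, key):
    """Return (missing, extra) records, matching records by key(record)."""
    expected_by_key = {key(record): record for record in expected}
    actual_by_key = {key(record): record for record in actual}
    missing = [r for k, r in expected_by_key.items() if k not in actual_by_key]
    extra = [r for k, r in actual_by_key.items() if k not in expected_by_key]
    return missing, extra


def build_report(expected_concepts, actual_concepts,
                 expected_mappings, actual_mappings):
    """Assemble a report with the same top-level keys as the output above."""
    missing_c, extra_c = compare(
        expected_concepts, actual_concepts, lambda c: c["id"]
    )
    missing_m, extra_m = compare(
        expected_mappings, actual_mappings, mapping_key
    )
    return {
        "missing_concepts": missing_c,
        "extra_concepts": extra_c,
        # Assumption: the count reflects the expected (source) records.
        "Concepts compared": len(expected_concepts),
        "missing_mappings": missing_m,
        "Mappings compared": len(expected_mappings),
        "extra_mappings": extra_m,
    }
```

Here `expected_*` would be the records from the source export and `actual_*` the records read back from OCL Staging. An empty `missing_*` and `extra_*` list on both sides would correspond to a clean validation.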

jamlung-ri commented 3 years ago

SNOMED Validation output:

{ "missing_concepts":[ {"type": "Concept", "id": "255135009", "concept_class": "none", "datatype": "string", "extras": {"NHS Clinical Terms Version 3": "X78lh", "UMLS CUI": "C0347084", "UMLS Semantic Type": "Neoplastic Process", "Concept Status": "Inactive", "MODULEID": "900000000000207008", "Code in Source": "255135009"}, "retired": true, "source": "SNOMED", "names": [{"locale": "en", "locale_preferred": true, "name": "Carcinoma in situ of labial mucosa (disorder)", "name_type": "Fully Specified"}, {"locale": "en", "locale_preferred": false, "name": "Carcinoma in situ of oral aspect of lip", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Carcinoma in situ of inner aspect of lip", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Carcinoma in situ of mucosa of lip", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Carcinoma in situ of buccal aspect of lip", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Carcinoma in situ of labial mucosa", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Carcinoma in situ of labial mucosa (disorder)", "name_type": "Synonym"}]}, {"type": "Concept", "id": "419602005", "concept_class": "none", "datatype": "string", "extras": {"NHS Clinical Terms Version 3": "XUdJK", "UMLS Semantic Type": "Neoplastic Process", "UMLS CUI": "C1641587", "Concept Status": "Inactive", "MODULEID": "900000000000207008", "Code in Source": "419602005"}, "retired": true, "source": "SNOMED", "names": [{"locale": "en", "locale_preferred": true, "name": "Carcinoma in situ of lacrimal drainage structure (disorder)", "name_type": "Fully Specified"}, {"locale": "en", "locale_preferred": false, "name": "Carcinoma in situ of lacrimal drainage structure (disorder)", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Carcinoma in situ of lacrimal duct", "name_type": "Synonym"}, {"locale": "en", 
"locale_preferred": false, "name": "Carcinoma in situ of lacrimal drainage structure", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Ca in situ of lacrimal drainage structure", "name_type": "Synonym"}]}, {"type": "Concept", "id": "115119004", "concept_class": "none", "datatype": "string", "extras": {"NHS Clinical Terms Version 3": "XU3iX", "UMLS Semantic Type": "Archaeon", "UMLS CUI": "C0524917", "Concept Status": "Active", "MODULEID": "900000000000207008", "Code in Source": "115119004"}, "retired": false, "source": "SNOMED", "names": [{"locale": "en", "locale_preferred": true, "name": "Family Thermococcaceae (organism)", "name_type": "Fully Specified"}, {"locale": "en", "locale_preferred": false, "name": "Family thermococcaceae", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Family thermococcaceae (organism)", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Thermococcaceae", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Family Thermococcaceae", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Family Thermococcaceae (organism)", "name_type": "Synonym"}], "parent_concept_urls": ["/orgs/International-Health-Terminology-Standard-Development-Organisation-IHTSDO-/sources/SNOMED/concepts/115212001/"]}, {"type": "Concept", "id": "365951004", "concept_class": "none", "datatype": "string", "extras": {"NHS Clinical Terms Version 3": "XUTVe", "UMLS CUI": "C1287494", "UMLS Semantic Type": "Finding", "Concept Status": "Active", "MODULEID": "900000000000207008", "Code in Source": "365951004"}, "retired": false, "source": "SNOMED", "names": [{"locale": "en", "locale_preferred": true, "name": "Finding of number of current sexual partners (finding)", "name_type": "Fully Specified"}, {"locale": "en", "locale_preferred": false, "name": "Number of current sexual partners - finding", "name_type": "Synonym"}, {"locale": "en", 
"locale_preferred": false, "name": "Finding of number of current sexual partners (finding)", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Finding of number of current sexual partners", "name_type": "Synonym"}], "parent_concept_urls": ["/orgs/International-Health-Terminology-Standard-Development-Organisation-IHTSDO-/sources/SNOMED/concepts/365950003/"]}, {"type": "Concept", "id": "365055005", "concept_class": "none", "datatype": "string", "extras": {"NHS Clinical Terms Version 3": "XUTHD", "UMLS Semantic Type": "Finding", "UMLS CUI": "C1286726", "Concept Status": "Active", "MODULEID": "900000000000207008", "Code in Source": "365055005"}, "retired": false, "source": "SNOMED", "names": [{"locale": "en", "locale_preferred": true, "name": "Finding related to ability to infer meaning (finding)", "name_type": "Fully Specified"}, {"locale": "en", "locale_preferred": false, "name": "Finding related to ability to infer meaning (finding)", "name_type": "Synonym"}, {"locale": "en", "locale_preferred": false, "name": "Finding related to ability to infer meaning", "name_type": "Synonym"}], "parent_concept_urls": ["/orgs/International-Health-Terminology-Standard-Development-Organisation-IHTSDO-/sources/SNOMED/concepts/365048008/"]} ], "extra_concepts": [], "Concepts compared": 466612, "missing_mappings": [], "Mappings compared": 126804, "extra_mappings": [] }

Note that the validator also produces extra output listing the duplicate concepts. It is omitted here but can be provided on request.
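As a rough illustration of what a duplicate check could look like, the sketch below counts how often each concept `id` appears in a list of concept records. This is an assumption about what "duplicate" means here, not the validator's actual logic, and `find_duplicate_ids` is a hypothetical name.

```python
from collections import Counter


def find_duplicate_ids(concepts):
    """Return {concept_id: occurrences} for ids that appear more than once.

    Assumption: two records are duplicates when they share the same "id".
    """
    counts = Counter(concept["id"] for concept in concepts)
    return {cid: n for cid, n in counts.items() if n > 1}
```

Running this over the concepts read back from OCL Staging would give a quick list of ids to inspect by hand after the reimport.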

jamlung-ri commented 3 years ago

LOINC has been validated using both the validator script and the hierarchy scan. No issues were found. SNOMED will still be reimported.