camwebb / oddx-arch

This project is about creating an open, reliable, global database that associates symptoms with diseases
http://openddx.net/

Establish a mapping between ICD10 and ICD9, which is used by DOID #1

Open CloCkWeRX opened 11 years ago

CloCkWeRX commented 11 years ago

In order to get data extracts up and running, we need to investigate and automate the mapping of ICD10 terms to ICD9 terms, to better leverage DOID (http://www.berkeleybop.org/ontologies/owl/DOID).

CloCkWeRX commented 11 years ago

http://www.ama-assn.org/ama/pub/physician-resources/solutions-managing-your-practice/coding-billing-insurance/hipaahealth-insurance-portability-accountability-act/transaction-code-set-standards/icd10-code-set.page

suggests

The differences between ICD-9 and ICD-10 are significant and physicians and practice management staff need to start educating themselves now about this major change so that they will be able to meet the October 1, 2014 compliance deadline. ICD-10-CM codes are the ones designated for use in documenting diagnoses. They are 3-7 characters in length and total 68,000, while ICD-9-CM diagnosis codes are 3-5 digits in length and number over 14,000. The ICD-10-PCS are the procedure codes and they are alphanumeric, 7 characters in length, and total approximately 87,000, while ICD-9-CM procedure codes are only 3-4 numbers in length and total approximately 4,000 codes. Moving to ICD-10 is expected to impact all physicians. Due to the increased number of codes, the change in the number of characters per code, and increased code specificity, this transition will require significant planning, training, software/system upgrades/replacements, as well as other necessary investments. Before the ICD-10 codes can be used however, physicians and others in the health care community had to transition to use of the new version of HIPAA transaction standards known as 5010 by January 1, 2012, as the current version, 4010, does not accommodate use of the ICD-10 codes.

CloCkWeRX commented 11 years ago

So, questions for @camwebb: 1) Can we simply wait for other ontologies like DOID to update themselves, or for other organisations to establish a transition plan? 2) Failing that, can we go for a light-touch mapping of ICD10/ICD9 codes to something like DBpedia entities, using SKOS broader/narrower predicates rather than owl:sameAs assertions?
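To make option 2 a bit more concrete, here is a minimal Python sketch of what a light-touch mapping could emit. Everything in it is an assumption for illustration: the `example.org` base URIs, the `mapping_triples` helper, and the placeholder codes (`OLD.1`, `NEW.2A`, ...) are hypothetical and are not real ICD entries. It uses the SKOS mapping properties `skos:closeMatch` and `skos:narrowMatch` (the cross-scheme counterpart of `skos:narrower`) instead of `owl:sameAs`, so a code that splits into several more specific codes is recorded as broader than each of them rather than identical to any one.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical base URIs; a real tool would use whatever URIs the target data set defines.
ICD9_BASE = "http://example.org/icd9/"
ICD10_BASE = "http://example.org/icd10/"


def mapping_triples(icd9_to_icd10):
    """Yield N-Triples lines linking each ICD9 code to its ICD10 target(s)."""
    for old, targets in sorted(icd9_to_icd10.items()):
        # A single target is treated as a close match; a split (one-to-many)
        # means the old code is broader than each of the new codes.
        predicate = "closeMatch" if len(targets) == 1 else "narrowMatch"
        for new in targets:
            yield f"<{ICD9_BASE}{old}> <{SKOS}{predicate}> <{ICD10_BASE}{new}> ."


if __name__ == "__main__":
    # Placeholder codes only, not taken from any published mapping.
    sample = {"OLD.1": ["NEW.1"], "OLD.2": ["NEW.2A", "NEW.2B"]}
    for line in mapping_triples(sample):
        print(line)
```

The weaker predicates keep the mapping cheap to revise later: nothing downstream is licensed to merge the two codes the way an `owl:sameAs` assertion would allow.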

CloCkWeRX commented 11 years ago

Actually, option 2 might be viable there. Freebase's data model has ICD9 and ICD10 fields; see http://www.freebase.com/schema/medicine/symptom?domain=%2Fmedicine for example.

I could readily build a data tool to populate that content.

camwebb commented 11 years ago

Having the DOID ontology is a great asset, and I feel we should first look into the creators' roadmap for mapping to ICD10 before striking out in a different direction. However, in the end we do want the most comprehensive set of terms. SNOMED CT is also a likely option.

CloCkWeRX commented 11 years ago

Got links to either roadmap/people we should touch base with?

I found http://www.aapc.com/icd-10/conversion-mapping.aspx to be a fairly useful high level explanation of the mapping process, and it seems quite labour intensive to do it properly (especially with the splitting).

CloCkWeRX commented 11 years ago

http://thedatahub.org/dataset/bioportal-snomedct points us to a SPARQL endpoint; http://sparql.bioontology.org/examples points to sample queries.

Having had a bit of a blunder around, you can see a few prefLabels/altLabels being returned. http://sparql.bioontology.org/?query=PREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0APREFIX+snomed-term%3A+%3Chttp%3A%2F%2Fpurl.bioontology.org%2Fontology%2FSNOMEDCT%2F%3E%0D%0APREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0D%0ASELECT+%3Fx+%3Flabel+%3Falt_label+%0D%0AFROM+%3Chttp%3A%2F%2Fbioportal.bioontology.org%2Fontologies%2FSNOMEDCT%3E+%0D%0AWHERE+%0D%0A%7B%0D%0A++++%3Fx+rdfs%3AsubClassOf+snomed-term%3A363664003+.%0D%0A++++%3Fx+skos%3AprefLabel++%3Flabel.%0D%0A++++%3Fx+skos%3AaltLabel++%3Falt_label.%0D%0A++++%0D%0A%7D&csrfmiddlewaretoken=d8973878a531afe1ca70bb35b17a82d3
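The query in that link is hard to read while URL-encoded. The short Python sketch below recovers the SPARQL text using only the standard library; the helper name `extract_sparql` is made up for this example, and the URL is copied from the comment above.

```python
from urllib.parse import parse_qs, urlsplit

# The link from the comment above, split at its encoded line breaks for readability.
URL = (
    "http://sparql.bioontology.org/?query="
    "PREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0A"
    "PREFIX+snomed-term%3A+%3Chttp%3A%2F%2Fpurl.bioontology.org%2Fontology%2FSNOMEDCT%2F%3E%0D%0A"
    "PREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0D%0A"
    "SELECT+%3Fx+%3Flabel+%3Falt_label+%0D%0A"
    "FROM+%3Chttp%3A%2F%2Fbioportal.bioontology.org%2Fontologies%2FSNOMEDCT%3E+%0D%0A"
    "WHERE+%0D%0A"
    "%7B%0D%0A"
    "++++%3Fx+rdfs%3AsubClassOf+snomed-term%3A363664003+.%0D%0A"
    "++++%3Fx+skos%3AprefLabel++%3Flabel.%0D%0A"
    "++++%3Fx+skos%3AaltLabel++%3Falt_label.%0D%0A"
    "++++%0D%0A"
    "%7D&csrfmiddlewaretoken=d8973878a531afe1ca70bb35b17a82d3"
)


def extract_sparql(url: str) -> str:
    """Return the decoded `query` parameter with trailing spaces and blank lines removed."""
    params = parse_qs(urlsplit(url).query)  # decodes '+' and %XX escapes
    raw = params["query"][0]
    lines = [line.rstrip() for line in raw.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(extract_sparql(URL))
```

Decoded, the query selects every direct subclass of `snomed-term:363664003` from the SNOMEDCT graph together with its `skos:prefLabel` and `skos:altLabel`.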

I haven't looked closely, but I know the data set contains skos:closeMatch or similar relations to other terms; I just haven't teased out the exact queries yet.
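As a possible starting point for those queries, the Python sketch below builds a SPARQL string that lists terms alongside whatever they are linked to through a given mapping predicate. It is an untested assumption rather than a known-good query: whether `skos:closeMatch` actually appears in the graph is exactly what still needs checking, and the helper name `build_match_query` and the `LIMIT` default are invented for this example. The graph URI is the one from the query linked above.

```python
# Graph URI taken from the FROM clause of the query linked earlier in the thread.
GRAPH = "http://bioportal.bioontology.org/ontologies/SNOMEDCT"


def build_match_query(predicate: str = "skos:closeMatch", limit: int = 25) -> str:
    """Build a SPARQL query listing subjects and the terms they map to via `predicate`."""
    return "\n".join(
        [
            "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>",
            "SELECT ?x ?match",
            f"FROM <{GRAPH}>",
            "WHERE {",
            f"    ?x {predicate} ?match .",
            "}",
            f"LIMIT {limit}",
        ]
    )


if __name__ == "__main__":
    # Swap in skos:exactMatch, skos:broadMatch, etc. to probe for the "similar relations".
    print(build_match_query())
```

Keeping the predicate as a parameter makes it cheap to try each SKOS mapping property in turn and see which ones return rows.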