EHDEN / ETL-UK-Biobank

ETL UK-Biobank
https://ehden.github.io/ETL-UK-Biobank/

Improve ICD10 mapping via higher ICD10 level #331

Closed MaximMoinat closed 2 years ago

MaximMoinat commented 2 years ago

Some source ICD10 codes cannot be found in the OMOP vocabulary, possibly because they are UK-specific modifications. We should map these via the higher-level ICD10 code.

For example: Y05.3 should map via ICD10:Y05 (Sexual assault by bodily force) to SNOMED: 248110007 (Sexual assault)

We can reuse the scripts used for the Swedish ICD10 dialect.
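
The fallback described above can be sketched roughly as follows. This is only an illustration, not the project's actual script: the function names, the dotless key format, and the in-memory lookup (a stand-in for the OMOP vocabulary) are assumptions. The only concept ID used is the one cited later in this thread for Y05.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the OMOP vocabulary: dotless ICD10 code -> standard concept ID.
ICD10_TO_STANDARD: Dict[str, int] = {"Y05": 4093699}


def normalize(code: str) -> str:
    """Trim, uppercase, and drop the dot, so 'Y05.3' becomes 'Y053'."""
    return code.strip().upper().replace(".", "")


def map_icd10(code: str, lookup: Dict[str, int]) -> Optional[Tuple[str, int]]:
    """Try the full code first, then ever shorter prefixes down to three characters.

    Returns (matched_code, concept_id), or None if no level matches.
    """
    candidate = normalize(code)
    while len(candidate) >= 3:
        if candidate in lookup:
            return candidate, lookup[candidate]
        candidate = candidate[:-1]  # step up one ICD10 level
    return None


print(map_icd10("Y05.3", ICD10_TO_STANDARD))
```

Returning the matched code alongside the concept ID makes it possible to tell afterwards whether a record was mapped directly or via its parent.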

MaximMoinat commented 2 years ago

Likely for the hesin_diag table.

Baseline field 41270 also has ICD10-coded values, but this field is not mapped.

MaximMoinat commented 2 years ago

We already solved this in the Caliber mapping: https://github.com/thehyve/ohdsi-etl-caliber/blob/master/sql/functions/mapIcdCode.sql

SofiaMp commented 2 years ago

Currently, the script handling the hesin_diag table includes a function to handle special ICD10 formats. I tested the Y05.3 code and it seems to be working as expected. @vpapez could you please let me know where you found these codes to be missing?

vpapez commented 2 years ago

Oh, then this is my bad. I checked the mapping coverage by searching for the source ICD10 codes in the OMOP vocabulary table where vocabulary_id='ICD10'. That does not work here, as Y059 is not in the OMOP vocabulary table. I had no idea that your function is part of the mapping process. Indeed, all source records containing Y050, Y054, Y055, and Y059 are correctly mapped into the condition occurrence table with concept ID 4093699 (the standard mapping for Y05). This is great in terms of converted data! I'm now just wondering how to correctly calculate the mapping coverage given the results of the function, but that is not an issue for this task. Thank you for correcting me!
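
One possible way to approach the coverage question is to report direct matches and parent-level matches separately. The sketch below assumes the same prefix fallback as the mapping function. The helper names, the set of known codes, and the sample records are all hypothetical (ZZZ9 is a made-up code that matches nothing).

```python
from collections import Counter
from typing import Dict, Iterable, Set


def resolve(code: str, known_codes: Set[str]) -> str:
    """Classify a source code as 'direct', 'parent', or 'unmapped'."""
    candidate = code.strip().upper().replace(".", "")
    if candidate in known_codes:
        return "direct"
    # Fall back to shorter prefixes, stopping at the three-character level.
    while len(candidate) > 3:
        candidate = candidate[:-1]
        if candidate in known_codes:
            return "parent"
    return "unmapped"


def coverage(source_codes: Iterable[str], known_codes: Set[str]) -> Dict[str, float]:
    """Return the fraction of records in each category."""
    codes = list(source_codes)
    counts = Counter(resolve(c, known_codes) for c in codes)
    total = len(codes)
    kinds = ("direct", "parent", "unmapped")
    if total == 0:
        return {kind: 0.0 for kind in kinds}
    return {kind: counts[kind] / total for kind in kinds}


# Hypothetical inputs: codes present in the vocabulary, and one code per source record.
known = {"Y05", "I10"}
records = ["Y050", "Y054", "Y055", "Y059", "I10", "I10", "I10", "ZZZ9"]
print(coverage(records, known))
# {'direct': 0.375, 'parent': 0.5, 'unmapped': 0.125}
```

Here, total coverage would be the sum of the direct and parent fractions, while the parent fraction alone shows how much of the mapping relies on the fallback.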