EHDEN / ETL-UK-Biobank

ETL UK-Biobank
https://ehden.github.io/ETL-UK-Biobank/

GP Clinical target domain #113

Closed MaximMoinat closed 3 years ago

MaximMoinat commented 3 years ago

Background

Currently we force the records to map to the Measurement domain (via the stem table).

For example, READ code 2469.00 maps to the SNOMED concept "O/E - Systolic BP reading", which is in the Condition domain. But this record can have a numeric value, which cannot be stored in the Condition table.

Issue

We will get domain mismatches (condition concepts in the measurement table).

Potential solutions

MaximMoinat commented 3 years ago

According to CPRD, these are the most frequent codes that map to the Condition domain while having a value. These might have to be handled by overriding the mapping with a SNOMED code.

'246..00', '242..00', '2E3..11', '246A.00', '2469.00', '636..00', '636..12', '62X..00', '636..11', '463..11', '463..00', '58E1.00', '58E2.00', '58E8.00', '58EJ.00', '58EK.00', '312J.00', '3AE..00', '312H.00', '2H9..00', '1511.00', '152..00', '22Q4.00'
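One way to use this list, sketched here as an assumption rather than the project's actual code, is a lookup set that flags records whose source code is known to land in the Condition domain despite carrying a value. The names `CONDITION_CODES_WITH_VALUE` and `needs_domain_review` are hypothetical, and no target SNOMED codes are given because the thread does not specify them.

```python
# Read codes copied from the list above: frequent codes that map to the
# Condition domain while having a value.
CONDITION_CODES_WITH_VALUE = frozenset({
    '246..00', '242..00', '2E3..11', '246A.00', '2469.00', '636..00',
    '636..12', '62X..00', '636..11', '463..11', '463..00', '58E1.00',
    '58E2.00', '58E8.00', '58EJ.00', '58EK.00', '312J.00', '3AE..00',
    '312H.00', '2H9..00', '1511.00', '152..00', '22Q4.00',
})


def needs_domain_review(read_code: str) -> bool:
    """Hypothetical helper: True if the code is on the list above."""
    return read_code in CONDITION_CODES_WITH_VALUE
```

A record flagged this way would still need a manually chosen replacement mapping. The set only identifies candidates.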

MaximMoinat commented 3 years ago

In some cases, if the value is missing or irrelevant, a value of 0 is given. This is also the most frequent value in tpp_gp_clinical. See the finding on code 'G801.' in #315.

MaximMoinat commented 3 years ago

Another observation is that codes from the covid19 GP clinical tables have a better domain mapping, as the source is either SNOMED (EMIS), local codes (EMIS) or CTV3 (TPP). The custom CTV3 to SNOMED and EMIS to SNOMED mappings already map to (mostly) the right domains. The same holds for the SNOMED source codes, with exceptions such as O/E - blood pressure reading.

MaximMoinat commented 3 years ago

Suggested solution for covid GP clinical (TPP, EMIS) mappings: We will ONLY override the target domain of a record to be 'Measurement' if the record has a numeric value other than 0, or if it has a value concept.

In pseudocode, to be implemented in both the emis and tpp gp_clinical to stem table transformations:

domain_id = None
if value_as_number is not None and value_as_number != 0:
    domain_id = 'Measurement'
elif value_as_concept_id is not None:
    domain_id = 'Measurement'
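
The pseudocode above can be wrapped in a small runnable function. This is a minimal sketch, not the implemented transformation: the function name `override_domain` and its signature are assumptions for illustration. A return value of `None` is taken to mean "no override, keep the domain of the mapped target concept".

```python
from typing import Optional


def override_domain(value_as_number: Optional[float],
                    value_as_concept_id: Optional[int]) -> Optional[str]:
    """Return 'Measurement' if the record carries a usable value, else None."""
    # A numeric value of 0 is treated as a placeholder for a missing or
    # irrelevant value, so it does not trigger the override.
    if value_as_number is not None and value_as_number != 0:
        return 'Measurement'
    # A coded (non-numeric) value also triggers the override.
    if value_as_concept_id is not None:
        return 'Measurement'
    # No usable value: leave the domain to the target concept.
    return None
```

For example, `override_domain(120.0, None)` gives `'Measurement'`, while `override_domain(0, None)` gives `None`, so a record with the placeholder value 0 keeps the domain of its mapped concept.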

Todo: