data2health / ehr2HPO.prj

Conversion of EHR data (such as LOINC) to HPO codes

Analysis of reasons for mapping failures on real data #8

Closed pnrobinson closed 4 years ago

pnrobinson commented 5 years ago

Can we make a list of LOINC terms that were not mapped at one real-life CTSA, and look at the most common ones to identify any potential systematic reasons for failure? That is, if we are expecting the wrong codes or something like that, then the failure might not be because of incomplete curation (we know about that) but because of a logic problem in our code that we are not aware of. This analysis will try to figure that out by looking at the top 250 most common failures. If possible, it would be good to share this in a de-identified way with the entire team.
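The ranking step described above could be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `top_unmapped`, the input shapes (an iterable of LOINC codes from lab records and a set of already-annotated codes), and the toy data are all assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def top_unmapped(
    lab_codes: Iterable[str],
    annotated: Set[str],
    n: int = 250,
) -> List[Tuple[str, int]]:
    """Return the n most frequent LOINC codes that have no annotation.

    lab_codes: one LOINC code per lab record (hypothetical input shape).
    annotated: LOINC codes that already have an HPO annotation (hypothetical).
    """
    # Count only the records whose code is missing from the annotation set.
    counts = Counter(code for code in lab_codes if code not in annotated)
    # most_common sorts by descending count; ties keep first-seen order.
    return counts.most_common(n)


# Toy data for illustration only; not real site data.
labs = ["26453-1", "718-7", "26453-1", "8251-1", "718-7", "26453-1"]
annotated_codes = {"718-7"}  # treated as annotated purely for this example

result = top_unmapped(labs, annotated_codes, n=2)
print(result)
```

Because the output contains only codes and counts, a list like this should be shareable without patient-level data, although each site would still need to confirm that against its own de-identification rules.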

cgchute commented 5 years ago

I am adding Richard Zhu @richard1933 to this discussion. He oversaw our HPO conversions for clinical profiles.

kingmanzhang commented 4 years ago

Some initial results from OHSU. Top LOINC codes not yet annotated:

loinc_code marked_uninterpretable lab_count
26453-1 N 522509
28542-9 N 521789
8251-1 N 519469
30385-9 N 515670
30392-5 N 440305
58413-6 N 440281
1783-0 N 325244
8251-1 N 280741
8251-1 N 273425
38518-7 N 204944
51584-1 N 204787
10331-7 N 85944
883-9 N 85832
1743-4 N 68978
43396-1 N 56723
11277-1 N 56655
5767-9 N 53495
55368-5 N 49876
2546-0 N 49166
41284-1 N 48584
25148-8 N 48494
33393-0 N 48324
5822-2 N 48305
8251-1 N 48252
46137-6 N 48237
46138-4 N 48236
38995-7 N 48179
8251-1 N 48129
8251-1 N 47768
33905-1 N 46685
8251-1 N 46138
81178-6 N 45331
34574-4 N 44362
5799-2 N 42631
6777-7 N 38395
38892-6 N 31702
50220-3 Y 27957
50220-3 Y 27820
49541-6 N 27054
20507-0 N 25993
32721-3 N 24607
56888-1 N 24168
11039-5 N 23759
743-5 N 23642
732-8 N 23641
38518-7 N 23639
51584-1 N 23610
30424-6 N 23006
3040-3 N 21659
61151-7 N 20460
1744-2 N 19717
81178-6 N 18342
11253-2 N 18270
2026-3 N 18167
8251-1 N 18000
46425-5 N 18000
8251-1 N 18000
11580-8 N 17975
16128-1 N 17921
50984-4 N 17043
5195-3 N 16603
32684-3 N 15512
19145-2 N 14858
34652-8 N 14819
8251-1 N 14369
5195-3 N 14208
50989-3 N 13816
43403-5 Y 12636
43404-3 Y 12631
43304-5 Y 12542
43305-2 Y 12527
60026-2 N 11819
19295-5 N 11165
19642-8 N 11165
50560-2 N 10934
19244-3 N 10795
9830-1 N 10452
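
Several codes in the table above (for example 8251-1) appear on more than one row, and the thread does not say why. If the rows are meant to be combined, a per-code total could be computed with a sketch like the one below. The function name `aggregate_counts` is hypothetical, and the sample rows are only a small subset copied from the table, so the totals shown are not totals for the full list.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def aggregate_counts(rows: Iterable[str]) -> List[Tuple[Tuple[str, str], int]]:
    """Sum lab_count per (loinc_code, marked_uninterpretable) pair.

    Each row is a whitespace-separated line: code, Y/N flag, count.
    """
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for line in rows:
        code, flag, count = line.split()
        totals[(code, flag)] += int(count)
    # Largest combined count first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# A small subset of the rows above, for illustration only.
sample_rows = [
    "26453-1 N 522509",
    "8251-1 N 519469",
    "8251-1 N 280741",
    "8251-1 N 273425",
]

aggregated = aggregate_counts(sample_rows)
for (code, flag), total in aggregated:
    print(code, flag, total)
```

Keeping the flag in the key means that a code reported as both interpretable and uninterpretable would stay on separate rows rather than being merged silently.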

kingmanzhang commented 4 years ago

Transferred to loinc2hpoAnnotation. Consider this done.