FINNGEN / kanta_lab_harmonisation_public

https://finngen.github.io/kanta_lab_harmonisation_public/
MIT License

osuus #1

Closed piotor87 closed 2 months ago

piotor87 commented 2 months ago

So, looking at the entries that do not get OMOP-mapped successfully, these are the ones with >1k counts after removing NAs:

| ABBREVIATION | UNIT | COUNT |
|---|---|---|
| b-hkr | osuus | 2012018 |
| b-erytrosyytit,tilavuusosuus | osuus | 20424 |
| b-pvk+tkd,hkr | osuus | 6946 |
| p-vrab-o | estimate | 4145 |
| b-retind | osuus | 3665 |
| b-erytrosyyttientilavuusosuus | osuus | 3643 |
| u-ph-o | N | 2529 |
| erytrosyytit,tilavuusosuus | osuus | 1659 |

osuus seems to be the problematic unit, as it can be either a percentage [0,100] or a ratio [0,1].

This is the distribution of values for the osuus entries: [image]

For b-hkr and b-erytrosyytit,tilavuusosuus the data is mixed, but more than 99% of the values are ratios. For the other entries, the values are 100% consistent.
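A minimal sketch of how such a check could be automated, assuming the values are available as plain numbers per abbreviation. The function names and the 0.99 threshold are illustrative assumptions, not part of the existing pipeline. Note that a value in [0,1] is also a valid (very small) percentage, so this is only a heuristic.

```python
from collections import defaultdict


def fraction_in_unit_interval(values):
    """Return the share of values that lie in [0, 1]."""
    values = list(values)
    if not values:
        raise ValueError("no values to inspect")
    return sum(0.0 <= v <= 1.0 for v in values) / len(values)


def guess_target_unit(values, threshold=0.99):
    """Heuristically label a set of 'osuus' values.

    'RATIO' if at least `threshold` of the values are in [0, 1],
    '%' if at most 1 - `threshold` of them are, otherwise 'MIXED'.
    The threshold is an assumption chosen to mirror the "99%+" remark.
    """
    share = fraction_in_unit_interval(values)
    if share >= threshold:
        return "RATIO"
    if share <= 1.0 - threshold:
        return "%"
    return "MIXED"


def guess_by_abbreviation(rows, threshold=0.99):
    """Group (abbreviation, value) pairs and label each group."""
    grouped = defaultdict(list)
    for abbreviation, value in rows:
        grouped[abbreviation].append(value)
    return {abbr: guess_target_unit(vals, threshold) for abbr, vals in grouped.items()}
```

For example, with made-up values, `guess_by_abbreviation([("b-hkr", 0.41), ("b-hkr", 0.39), ("b-retind", 1.2), ("b-retind", 2.5)])` labels `b-hkr` as `RATIO` and `b-retind` as `%`.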

Thus, here is a suggested update for the unit_table:

| ABBREVIATION | UNIT | TARGET_UNIT |
|---|---|---|
| b-hkr | osuus | RATIO |
| b-erytrosyytit,tilavuusosuus | osuus | RATIO |
| b-pvk+tkd,hkr | osuus | RATIO |
| b-retind | osuus | % |
| b-erytrosyyttientilavuusosuus | osuus | % |
| erytrosyytit,tilavuusosuus | osuus | RATIO |

Ideally, we can then add them to the OMOP mapping and provide a ratio-to-% conversion, or handle the difference some other way.
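As a rough sketch of what the ratio-to-% conversion could look like, assuming the suggested TARGET_UNIT assignments above are adopted as-is. The dictionary and the `to_percent` helper are hypothetical names for illustration. Values outside the expected range (such as the small share of non-ratio values seen for b-hkr) are returned as `None` so they can be reviewed rather than silently converted.

```python
# Suggested TARGET_UNIT per abbreviation, copied from the proposal above.
SUGGESTED_TARGET_UNIT = {
    "b-hkr": "RATIO",
    "b-erytrosyytit,tilavuusosuus": "RATIO",
    "b-pvk+tkd,hkr": "RATIO",
    "b-retind": "%",
    "b-erytrosyyttientilavuusosuus": "%",
    "erytrosyytit,tilavuusosuus": "RATIO",
}


def to_percent(abbreviation, value):
    """Express an 'osuus' value as a percentage.

    Returns None when the value falls outside the range implied by the
    suggested target unit; raises KeyError for unknown abbreviations.
    """
    target_unit = SUGGESTED_TARGET_UNIT.get(abbreviation)
    if target_unit is None:
        raise KeyError(f"no suggested target unit for {abbreviation!r}")
    if target_unit == "%":
        # Already a percentage: keep it if it is in [0, 100].
        return value if 0.0 <= value <= 100.0 else None
    # RATIO: scale to a percentage if it is in [0, 1].
    return value * 100.0 if 0.0 <= value <= 1.0 else None
```

With this sketch, `to_percent("b-hkr", 0.5)` gives `50.0`, while `to_percent("b-hkr", 41.0)` gives `None` because 41.0 is not a plausible ratio.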