Looking at the entries that are not successfully mapped to OMOP, these are the ones with >1k counts after removing NAs:
ABBREVIATION UNIT COUNT
b-hkr osuus 2012018
b-erytrosyytit,tilavuusosuus osuus 20424
b-pvk+tkd,hkr osuus 6946
p-vrab-o estimate 4145
b-retind osuus 3665
b-erytrosyyttientilavuusosuus osuus 3643
u-ph-o N 2529
erytrosyytit,tilavuusosuus osuus 1659
osuus ("share" in Finnish) seems to be the problematic unit, as it can be either a percentage [0,100] or a ratio [0,1].
This is the distribution of values for the osuus entries:
For b-hkr and b-erytrosyytit,tilavuusosuus the data is mixed, but more than 99% of the values are ratios. For the other entries the values are 100% consistent.
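A minimal sketch of how such a check could be done, assuming the data is available as (ABBREVIATION, value) pairs. The function name `ratio_share` and the example values are made up for illustration and are not taken from the real data. Note that a value in [0,1] could in principle also be a very small percentage, so this is only a heuristic.

```python
from collections import defaultdict


def ratio_share(rows):
    """Return, per abbreviation, the fraction of non-NA values that fall in [0, 1]."""
    in_unit_interval = defaultdict(int)
    total = defaultdict(int)
    for abbreviation, value in rows:
        if value is None:  # skip NAs, as in the counts above
            continue
        total[abbreviation] += 1
        if 0.0 <= value <= 1.0:
            in_unit_interval[abbreviation] += 1
    return {abbr: in_unit_interval[abbr] / total[abbr] for abbr in total}


# Hypothetical example values, NOT from the real data.
rows = [
    ("b-hkr", 0.41), ("b-hkr", 0.38), ("b-hkr", 42.0), ("b-hkr", None),
    ("b-retind", 12.0), ("b-retind", 35.5),
]
shares = ratio_share(rows)
```

A share close to 1 suggests the entry is reported as a ratio, a share close to 0 suggests a percentage, and anything in between points to mixed data.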
Thus, here is a suggested update for the unit_table:
ABBREVIATION UNIT TARGET_UNIT
b-hkr osuus RATIO
b-erytrosyytit,tilavuusosuus osuus RATIO
b-pvk+tkd,hkr osuus RATIO
b-retind osuus %
b-erytrosyyttientilavuusosuus osuus %
erytrosyytit,tilavuusosuus osuus RATIO
Ideally, we can then add them to the OMOP mapping and provide a ratio --> % conversion (or otherwise).
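A rough sketch of what the ratio --> % conversion could look like, using the TARGET_UNIT values suggested above. The names `TARGET_UNIT` (as a dict) and `to_percent` are hypothetical, and returning None for out-of-range values under a RATIO target is an assumption about how the small share of mixed values might be handled, not an agreed rule.

```python
# Suggested (ABBREVIATION, UNIT) -> TARGET_UNIT mapping from the table above.
TARGET_UNIT = {
    ("b-hkr", "osuus"): "RATIO",
    ("b-erytrosyytit,tilavuusosuus", "osuus"): "RATIO",
    ("b-pvk+tkd,hkr", "osuus"): "RATIO",
    ("b-retind", "osuus"): "%",
    ("b-erytrosyyttientilavuusosuus", "osuus"): "%",
    ("erytrosyytit,tilavuusosuus", "osuus"): "RATIO",
}


def to_percent(abbreviation, unit, value):
    """Convert a value to a percentage based on the suggested target unit.

    Returns None when there is no mapping, or when a RATIO entry holds a
    value outside [0, 1] (assumption: such values need manual review).
    """
    target = TARGET_UNIT.get((abbreviation, unit))
    if target == "RATIO":
        if 0.0 <= value <= 1.0:
            return value * 100.0
        return None  # likely already a percentage; left unconverted here
    if target == "%":
        return value
    return None  # entry not covered by the suggested update


converted = to_percent("b-hkr", "osuus", 0.25)
unchanged = to_percent("b-retind", "osuus", 12.0)
flagged = to_percent("b-hkr", "osuus", 42.0)
```

Values that come back as None would still need a decision, for example treating them as percentages or dropping them.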