Open alistairewj opened 6 years ago
Updated function here: https://github.com/MIT-LCP/mimic-omop/commit/e1f5b984f2a0b120591aa2eb09fe7514453d5bc7
Both extract functions use looks_like_value to check whether the text is a valid number (no matter the format).
I also added a fix to pick up the number 1,000,000 here: https://github.com/MIT-LCP/mimic-omop/commit/ed7896ad87dabdcdbca6a3c363dc884ab0816d40
I'll see about adding some tests for this before I close the issue.
Great. I have not reviewed anything, but be careful: that function is also used for lab values, so this might break things there, and we may need to use a different function.
Okay, let me check!
Hm, I see no differences, but the ETL seems broken for negative lab values: it drops the negative sign.
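The cause of the dropped sign is not traced in this thread. One common way it can happen, shown here purely as an assumption and not as the repository's actual regex, is a pattern that matches digits only, so a leading minus is never captured. A minimal Python sketch (the names UNSIGNED, SIGNED and first_number are hypothetical):

```python
import re

# Assumption for illustration: a digits-only pattern versus a sign-aware one.
# Neither is taken from the mimic-omop repository.
UNSIGNED = re.compile(r"\d+(?:\.\d+)?")
SIGNED = re.compile(r"[+-]?\d+(?:\.\d+)?")


def first_number(text, pattern):
    """Return the first match of pattern in text as a float, or None."""
    match = pattern.search(text)
    return float(match.group(0)) if match else None


print(first_number("-1.5", UNSIGNED))  # sign is lost
print(first_number("-1.5", SIGNED))    # sign is kept
```

If the lab ETL behaves like the first pattern, allowing an optional leading sign would be the smallest fix, but that needs checking against the real function.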
I was looking through the PRESCRIPTIONS -> DRUG_EXPOSURE ETL and came across the use of the extract_value function here: https://github.com/MIT-LCP/mimic-omop/blob/68e521268137abe2f5f44fd03c104f48aa5b3d2f/etl/StandardizedClinicalDataTables/DRUG_EXPOSURE/etl.sql#L40
The function seems to extract numbers from text, which is great, but it assumes that either a comma or a period may be used as the decimal separator, e.g. it allows 1.05 and 1,05 to both map to the same number. However, this is a regional convention, and in MIMIC a comma is almost always used as a thousands separator.
For MIMIC prescriptions, we see this is bad:
Returns:
I think there should be one function for European ETL and one function for non-European ETL. I am going to separate the functions and have extract_value_decimal_sep and looks_like_value_decimal_sep to be used with UK/American style separators. I'll fix this for prescriptions and leave this issue open because we should check the other ETLs for this bug.
grep tells us to look at:
, extract_value(dose_val_rx) as quantity --extract quantity from pure numeric when possible
, extract_value(value) as value_as_number
, extract_value(value) as value_as_number
, extract_value(dilution_comparison) as value_as_number
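To make the separator problem concrete, here is a rough Python sketch of how the two proposed functions could behave. The names extract_value_decimal_sep and looks_like_value_decimal_sep come from this issue, but the bodies are assumptions written for illustration and are not the SQL in the repository. extract_value_either_sep is a hypothetical stand-in for the current behaviour, where a comma or a period is accepted as the decimal separator.

```python
import re
from typing import Optional

# US/UK convention (assumed): ',' groups thousands, '.' marks the decimal point.
_US_NUMBER = re.compile(r"^[+-]?(\d{1,3}(,\d{3})+|\d+)(\.\d+)?$")
# Hypothetical model of the current behaviour: ',' or '.' as decimal separator.
_EITHER_NUMBER = re.compile(r"^[+-]?\d+([.,]\d+)?$")


def looks_like_value_decimal_sep(text: Optional[str]) -> bool:
    """True when text is a number written with US/UK style separators."""
    return text is not None and _US_NUMBER.match(text.strip()) is not None


def extract_value_decimal_sep(text: Optional[str]) -> Optional[float]:
    """Parse text as a number, treating commas as thousands separators."""
    if text is None or not looks_like_value_decimal_sep(text):
        return None
    return float(text.strip().replace(",", ""))


def extract_value_either_sep(text: Optional[str]) -> Optional[float]:
    """Parse text as a number, treating a comma like a decimal point."""
    if text is None or _EITHER_NUMBER.match(text.strip()) is None:
        return None
    return float(text.strip().replace(",", "."))


for dose in ["1.05", "1,05", "1,000", "1,000,000", "-2.5"]:
    print(dose, extract_value_either_sep(dose), extract_value_decimal_sep(dose))
```

Under these assumptions a dose written as 1,000 is read as 1.0 by the either-separator version and as 1000.0 by the US/UK version, while 1,05 is rejected by the US/UK version instead of being silently read as 1.05.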