Closed vincent-octo closed 1 month ago
Just one thing: for consistency with the input columns, I kept MEASUREMENT_STATUS, because in the original data that column is grouped with the other measurement columns:
'tutkimustulosarvo': 'MEASUREMENT_VALUE',
'tutkimustulosyksikko': 'MEASUREMENT_UNIT',
'tutkimusvastauksentila': 'MEASUREMENT_STATUS',
'tuloksenpoikkeavuus': 'RESULT_ABNORMALITY',
We could discuss changing the column names, as in my opinion some are a bit confusing at the moment. The idea is to make the naming more consistent and more explicit.
The suggested renaming would be based on — and differentiate between — these concepts:
TEST_
MEASUREMENT_
RESULT_
This would mean we will not have the same column names as in Kira's output, but that's OK.
Suggested column renaming:
- LAB_DATE_TIME => MEASUREMENT_DATE_TIME
  We would probably need to confirm that this datetime refers to the measurement itself and not, for example, to when the result arrived in the Kanta system.
- LAB_SERVICE_PROVIDER => SERVICE_PROVIDER
  Though the variable for this might be missing in the FinnGen raw data.
- LAB_ID => TEST_ID
- LAB_ID_SOURCE => TEST_ID_SOURCE
- LAB_ABBREVIATION => TEST_NAME_ABBREVIATION
- LAB_VALUE => MEASUREMENT_VALUE
- LAB_UNIT => MEASUREMENT_UNIT
- LAB_ABNORMALITY => RESULT_ABNORMALITY
- ~~MEASUREMENT_STATUS => RESULT_STATUS~~ (keep MEASUREMENT_STATUS, see https://github.com/FINNGEN/kanta_lab_preprocessing/issues/15#issuecomment-2112011424)
- REFERENCE_VALUE_TEXT => TEST_REFERENCE_VALUE
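As a rough illustration, the renaming listed above could be expressed as a plain mapping applied to a header. This is only a sketch: `COLUMN_RENAMES` and `rename_header` are hypothetical names, not code from the repository, and the example header is a made-up partial one rather than the real file header.

```python
# Proposed old -> new column names, copied from the list above.
# MEASUREMENT_STATUS is deliberately absent: it keeps its current name.
COLUMN_RENAMES = {
    "LAB_DATE_TIME": "MEASUREMENT_DATE_TIME",
    "LAB_SERVICE_PROVIDER": "SERVICE_PROVIDER",
    "LAB_ID": "TEST_ID",
    "LAB_ID_SOURCE": "TEST_ID_SOURCE",
    "LAB_ABBREVIATION": "TEST_NAME_ABBREVIATION",
    "LAB_VALUE": "MEASUREMENT_VALUE",
    "LAB_UNIT": "MEASUREMENT_UNIT",
    "LAB_ABNORMALITY": "RESULT_ABNORMALITY",
    "REFERENCE_VALUE_TEXT": "TEST_REFERENCE_VALUE",
}


def rename_header(header):
    """Return a new header list; columns without a mapping pass through unchanged."""
    return [COLUMN_RENAMES.get(name, name) for name in header]


# Hypothetical partial header, for illustration only.
example_header = ["LAB_DATE_TIME", "LAB_ID", "LAB_VALUE", "MEASUREMENT_STATUS"]
print(rename_header(example_header))
```

Keeping the mapping in a single dictionary would make it easy to review the names in one place, and any column not listed (such as MEASUREMENT_STATUS) is left untouched.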
So in the end, we would change from this header:
to this header: