Improving column names

We could discuss changing the column names, as in my opinion some are a bit confusing at the moment. The idea is to make the naming more consistent and to avoid implicit meanings.

The suggested renaming would be based on — and differentiate between — these concepts:

a lab test, column prefix: TEST_
a measurement for a lab test, column prefix: MEASUREMENT_
a result for a lab test, column prefix: RESULT_
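
As a rough illustration of the three concepts above, here is a minimal sketch that classifies a column name by its prefix. The helper name `concept_of` and the constant `CONCEPT_PREFIXES` are hypothetical and not part of the repository; only the three prefixes come from this issue.

```python
from typing import Optional

# Prefixes proposed in this issue; the constant name is illustrative only.
CONCEPT_PREFIXES = ("TEST_", "MEASUREMENT_", "RESULT_")


def concept_of(column: str) -> Optional[str]:
    """Return the concept a column belongs to, or None if it carries no concept prefix."""
    for prefix in CONCEPT_PREFIXES:
        if column.startswith(prefix):
            # Drop the trailing underscore so "TEST_" is reported as "TEST".
            return prefix.rstrip("_")
    return None
```

Columns such as FINREGISTRYID or SERVICE_PROVIDER would return `None` here, since they do not belong to any of the three concepts.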

This would mean we will not have the same column names as in Kira's output, but that's OK.

Suggested column renaming:

LAB_DATE_TIME => MEASUREMENT_DATE_TIME (we would probably need to confirm that this datetime refers to the measurement itself and not, for example, to when the result arrived in the Kanta system)
LAB_SERVICE_PROVIDER => SERVICE_PROVIDER (though the variable for this might be missing in the FinnGen raw data)
LAB_ID => TEST_ID
LAB_ID_SOURCE => TEST_ID_SOURCE
LAB_ABBREVIATION => TEST_NAME_ABBREVIATION
LAB_VALUE => MEASUREMENT_VALUE
LAB_UNIT => MEASUREMENT_UNIT
LAB_ABNORMALITY => RESULT_ABNORMALITY
~MEASUREMENT_STATUS => RESULT_STATUS~ (keep MEASUREMENT_STATUS, see https://github.com/FINNGEN/kanta_lab_preprocessing/issues/15#issuecomment-2112011424)
REFERENCE_VALUE_TEXT => TEST_REFERENCE_VALUE
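
The renaming list above could be captured as a simple mapping. This is only a sketch, not the pipeline's actual code: the names `RENAMES` and `rename_header` are hypothetical, and it assumes the header is a single tab-separated line. MEASUREMENT_STATUS is deliberately left out of the mapping, since the list says to keep it.

```python
# Old name -> new name, taken from the list in this issue.
# MEASUREMENT_STATUS is intentionally absent: it keeps its current name.
RENAMES = {
    "LAB_DATE_TIME": "MEASUREMENT_DATE_TIME",
    "LAB_SERVICE_PROVIDER": "SERVICE_PROVIDER",
    "LAB_ID": "TEST_ID",
    "LAB_ID_SOURCE": "TEST_ID_SOURCE",
    "LAB_ABBREVIATION": "TEST_NAME_ABBREVIATION",
    "LAB_VALUE": "MEASUREMENT_VALUE",
    "LAB_UNIT": "MEASUREMENT_UNIT",
    "LAB_ABNORMALITY": "RESULT_ABNORMALITY",
    "REFERENCE_VALUE_TEXT": "TEST_REFERENCE_VALUE",
}


def rename_header(header_line: str, sep: str = "\t") -> str:
    """Rename known columns in a header line; unknown columns pass through unchanged."""
    # Assumption: columns are separated by `sep` (tab by default).
    columns = header_line.rstrip("\n").split(sep)
    return sep.join(RENAMES.get(column, column) for column in columns)
```

Columns that are not in the mapping (FINREGISTRYID, MEASUREMENT_STATUS) pass through untouched, so the function can be applied to the full header without special cases.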

So in the end, we would change from this header:

FINREGISTRYID   LAB_DATE_TIME   LAB_SERVICE_PROVIDER    LAB_ID  LAB_ID_SOURCE   LAB_ABBREVIATION    LAB_VALUE   LAB_UNIT    LAB_ABNORMALITY REFERENCE_VALUE_TEXT    MEASUREMENT_STATUS

to this header:

FINREGISTRYID   MEASUREMENT_DATE_TIME   SERVICE_PROVIDER    TEST_ID TEST_ID_SOURCE  TEST_NAME_ABBREVIATION  MEASUREMENT_VALUE   MEASUREMENT_UNIT    RESULT_ABNORMALITY  TEST_REFERENCE_VALUE    MEASUREMENT_STATUS

FINNGEN / kanta_lab_preprocessing

Improving column names #15