AmsterdamUMC / AMSTEL

AmsterdamUMCdb conversion to the OMOP Common Data Model
https://amsterdamumc.github.io/AMSTEL/

Oxygen saturation is both a fraction and a percentage #2

Open daplaci opened 5 months ago

daplaci commented 5 months ago

Hello,

Thanks a lot for sharing this ETL. I have a question regarding the concept_id 42869608 - Oxygen saturation [Pure mass fraction] in Blood.

For this concept, value_as_number contains both fractions and percentages, although the value is supposed to be a 'Decimal mass fraction'. Is this because two different sources map to the same concept? One workaround would be to filter on unit_source_value, but I wonder whether the sources should instead map to separate concept_ids directly.
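
A minimal sketch of that workaround, assuming an OMOP-style measurement table. The rows and the unit_source_value strings ('fraction' and '%') are made-up placeholders, not the values actually present in the converted data. It summarises the value range per unit_source_value so the two scales become visible:

```python
import sqlite3

# In-memory toy table with only the columns needed for this illustration.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE measurement ("
    " measurement_concept_id INTEGER,"
    " value_as_number REAL,"
    " unit_source_value TEXT)"
)

# Hypothetical rows: the unit_source_value strings are placeholders.
rows = [
    (42869608, 0.97, "fraction"),
    (42869608, 0.92, "fraction"),
    (42869608, 96.0, "%"),
    (42869608, 88.0, "%"),
]
con.executemany("INSERT INTO measurement VALUES (?, ?, ?)", rows)

# Value range per source unit for the oxygen saturation concept.
summary = con.execute(
    """
    SELECT unit_source_value,
           COUNT(*)             AS n,
           MIN(value_as_number) AS min_value,
           MAX(value_as_number) AS max_value
    FROM measurement
    WHERE measurement_concept_id = 42869608
    GROUP BY unit_source_value
    ORDER BY unit_source_value
    """
).fetchall()

for unit, n, lo, hi in summary:
    print(unit, n, lo, hi)
```

A group whose maximum is at most 1 is consistent with fractions, while a group reaching values well above 1 points to percentages.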

patrickthoral commented 5 months ago

Indeed, a minority of the values most likely represent percentages. The main cause is manual entry of values by providers, either in the early days when no automatic lab interface was available yet or during a technical problem. In fact, you can even see values that have to be wrong (> 100), most likely because the provider entered data belonging to another field (e.g. sodium or chloride), since these were printed on the same form by the blood gas analyzer.
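
As a rough illustration of how such values could be separated downstream, the sketch below labels a saturation value by magnitude. The function name and the cut-offs (at most 1 is read as a fraction, above 1 up to 100 as a percentage, anything else as implausible) are assumptions made for this example and are not part of the ETL:

```python
from typing import Optional, Tuple


def classify_saturation(value: float) -> Tuple[str, Optional[float]]:
    """Label a raw saturation value and, where possible, return it as a fraction.

    The thresholds are heuristic assumptions, not rules taken from the ETL.
    """
    if value < 0 or value > 100:
        # Cannot be a saturation on either scale, e.g. a sodium value.
        return ("implausible", None)
    if value <= 1:
        # Already on the 0..1 scale.
        return ("fraction", value)
    # Between 1 and 100: most likely entered as a percentage.
    return ("percentage", value / 100)


for raw in (0.97, 96.0, 140.0):
    print(raw, classify_saturation(raw))
```

Note that a value of exactly 1 is ambiguous (100% as a fraction or 1% as a percentage); this sketch treats it as a fraction.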

In the original version of AmsterdamUMCdb these errors were easier to identify because there were two provider columns: one for the original provider (registeredby) and one for the provider that updated the record (updatedby). However, the patient data management system did not overwrite manually entered values with those from the lab system (provider 'Systeem'), even though the updatedby column was updated. The OMOP CDM only has one provider_id column. I will update the ETL to keep the original provider so it's easier to identify values that were created manually.

patrickthoral commented 4 months ago

@daplaci I've updated the ETL to allow the user to determine the source of the measurement value based on type_concept_id. Manually entered data will receive the 32817 (EHR) type_concept_id and interfaced data from the lab system will have 32856 (Lab).

As far as I know there is no standardized way to indicate the difference between manually entered data and data from an electronic interface in an OMOP CDM, but at least it's possible to recognize it now:

    CASE
      WHEN n.islabresult = b'1' AND n.registeredby = 'Systeem' THEN 32856   -- Lab
      ELSE 32817    -- EHR
    END AS type_concept_id,
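
On the query side, this could be used roughly as follows. This is a sketch on made-up rows, assuming the value ends up in the measurement table's measurement_type_concept_id column (the standard OMOP CDM name for this field); it is not taken from the ETL itself:

```python
import sqlite3

LAB = 32856  # interfaced from the lab system
EHR = 32817  # manually entered

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE measurement ("
    " measurement_concept_id INTEGER,"
    " measurement_type_concept_id INTEGER,"
    " value_as_number REAL)"
)

# Hypothetical rows for illustration only.
rows = [
    (42869608, LAB, 0.97),
    (42869608, LAB, 0.94),
    (42869608, EHR, 96.0),
    (42869608, EHR, 140.0),
]
con.executemany("INSERT INTO measurement VALUES (?, ?, ?)", rows)


def values_by_type(type_concept_id: int) -> list:
    """Return the oxygen saturation values recorded with the given type."""
    cur = con.execute(
        "SELECT value_as_number FROM measurement"
        " WHERE measurement_concept_id = ?"
        " AND measurement_type_concept_id = ?"
        " ORDER BY value_as_number",
        (42869608, type_concept_id),
    )
    return [row[0] for row in cur]


lab_values = values_by_type(LAB)
manual_values = values_by_type(EHR)
print("lab:", lab_values)
print("manual:", manual_values)
```

Restricting an analysis to the Lab type excludes the manually entered rows, which can then be reviewed separately.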