National-COVID-Cohort-Collaborative / Data-Ingestion-and-Harmonization

Preferred Units for Lab test #19

Open vojtechhuser opened 4 years ago

vojtechhuser commented 4 years ago

relevant links shared during the webinar

units https://github.com/vojtechhuser/ThemisConcepts/blob/master/extras/results2019/S7-preferred_units-ABC.csv

Thresholds (min max) https://github.com/vojtechhuser/DataQuality/tree/master/extras/DqdResults

Per Chris Chute: the units must also be harmonized with FHIR US Core (US context only)

hlehmann17 commented 4 years ago

Vojtech –

As always, great to hear from you.

I know I should put the following on Git, but it’s too broad for a single issue. As you heard, I am anticipating some friction between what the analysts expect and what we think we’re supposed to deliver. For instance, it would seem to me that the first thing the analysts do is “clean” the data.

Now, in your experience, can you articulate what sort of friction you have bumped into? I tend to think at a high level, but I know you like to get into the weeds, so feel free to tell me about some classic “weeds.”

H

From: Vojtech Huser, Wednesday, May 20, 2020 at 11:41 AM. Subject: [National-COVID-Cohort-Collaborative/Data-Ingestion-and-Harmonization] Preferred Units for Lab test (#19)

pambanning commented 4 years ago

Hello, Just trying to get my feet wet in this project. Apologies in advance if this is misdirected.

  1. Are you tying preferred units to the analyte/measurement from the lab or to the usage of the LOINC or SNOMED CT code? Asking because there are a handful that appear to be using the LOINC long common name as a display (which indicates property), and the units do not belong to that IUPAC property (you'll see the inversion of [mass/mol] against units):

concept_code.y | concept_name.y
Alpha-1-Fetoprotein [Units/volume] in Serum or Plasma | ng/mL
Ammonia [Mass/volume] in Plasma | umol/L
Calcium [Moles/volume] in Serum or Plasma | mg/dL
Calcium.ionized [Mass/volume] in Blood | mmol/L
Creatinine [Moles/volume] in Blood | mg/dL
Homocysteine [Mass/volume] in Serum or Plasma | umol/L
Lactate [Mass/volume] in Blood | mmol/L
Methylmalonate [Mass/volume] in Serum or Plasma | nmol/L
Phosphate [Moles/volume] in Serum or Plasma | mg/dL
Pyridoxine [Mass/volume] in Serum or Plasma | nmol/L
Sex hormone binding globulin [Mass/volume] in Serum or Plasma | nmol/L
Thiamine [Mass/volume] in Blood | nmol/L
Urobilinogen [Units/volume] in Urine by Test strip | mg/dL
Vascular cell adhesion molecule 1 [Mass/volume] in Serum or Plasma | ug/mL

  2. There were 4 SNOMED CT concepts listed that I take more to be content headers than actual measurements. The generalization of one unit of measure to these might not be helpful:

concept_code.y | concept_name.y | vocabulary_id.x | concept_code.x | concept_id
Blood chemistry | mmol/L | millimole per liter | SNOMED | 166312007
Enzyme measurement | ng/mL | nanogram per milliliter | SNOMED | 122444009
Measurement of liver enzyme | [U]/L | unit per liter | SNOMED | 269856004
Urine ratio | g/mol | gram per mole | SNOMED | 394953004

I would be interested in seeing the raw data, with patient values, that come in under these 4 concepts.

  3. The two Helicobacter serologies are an immune assay method, and index value (IV) is a common quantitative unit. I have never encountered eV (electron volt) as an outgoing laboratory unit of measure; it appears to be a physics unit of energy. I would ask for that to be clarified.

  4. Aside from amino acids, acylcarnitines, and fatty acids, US laboratory chemistries haven't used molar units of measure much, so seeing the lipid measurements here in mmol units surprised me.

  5. There were pairs of LOINC and SNOMED codes whose units matched perfectly. I presume there are multiple lines in the file so that each line holds a unique terminology code:

concept_code.y | concept_name.y | vocabulary_id.x | concept_code.x | concept_id
Alanine aminotransferase serum/plasma | [U]/L | unit per liter | LOINC | 1742-6
ALT - blood measurement | [U]/L | unit per liter | SNOMED | 250637003
Creatinine renal clearance in 24 hour | mL/min | milliliter per minute | LOINC | 2164-2
Measurement of renal clearance of creatinine | mL/min | milliliter per minute | SNOMED | 167181009

However, other "pairings" didn't use the same units of measure:

Random blood glucose measurement | mmol/L | millimole per liter | SNOMED | 271061004
Glucometer blood glucose | mmol/L | millimole per liter | SNOMED | 166900001
Glucose [Mass/volume] in Blood | mg/dL | milligram per deciliter | LOINC | 2339-0
Glucose lab | mg/dL | milligram per deciliter | LOINC | 2345-7
Glucose measurement, 2 hour post prandial | mmol/L | millimole per liter | SNOMED | 88856000

I don't believe it's necessary to keep SNOMED terms tied to molar units if indeed all submitted labs provide glucose in mg/dL.

  6. I have a few ideas on how to vet the file and units against the LOINC database, IF you are choosing preferred units based on the LOINC term. Please advise whether this would be helpful before I proceed. As mentioned above, I don't believe SNOMED CT is prescriptive to molar/SI units only.
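To make the property-versus-unit concern in the first item concrete, here is a minimal Python sketch of one possible check. It assumes the property can be read from the bracketed part of the LOINC long common name and that the unit's numerator is enough to tell mass from moles. The function names and the classification rules are hypothetical simplifications, not a UCUM parser and not something taken from the LOINC database or the preferred-units file.

```python
import re

# Hypothetical helpers for illustration only. The rules below are a rough
# simplification, not a UCUM parser and not part of any LOINC tooling.
PROPERTY_RE = re.compile(r"\[([^\]/]+)/[^\]]+\]")


def loinc_property_kind(long_name):
    """Return the numerator of the bracketed property, e.g. 'Mass' from '[Mass/volume]'."""
    match = PROPERTY_RE.search(long_name)
    return match.group(1).capitalize() if match else None


def unit_kind(unit):
    """Classify a unit's numerator as 'Moles', 'Mass', or 'Units' (None if unknown)."""
    numerator = unit.split("/")[0]
    if numerator.endswith("mol"):
        return "Moles"
    if numerator.endswith("g"):
        return "Mass"
    if numerator in ("[U]", "U", "[IU]", "IU"):
        return "Units"
    return None


def is_consistent(long_name, unit):
    """True/False when both sides can be classified, None when either cannot."""
    prop = loinc_property_kind(long_name)
    kind = unit_kind(unit)
    if prop is None or kind is None:
        return None
    return prop == kind


# A few rows copied from the tables in this comment.
rows = [
    ("Calcium [Moles/volume] in Serum or Plasma", "mg/dL"),
    ("Ammonia [Mass/volume] in Plasma", "umol/L"),
    ("Glucose [Mass/volume] in Blood", "mg/dL"),
    ("Alanine aminotransferase serum/plasma", "[U]/L"),
]

for name, unit in rows:
    if is_consistent(name, unit) is False:
        print(f"Property/unit mismatch: {name} | {unit}")
```

Rows without a bracketed property, or with a unit the simple rules cannot classify, return `None` so they can be set aside for manual review rather than being silently passed or failed.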
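For the glucose pairings, where the SNOMED rows carry mmol/L and the LOINC rows carry mg/dL, harmonizing to either unit is a fixed arithmetic conversion once the molar mass is known. The sketch below is only an illustration: the function names are hypothetical, and the glucose molar mass of about 180.156 g/mol is supplied here as an assumption rather than taken from the preferred-units file.

```python
# Hypothetical helpers; the molar mass is an input, not looked up anywhere.
GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.156  # C6H12O6, approximate


def mg_dl_to_mmol_l(value_mg_dl, molar_mass_g_per_mol):
    """Convert a mass concentration in mg/dL to a molar concentration in mmol/L."""
    mg_per_l = value_mg_dl * 10.0           # 1 L = 10 dL
    return mg_per_l / molar_mass_g_per_mol  # g/mol is numerically mg/mmol


def mmol_l_to_mg_dl(value_mmol_l, molar_mass_g_per_mol):
    """Convert a molar concentration in mmol/L to a mass concentration in mg/dL."""
    mg_per_l = value_mmol_l * molar_mass_g_per_mol
    return mg_per_l / 10.0


glucose_mmol_l = mg_dl_to_mmol_l(90.0, GLUCOSE_MOLAR_MASS_G_PER_MOL)
print(f"90 mg/dL glucose is about {glucose_mmol_l:.2f} mmol/L")
```

Because the factor depends on the analyte's molar mass, a conversion like this only applies per analyte; it cannot be applied to a generic concept such as "Blood chemistry".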

Looking forward to the interaction.

DaveraGabriel commented 4 years ago

Hello @vojtechhuser @hlehmann17 @pambanning - thank you for logging this issue and keeping it updated. I am adding a "map validation" label to it, although unlike the others it did not arise from our validation sessions. I think it belongs in the queue we will be working through after the initial pipeline deployment, as we refine the DI&H process to increase the quality and fidelity of our data. Do you agree? If not, I am happy to remove the label so it doesn't flow into that queue.

If that is acceptable, can we queue this topic up for one of our end-of-meeting discussions on Wednesdays? I would want to put it on the agenda on a date when all three of you are present. Is this possible?

Thanks again for your contributions!