OHDSI / Themis

Repository for OMOP CDM conventions as defined by THEMIS. These can be reference lists of concepts, pieces of standardized code for data generation or quality certification, and debates.
Apache License 2.0

Can we populate CONDITION_OCCURRENCE.condition_end_date with another date or '2999-12-31' if we don't have this information? #173

Closed jiawei-qian closed 2 months ago

jiawei-qian commented 2 months ago

Can we populate CONDITION_OCCURRENCE.condition_end_date with another date or '2999-12-31' if we don't have this information?

CDM or THEMIS convention?

THEMIS

Table or Field level?

Field

Is this a general convention?

No

Summary of issues

  1. In populating the condition_occurrence table, we’ve been trying to figure out how to best determine accurate condition_periods. Basically, we’re trying to determine the best way to populate the condition_end_date column of the condition_occurrence table.

  2. Is there any reason that I cannot use ‘2999-12-31’ instead of NULL for CONDITION_OCCURRENCE.condition_end_date?

Summary of answer

  1. Christian: _You do - nothing. Unless you have specific information when a condition ended (was resolved, the infection healed or anything like that) you leave the condition_end_date empty._ That way, it becomes a snapshot diagnosis or recording at that time. The CONDITION_ERA table is what will try to infer the actual duration of a condition that is recorded multiple times over time. It uses a standardized algorithm for doing that.

  2. Chris Knoll: _Everyone has an observation_period, and things shouldn’t extend past their observation period._ Taking your example with the end date of 2999-12-31, what if you ask your data ‘What’s the average duration of condition X’ and some of the people have 1700 years of duration…isn’t that a problem? And the statement of ‘ongoing’ is also problematic…if you use getdate() as a fill-in for null end dates for purposes of establishing end dates, then as time goes on, the duration of the condition gets longer and longer because as you move forward in ‘reality time’, the getdate() becomes later, and you get different values of duration (start date → getdate()).

  3. Melanie Philofsky: _I wouldn’t put in a condition_end_date unless your data specifically states a condition has stopped._ The condition_end_date is not mandatory and most conditions do not have a known end date for 2 reasons. 1) Chronic conditions are usually lifelong events: high blood pressure, diabetes, asthma, etc. 2) If the condition is not chronic (sprained wrist, influenza, etc.), the Person doesn’t visit the medical institution to declare their disease-free state.
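
To make the duration argument above concrete, here is a minimal Python sketch. It is only an illustration, not part of any THEMIS convention or OHDSI tooling: the three records, the `SENTINEL` constant, and the `mean_duration_days` helper are all hypothetical. It compares the average duration computed from known end dates only with the average obtained when a missing end date is filled with '2999-12-31' or with a moving "today" value.

```python
from datetime import date
from statistics import mean

# Hypothetical placeholder end date taken from the question in this issue.
SENTINEL = date(2999, 12, 31)

# Hypothetical records: (condition_start_date, condition_end_date or None).
records = [
    (date(2024, 1, 10), date(2024, 1, 20)),  # end date known
    (date(2024, 3, 1), date(2024, 3, 15)),   # end date known
    (date(2024, 5, 5), None),                # end date unknown (NULL)
]


def mean_duration_days(rows, fill=None):
    """Mean duration in days.

    Rows without an end date are skipped unless `fill` is given,
    in which case `fill` is used as the end date.
    """
    durations = []
    for start, end in rows:
        if end is None:
            end = fill
        if end is None:
            continue  # no end date and no fill value: leave the row out
        durations.append((end - start).days)
    return mean(durations)


# NULL end dates left alone: only the two known durations contribute.
known_only = mean_duration_days(records)

# Sentinel fill: one record contributes a duration of many centuries.
with_sentinel = mean_duration_days(records, fill=SENTINEL)

# "Today" fill, evaluated on two different days: the answer drifts over time.
filled_2024 = mean_duration_days(records, fill=date(2024, 12, 31))
filled_2025 = mean_duration_days(records, fill=date(2025, 12, 31))

print("known end dates only:", known_only)
print("filled with 2999-12-31:", with_sentinel)
print("filled with 'today' (2024 vs 2025):", filled_2024, filled_2025)
```

With the end date left empty, the unknown record simply stays out of the duration calculation. With either fill strategy, the result depends on an arbitrary date rather than on the source data.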

Related links

OHDSI Forum (Calculating condition periods (condition_end_date)): https://forums.ohdsi.org/t/calculating-condition-periods-condition-end-date/737/2

OHDSI Forum (CONDITION_END_DATE as ‘2999-12-31’ instead of NULL?): https://forums.ohdsi.org/t/condition-end-date-as-2999-12-31-instead-of-null/18683/4

OHDSI Forum (Patient-Reported Drugs and Conditions): https://forums.ohdsi.org/t/patient-reported-drugs-and-conditions/3152/33

Other comments/notes

Three OHDSI Forum posts discuss the condition_end_date issue. I captured the full answers from three people. Each makes good points on these questions, and the convention can draw on the best of them.

clairblacketer commented 2 months ago

Thanks @jiawei-qian. Since condition_end_date is not required, we don't want to give any imputation strategies, because doing so would imply that the date needs to be filled.