pavgra commented 6 years ago

Today OMOP's death table contains a person's death_date and a cause of death. The discussion in http://forums.ohdsi.org/t/condition-occurrence-death-diagnoses/2609/52 suggests that there may be a need to store several causes of death per person, which means multiple records. Simply storing multiple records in the death table with its current structure is not the best idea: it would bring redundancy and possible ambiguity, because a person could end up with multiple death dates. That's why it is reasonable to move the death date, a one-time thing per person, into the person table and leave the death table for storing only the causes of death. But if we go further, we can see that the resulting death table basically stores the same information as the condition_occurrence table and, what is more, a cause of death is logically a condition. That's why the second step of the proposal is to replace the death table by storing causes of death in the condition_occurrence table.

cgreich commented 6 years ago

@pavgra:

So, what's your proposal?

1. Remove the DEATH table
2. Move death_date and death_datetime to PERSON
3. Merge death_type_concept_id into condition_type_concept_id of the CONDITION table
4. Change the Death Type concepts into Condition Type concepts
5. Merge cause_concept_id, cause_value and cause_source_concept_id into condition_concept_id, condition_source_value and condition_source_concept_id of the CONDITION table, respectively (since they have to be in the Condition domain anyway)

That way, we know a person died, and the causes we can get by filtering for death related Condition Type Concepts.

Sounds clean, with one exception: You need to read the Condition Type Concepts in order to decide they are causes of death. There is no modeling mechanism to make this clear.

clairblacketer commented 6 years ago

This sounds great, I have added it to the agenda for the December meeting

gklebanov commented 6 years ago

@pavgra / @cgreich - great proposal, it will really make it much cleaner.

Except for #2- why do we need to "move death_date and death_datetime to PERSON" if we can just use condition_start_date and condition_start_datetime in CONDITION_OCCURRENCE table to specify when death occurred? never mind there can be multiple valid death records.

With this approach - no modification to PERSON at all

cgreich commented 6 years ago

@gklebanov:

True. That would be even cleaner. Except that we really hate different death dates, because then the analytic has to pick the right one. Since this is an artifact of the data (unless you are a cat you can't die more than once) it should really be dealt with by the ETL.

Let's see what the community has to say.

clairblacketer commented 6 years ago

@pavgra @gklebanov

I think this is a great discussion and I would like to bring it up at our 9/4 CDM meeting as it is a perfect candidate for CDM v6.0. Can you flesh this out into a full proposal? I have a link here for a proposal template - not all of it may apply since you are requesting to remove a table but it can at least get you started.

clairblacketer commented 6 years ago

From the 9/4 meeting, proposal:

1. Remove DEATH table
2. Use the CONDITION_CONCEPT_ID for death (SNOMED)
3. Use the CONDITION_TYPE_CONCEPT_ID to denote a condition is a cause of death in the CONDITION table
   a. Take the concepts in the 'Death Type' vocabulary and put them in the 'Condition Type' Vocabulary
   b. Concept_class_id will stay as 'Death Type'

The proper vocabulary needs to be specified on how to store cause of death, this would be: WHERE vocabulary_id = 'Condition Type' and concept_class_id = 'Death Type'
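To make the filter above concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the proposal is adopted as written, so that the Death Type concepts sit in the 'Condition Type' vocabulary and keep concept_class_id = 'Death Type'. The toy tables hold only the columns needed here, and every concept id in the demo data (900001, 900002, 800001 to 800003) is a made-up placeholder, not a real OMOP concept.

```python
import sqlite3


def causes_of_death(conn):
    """Return (person_id, condition_concept_id) for conditions typed as causes of death."""
    sql = """
        SELECT co.person_id, co.condition_concept_id
        FROM condition_occurrence co
        JOIN concept c ON c.concept_id = co.condition_type_concept_id
        WHERE c.vocabulary_id = 'Condition Type'
          AND c.concept_class_id = 'Death Type'
        ORDER BY co.person_id, co.condition_concept_id
    """
    return conn.execute(sql).fetchall()


def build_cause_demo():
    """Build an in-memory toy database; all ids below are hypothetical placeholders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept (
            concept_id INTEGER PRIMARY KEY,
            vocabulary_id TEXT,
            concept_class_id TEXT
        );
        CREATE TABLE condition_occurrence (
            person_id INTEGER,
            condition_concept_id INTEGER,
            condition_type_concept_id INTEGER,
            condition_start_date TEXT
        );
        -- 900001: a former Death Type concept; 900002: an ordinary condition type
        INSERT INTO concept VALUES
            (900001, 'Condition Type', 'Death Type'),
            (900002, 'Condition Type', 'Condition Type');
        INSERT INTO condition_occurrence VALUES
            (1, 800001, 900001, '2018-09-01'),
            (1, 800002, 900001, '2018-09-01'),
            (1, 800003, 900002, '2017-01-15'),
            (2, 800001, 900002, '2016-05-05');
    """)
    return conn


if __name__ == "__main__":
    print(causes_of_death(build_cause_demo()))
```

In this toy data person 1 has two causes of death and person 2 has none, even though person 2 carries the same condition concept with an ordinary condition type.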

aostropolets commented 6 years ago

How would you store the death with no cause then?

clairblacketer commented 6 years ago

I think the idea is that for each record of death, there would be a record in CONDITION_OCCURRENCE for the death itself and then a record for the cause of death. @cgreich is that right?

vojtechhuser commented 6 years ago

https://github.com/OHDSI/CommonDataModel/wiki/DEATH

types http://athena.ohdsi.org/search-terms/terms?conceptClass=Death+Type&page=1&pageSize=15&query=

pavgra commented 6 years ago

@clairblacketer, why would you need a separate record for the fact of death? can a death happen w/o a cause?

cgreich commented 6 years ago

Yes. No cause means you only have a record with SNOMED 4306655 Death.

pavgra commented 6 years ago

@cgreich, can a death happen w/o a cause? Why would you need to put records for both the fact of death and the cause of death, when a record for the cause of death already indicates the fact of death and does not require a duplicate record?

cukarthik commented 6 years ago

I like how the proposal better incorporates cause of death, but I'm not sure I'm a fan of putting the death date in the condition occurrence as opposed to the person table. To me, this is the same as saying why don't we move the date of birth into the condition table and use the condition concept id 4083587 (DOB). Sorry I missed the call today, so I might be rehashing this 😄

cgreich commented 6 years ago

@pavgra:

A cause of death is a Condition. Usually one so bad that as a consequence the patient dies. But from a philosophical perspective such a cause of death is no different than any other condition. The death itself is also a condition, even though you are unlikely to recover from it (unless you are Jesus). So, putting both into the Condition table is legitimate. The problem now is how to connect the death to its cause. We could have used FACT_RELATIONSHIP (ugly), or the Condition Type (formerly Death Type). That's not 100% clean from a modelling perspective, but practicable.

Why do we need the death record at all? Because we don't always have a cause. Actually, most of the time we don't.

cgreich commented 6 years ago

@cukarthik:

You are rehashing. We debated that up and down.

You are correct, birth date and death date are symmetrical features of each person. But in our observational data they are very different. Here is why: the birth date, like gender and race, is modeled as a fixed entity. These don't change or happen over time. Death, on the other hand, is like a record in any other event table: it happens at some point in time. You could argue that:

Race and Gender can change, or might have to be corrected after a mistake. That's true, but in our model they are assumed static.
Birth day is an event with a date. That is true, but not in our model. We don't have records in the PERSON table before they were born. In our model, there are no people without a birth date. But there are plenty of people who are not dead, and then they die. So, the former is de-facto static, the latter dynamic.

aostropolets commented 6 years ago

For each death with no cause we need to put something like 441413 Death of unknown cause then. It's an observation though, so need to change the domain as well

djb2188 commented 6 years ago

@christian or @anna

-We need to identify death from condition and observation ?

searching across 2 domains seems inefficient but proper. Proper often contends with functionality.
No death date in person?

Best,

David J Blatt


pavgra commented 6 years ago

@cgreich, Anna brought up the point which I tried to formulate: there cannot be a death w/o a cause. At the least it's an unknown cause, but there is one. So a separate record for death is not required: it is redundant and ambiguous.

aostropolets commented 6 years ago

David, no death in Person table. Christian wrote a whole book above explaining why :) I wouldn't say that we should make life harder for people and make them search across 2 tables. We'd rather change the domain and everything will be in condition_occurrence

djb2188 commented 6 years ago

Ah ok. The domain shift makes sense. I must read his book.

Best,

David J Blatt


burrowse commented 6 years ago

I'm sorry I missed the call yesterday as I had a conflicting meeting.

In PEDSnet, we adapted the death to accommodate multiple causes of death. We added a "death_cause_id" as the primary key for the table. At this time, it was the least disruptive to capture this data for the network (adding a column versus removing a table).

We also use the "concept_class_id = 'Death Type'" convention for the death_type_concept_id and our causes are SNOMED Codes.

It may be worth mentioning, that we also added a "death_impute_concept_id" to the table as we have cases where the date of death is estimated for various reasons.

cgreich commented 6 years ago

@pavgra:

You are getting philosophical. You are saying there is no death without a cause? Meaning, life is the default, and no life is an exception, that only happens if there is a reason? I would claim the opposite, actually. :)

Three reasons why we needn't enforce the cause of death:

We rarely get it when there is one. We are already euphoric when we get the death itself. Most databases are bad at providing this information.
There are people who die of what's called "old age". They just die in their sleep without any acute health issue. The heart just stops beating. No matter how much you want to jump up and down, you won't get a cause of death.
In the OMOP CDM, like in all other observational databases with the exception of surveys, we don't capture negative or explicitly unknown facts. So, just a record of death, without any other condition records with a Condition Type where concept_class_id='Death Type' is sufficient to capture that fact.
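As an illustration of the third point, the sketch below finds persons who have a death record but no cause-of-death record. It assumes the layout proposed at this stage of the thread: the death itself is a condition_occurrence row with SNOMED 4306655 Death, and causes are other condition rows whose type concept has concept_class_id = 'Death Type'. The type concept ids (900001, 900002) and the cause concept id (800001) are made-up placeholders.

```python
import sqlite3

DEATH_CONCEPT_ID = 4306655  # SNOMED 'Death', as cited earlier in the thread


def deaths_without_cause(conn):
    """Return person_ids that have a death record but no cause-of-death record."""
    sql = """
        SELECT DISTINCT d.person_id
        FROM condition_occurrence d
        WHERE d.condition_concept_id = ?
          AND NOT EXISTS (
              SELECT 1
              FROM condition_occurrence co
              JOIN concept c ON c.concept_id = co.condition_type_concept_id
              WHERE co.person_id = d.person_id
                AND co.condition_concept_id <> ?
                AND c.concept_class_id = 'Death Type'
          )
        ORDER BY d.person_id
    """
    rows = conn.execute(sql, (DEATH_CONCEPT_ID, DEATH_CONCEPT_ID))
    return [row[0] for row in rows]


def build_no_cause_demo():
    """In-memory toy data; type ids 900001/900002 and cause id 800001 are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_class_id TEXT);
        CREATE TABLE condition_occurrence (
            person_id INTEGER,
            condition_concept_id INTEGER,
            condition_type_concept_id INTEGER
        );
        INSERT INTO concept VALUES (900001, 'Death Type'), (900002, 'Condition Type');
        INSERT INTO condition_occurrence VALUES
            (1, 4306655, 900001),  -- person 1: death record ...
            (1, 800001, 900001),   -- ... plus a cause of death
            (2, 4306655, 900001),  -- person 2: death record only
            (3, 800001, 900002);   -- person 3: alive, ordinary condition
    """)
    return conn


if __name__ == "__main__":
    print(deaths_without_cause(build_no_cause_demo()))
```

No "unknown cause" row is needed for person 2: the absence of a cause record is the information.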

cgreich commented 6 years ago

@burrowse:

Funny you mention that. The whole debate started with the proposal of adding another cause_of_death table, because in contrast to death itself there can be more than one. And then folks came up with the brilliant (in my mind) idea that we don't need any of these tables:

Death is just another Condition event (or, some might argue, demographic fact), and cause of death is definitely a Condition event. So, why have this table at all?

There are only two use cases you need the death table for: (i) mortality of an intervention and (ii) cause of death characterization. Either one can be done in the Condition table no problem. Or do you have another use case?

vojtechhuser commented 6 years ago

As you formalize the final guidance, please design Achilles Heel rules (SQL queries) and state what you expect to see. One rule is 'enforce a single date of death' in the CDM. The current rule, if we expect only one row per person, is: select person_id, count(*) as rows from death, and this count must be <= 1. A similar rule applies to: select person_id, count(*) from (select distinct person_id, death_date from death) a ....
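A rough sketch of how those two checks could look, run against the current DEATH table with Python's sqlite3. This is not an actual Achilles Heel rule: the GROUP BY and HAVING clauses are my completion of the queries sketched above, and the demo cause concept ids are made-up placeholders.

```python
import sqlite3


def persons_with_multiple_death_rows(conn):
    """Rule 1: flag persons with more than one row in DEATH."""
    sql = """
        SELECT person_id, COUNT(*) AS n_rows
        FROM death
        GROUP BY person_id
        HAVING COUNT(*) > 1
        ORDER BY person_id
    """
    return conn.execute(sql).fetchall()


def persons_with_multiple_death_dates(conn):
    """Rule 2: flag persons with more than one distinct death_date."""
    sql = """
        SELECT person_id, COUNT(*) AS n_dates
        FROM (SELECT DISTINCT person_id, death_date FROM death) a
        GROUP BY person_id
        HAVING COUNT(*) > 1
        ORDER BY person_id
    """
    return conn.execute(sql).fetchall()


def build_death_demo():
    """In-memory toy DEATH table; cause concept ids are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE death (person_id INTEGER, death_date TEXT, cause_concept_id INTEGER);
        INSERT INTO death VALUES
            (1, '2018-09-01', 800001),  -- two causes, same date
            (1, '2018-09-01', 800002),
            (2, '2018-08-01', 800001),  -- conflicting dates
            (2, '2018-08-15', 800001),
            (3, '2018-07-04', NULL);    -- single row, no cause
    """)
    return conn


if __name__ == "__main__":
    conn = build_death_demo()
    print(persons_with_multiple_death_rows(conn))
    print(persons_with_multiple_death_dates(conn))
```

The second rule is the one that survives if multiple causes per person are allowed: person 1 has two rows but a single date and passes, while person 2 is flagged.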

pavgra commented 6 years ago

@cgreich , no philosophy, just trying to turn things to a cleaner way. So why a dedicated record for death in the conditions table is going to be better than a solution proposed by @aostropolets?

For each death with no cause we need to put something like 441413 Death of unknown cause

If you use the approach, there remains the one and only way to describe deaths and no need to repeat any data.

pbr6cornell commented 6 years ago

I have decided to vote to support this proposal. I admit it wasn't something I was originally warm to, but as I mulled it over, I think I have convinced myself that this has strong conceptual and pragmatic arguments. I thought I would share my logic for those who still may also be on the fence:

So, the CDM has a few basic principles in its organization. One is that we only create domain tables if 1) the domain has a justified analytic use case, and 2) the domain has domain-specific attributes that are needed to support the analytic use case.

When we initially designed DEATH it was because 1) we definitely want to do mortality studies, and 2) cause-of-death was a recognized domain-specific attribute that could support work.

This proposal really starts with the experience of some data partners that there is not an explicit one-to-one relationship between death and cause-of-death, and we don't want 'information loss' by not preserving whatever cause-of-death source records exist. Given that we must move cause-of-death out of the DEATH table to somewhere else, that leaves the DEATH table without any domain-specific attributes, so it now violates our basic principle.

So, that leaves us with 2 decisions to make: where to put the 'cause-of-death' information, and where to put 'death' information.

I like the proposal of 'cause-of-death' being a CONDITION_OCCURRENCE. I think we do need to add a CONDITION_TYPE_CONCEPT_ID for 'cause of death', in case that's the explicit provenance. But inherently, causes of death are diseases. Recall that while ICD is now 'International Classification of Diseases', it started as 'International List of Causes of Death'. The fields currently in CONDITION_OCCURRENCE should nicely capture the information for 'cause-of-death', and the standard vocabulary concepts for the condition domain will give us complete coverage of what we need.

I also like the proposal for 'death' being a CONDITION_OCCURRENCE. This allows us to maintain provenance (CONDITION_TYPE_CONCEPT_ID can distinguish if it's a death record from hospital discharge vs. death registry record vs. patient status indicator). It also allows for using the same condition vocabulary concepts, which offer the possibility of providing greater granularity to the death, if it's available. One thing that I would argue we'd need to do as a community is make sure that we have the appropriate domains assigned to all relevant 'death' concepts in the vocabulary (e.g. currently, the SNOMED concept of 'death' has the domain of 'observation'). It may also be desirable to review the CONCEPT_ANCESTOR records around death, so that there is appropriate hierarchical coverage between the top parent of 'Death' and all the associated more specific flavors of death, but I don't see these relationships being complete at this point. Putting 'death' records in CONDITION_OCCURRENCE relaxes the criteria that there be only one death date, but provides extra flexibility in case an analyst wants to select death based on provenance or some other logic.

I initially thought about putting 'death date' in the PERSON table, but I think there are several good arguments against this counter-proposal: 1) this would mean we'd 'lose' the provenance of where the death record came from, 2) this would not provide the flexibility for multiple records, 3) revising a dataset with new information would require UPDATE TO PERSON rather than INSERT INTO CONDITION, which is slightly kludgier, and 4) finding death would be based on SELECT FROM PERSON WHERE DEATH_DATE IS NOT NULL instead of SELECT FROM CONDITION_OCCURRENCE WHERE CONDITION_CONCEPT_ID IN <concepts for 'death'> ...... (basically we'd be looking for the absence of death date, instead of the presence of death records).

If this proposal is ratified, I'd propose we create a conceptset for 'death' and cohort definition for 'persons with death' that can be readily re-used across the community of CDM v6-compliant datasets. That'll go a long way to operationalizing this proposal into a workable solution for the analytic use cases of interest.

baileych commented 6 years ago

I can support the logic behind the proposal pretty readily. There are a couple places where I'm uncertain, and maybe just need clarification:

In practice, it's important to just represent the concept "died". Not with known cause, not with unknown cause, not with multiple causes - all we know is "died". That's just how the data come to us; the NULL problem is there. Structurally, it could be that this unadorned death concept is one of the possible values for a "cause of death" condition, of course.
Following @cgreich down the road of (philosophical) ontology, death feels to me more like a state that is observed than a pathologic process. Diseases (or injuries or errors or whatever) cause death; it's not a disease. So it seems more natural to locate it in observation than condition_occurrence. But I'm persuasible.
One of the other attractions of observation is that we have an analytic need (cough PCORnet cough) for representing imputation/confidence in death, because it's often coming from different sources than the rest of diagnosis data. In observation, there's qualifier_concept_id handy for this kind of stuff. In condition_occurrence, we could of course have a set of condition_concept_ids or condition_type_concept_ids such as "Death, observed", "Death, vital statistics", "Death, notification", "Death, imputed"; we'd just need to search for one of the set. That's got existing analogues (primary or secondary diagnosis, etc.), but feels a little more kludgey than klugey. 😄

hripcsa commented 6 years ago

It seems to me that we started with the desire for multiple records in the death table with the undesirable consequence that the dates could conflict. And we ended up with multiple records in the condition table with the undesirable consequence that the dates could conflict. So we just got rid of a table with no other progress. When we have conflicting information about the birth date, we pick one and put it in the person table, and have the option to put the conflicting information into the observation table. That is, we usually force you to make it easier for the researcher, and come up with a work around to avoid losing information.

Semantically, death is not a condition. E.g., what if the death condition is given an end date? Were they resuscitated, or is it an error, or do we ignore it. And what goes into the condition_era table for death? If anything, "life" would be the condition, with death as the end date, but that is not very usable.

If cause of death is always a condition, then, yes, I would put it in the condition table with a death-cause type. (This makes sense if you ask the question, did the patient ever have such-and-such a condition, you would go to the condition table and the answer would be yes even if it killed them.) I would pick a death date and put it in the death table or in the person table. And if there is conflicting death information, like different dates, I would put that in the observation table. Then you can see the death table (or person table) as a derived table like condition_era. But you don't force every researcher to learn our death hierarchy and make value judgments, and infer death from the observation or condition table.

So the proposal becomes keep the death table, pull cause of death out of the death table into the condition table, and set up the vocabulary so that conflicting death dates can go into the observation table.

cgreich commented 6 years ago

Wonderful. I call that "After Hours Stock Trading". Which means, the debate doesn't happen on the floor of the house (CDM WG regular meetings), but outside.

I think we all agree that cause of death is just another Condition and close the discussion.

Now the death itself: I think we also all agree that we don't need a separate death table to just hold the death date. Now the question is where should it go.

From a philosophical perspective, you could claim that death is not a Condition, but a demographic fact, symmetrical to birth. Except, as discussed above, in our data birth is static (we don't have Persons that have yet to be born) and death is dynamic. You could also claim it is a Condition, the ultimate one. The fact that it has no end date (unless you want to apply Judgement Day when we all rise from the dead and meet our creator) it shares with all other acute or rapidly fatal Conditions. Think cardiac arrest or asphyxiation.

I don't feel that strongly either way. Except if we put it into PERSON we have to make a change, and if we put it into Condition all we have to do is to add a convention.

So, the proposal would be to kill the DEATH table, and merge the information into the Condition table, and do a good job in the documentation.

gklebanov commented 6 years ago

to kill the DEATH table

hmmm...

pbr6cornell commented 6 years ago

i feel somewhat strongly that death shouldn't be in PERSON, for the reasons I outlined before: losing provenance, inconvenient updating, inconvenient to find all deaths (need to search for date is not null).

george proposed that death may be more appropriate as OBSERVATION rather than CONDITION, and Charlie made an argument for OBSERVATION also. these both seem roughly equivalent to me, and since the vocabulary is supposed to drive the domain choice of any concept, it seems 'death' concepts simply need to be reconciled to go to the same domain. currently it looks to me that most 'death' concepts are listed as observations, so purely on that basis, i would be happy to keep that designation and go with george and charlie's counterproposal. cause of death records would all still be conditions. person would not change.


hripcsa commented 6 years ago

If we are going to put death in the condition or observation table, then we should prepare for it. New users will be looking for where death is stored, so we need to be able to point them to an explanation. We will see slightly different death algorithms in different studies: use the first or last date when there is more than one death date, which sources to trust the most, what to do when other data follow the death. Death I guess becomes a phenotype that needs to be programmed and evaluated. Does Atlas treat it like any other condition or observation, where you supply the concept set? Or does it supply a preferred phenotype? (I also wonder, if it is important for the user to pick his or her own algorithm to decide death, why not the same for birth and race and ethnicity, which also come with conflicting information and also change over time as we gather new information. I guess for those, we can store the conflicting information in the observation table.) Therefore, if death is stored in condition or observation, perhaps we should enforce a single preferred death row chosen by the implementers, with conflicting information stored with codes that indicate that the information is considered secondary. If there is only a secondary row and no preferred row, then the user knows that the implementers don't actually believe it; they are just revealing all they know but it is coming from an unreliable source (and perhaps other clinical data followed the supposed death). The code could be called, "the report of my death was an exaggeration," to quote Twain.

gklebanov commented 6 years ago

george proposed that death may be more appropriate as OBSERVATION rather than CONDITION, and Charlie made an argument for OBSERVATION also.

it makes sense and I agree with that.

Therefore, if death is stored in condition or observation, perhaps we should enforce a single preferred death row chosen by the implementers, with conflicting information stored with codes that indicate that the information is considered secondary.

these should become one of new THEMIS business rules and added into ACHILLES Heel and into our future "OMOP CDM Validator"

If we are going to put death in the condition or observation table, then we should prepare for it.

and I think George is bringing up another great point - either of the changes proposed above will have an impact on CDM and thus queries and will render previous versions incompatible (removing DEATH table, changing business logic...). Maybe I am stating the obvious, but incompatible changes like that should be released as a part of the major version release e.g. CDM 6.0. We had already said that ATLAS 3.0 will target CDM 6.0 - this will give us a chance to implement these in ATLAS 3.0 (and other tools and methods) right from the start while leaving CDM 5.x and ATLAS 2.x users unaffected. Actually, we should be careful with ensuring backward compatibility across not just OMOP CDM but all of our tools, but this is for a separate post...

djb2188 commented 6 years ago

There is the right way and then the best way which seems to be the stage of decision making that we are at.

To thicken the plot I don't see why once we have a primary death we can't store that in the person table. Atlas and Achilles can always use observations but primary death does become static no? We can make a case for that. I am sure we can also make case that it would only complicate documentation and potentially result in invalid implementations.

While this is a large scale analysis stack we do work to bring others on board and sometimes making a seemingly simple concept like death more complex can cause resistance in the on-boarding process. Food for thought. Observations will clearly be where we put it :)

cgreich commented 6 years ago

Friends:

I am lost. What are we still debating, actually? If I understand correctly, we agreed on:

No more DEATH table
Cause of death is a Condition, and the Death Types become Condition Types

The only remaining discussion is about where to put the death date, and what to do if there are many.

Many death dates: We already decided in THEMIS that it is the job of the ETL to figure out the single best death date, and that's the one. More than one death date makes no sense, and kicking the can down to the analyst because we cannot decide during ETL is against OMOP CDM principles.
Storage of death date:
- PERSON: Don't want that because it causes us to make a change to the CDM.
- OBSERVATION or CONDITION: I don't really care, but if you ask me death is a Condition. Observation? Really? Something we casually observe while talking to the patient? Of all things we collect in the CONDITION table it is the ultimate thing that can go wrong with the human organism. The one we all will get. And yes, it is acute (and also very chronic, depending whether you see it as a transformation or a permanent state).

After death the CDM will allow another 60 days of data to come in. That is also a THEMIS decision.

djb2188 commented 6 years ago

At the risk of infuriating you Christian :) - Dropping the death table is a change to the CDM I believe. Condition or observation - let the appropriate concept type drive the decision for consistency - seems logical and palatable and easy to explain. It seems unnatural to not have the option of adding death date to what is probably the person dimension? It's a fact and an attribute and how you model this depends on how you use it. Again, you are changing the CDM so a change should not be the reason. I am done pestering you.

Happy New Years to the community!

clairblacketer commented 6 years ago

Ok to sum this up as I have been following along, @cgreich stated:

We are agreed on:

No more DEATH table
Cause of death is a Condition, and the Death Types become Condition Types

As @gklebanov mentioned this is a huge change that breaks some backwards compatibility, which is why we are pushing it now so we can get it into CDM v6.0. Others around the community have mentioned their need to store multiple causes of death, which is where this idea originated.

To @hripcsa's point, we do need to make the documentation clear on how this should be implemented moving forward. There is a THEMIS rule currently that allows for more than one row in the DEATH table to capture multiple causes of death but this proposal would negate that. The additional THEMIS rule about multiple deaths on different days states

Only one death date per individual can be used. If a patient has clinical activity (e.g. prescriptions filled, labs performed, etc) more than 60+ days after death you may want to drop the death record as it may have been falsely reported. If multiple records of death exist on multiple days you may select the death that you deem most reliable (e.g. death at discharge) or select the latest death date.
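For illustration, the rule quoted above could be sketched in plain Python roughly as follows. This is one possible reading, not an official THEMIS implementation: it picks the latest candidate date (one of the two options the rule allows), and it treats "60+ days" as strictly more than 60 days after the chosen date, which is my assumption about the boundary. The function and parameter names are hypothetical.

```python
from datetime import date, timedelta


def choose_death_date(candidate_dates, activity_dates, grace_days=60):
    """Pick a single death date for one person, or None if the death looks falsely reported.

    candidate_dates: death dates found in the source data for this person.
    activity_dates:  dates of clinical activity (prescriptions, labs, ...).
    grace_days:      how long after death activity is still tolerated (assumed 60).
    """
    candidates = list(candidate_dates)
    if not candidates:
        return None  # no death recorded at all

    # Option allowed by the rule: take the latest reported death date.
    chosen = max(candidates)

    # Drop the death if any activity falls beyond the grace window.
    cutoff = chosen + timedelta(days=grace_days)
    if any(activity > cutoff for activity in activity_dates):
        return None
    return chosen


if __name__ == "__main__":
    deaths = [date(2018, 1, 1), date(2018, 1, 5)]
    print(choose_death_date(deaths, [date(2018, 2, 1)]))  # activity inside the window
    print(choose_death_date(deaths, [date(2018, 6, 1)]))  # activity well after death
```

A site that trusts one source more than another (e.g. death at discharge) would replace the max() step with its own ranking.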

Since we are already changing the concepts in the DEATH_TYPE vocabulary to fall under CONDITION_TYPE, I say we keep things (relatively) easy and use the SNOMED concept for death as it is currently - 4306655. This would put death dates in the OBSERVATION table and allow for multiple dates per person. Each study investigating death would then need to determine how they would choose a death date, which would negate part of the above THEMIS rule, though we can keep the convention to throw out death dates if patients have data 60+ days after death.

Finally, to make it somewhat easier on new adopters of the CDM, I can keep the wiki page on DEATH active but detail what we have discussed here and what the new conventions are moving forward.

alistairewj commented 6 years ago

It seems like a good opportunity to change the PERSON table to include the date of death - the most accurate one if more than one are available. The CDM is changing anyway, and awkwardly fumbling around in an OBSERVATION table for the primary outcome of most health research seems silly.

clairblacketer commented 6 years ago

Removal of DEATH Table

Initial Requester: @pavgra
Discussion: link to forum post.

Proposal

Relevant tables:

DEATH
PERSON

After much discussion on the DEATH table, it seems like we are at a consensus that the DEATH table should be removed and causes of death should be stored in the CONDITION_OCCURRENCE table. We have been somewhat divided on where to store the date of death. Both the PERSON table and some event table (either CONDITION_OCCURRENCE or OBSERVATION) have been suggested. As a compromise, here is a possible solution:

DEATH table is removed.
Any cause of death should be stored in the CONDITION_OCCURRENCE table, using the DEATH_TYPE vocabulary, which will now be classified as condition types.
All observations relating to death should be stored in the OBSERVATION table, including the concept 4306655. This would also mean the CDM can store multiple dates of death in the OBSERVATION table.
A singular death date should be chosen by the ETL and stored in the PERSON table.

Field	Required	Type	Description
death_datetime	No	datetime	The date and time the person was deceased. If the precise date including day or month is not known or not allowed, December is used as the default month, the last day of the month the default day, and midnight as the default time.
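
The defaulting rule in the description above could be applied in an ETL roughly as follows. This is a minimal sketch; the helper name and its signature are hypothetical and not part of the CDM specification.

```python
import calendar
from datetime import datetime
from typing import Optional


def impute_death_datetime(
    year: int, month: Optional[int] = None, day: Optional[int] = None
) -> datetime:
    """Fill in missing date parts using the defaults from the field description.

    Hypothetical ETL helper: a missing month defaults to December, a missing
    day defaults to the last day of the month, and the time is midnight.
    """
    if month is None:
        month = 12
    if day is None:
        # monthrange returns (weekday of the first day, number of days in month)
        day = calendar.monthrange(year, month)[1]
    return datetime(year, month, day)  # time defaults to 00:00:00
```

For example, a source record that only gives the year 2018 would be stored as 2018-12-31 00:00:00, and one that gives February 2016 would be stored as 2016-02-29 00:00:00.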

Conventions

The official death_datetime in the PERSON table will only allow one date to be chosen per person, so this complies with the THEMIS rule about multiple death dates.
This also allows multiple causes of death to be recorded in the CONDITION_OCCURRENCE table, so this complies with the THEMIS rule on multiple deaths on the same day as well.
The death_datetime in the PERSON table should not be used as the way to find all deaths:
select * from PERSON where death_datetime is not null should not be the practice.
Rather, deaths should be found through the OBSERVATION table, and the PERSON table is only used to determine which death date should be used in analysis.
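
A small sketch of the lookup pattern described in the conventions above, run against an in-memory SQLite database. The tables are heavily reduced stand-ins for the real CDM tables, and the rows are made up purely for illustration; only the concept ID 4306655 comes from this thread.

```python
import sqlite3

DEATH_CONCEPT_ID = 4306655  # SNOMED concept for death cited in this thread

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Reduced, illustrative versions of the CDM tables (most columns omitted).
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, death_datetime TEXT);
    CREATE TABLE observation (
        person_id INTEGER,
        observation_concept_id INTEGER,
        observation_date TEXT
    );
    -- Made-up rows: person 1 has two reported death dates, person 2 is alive.
    INSERT INTO person VALUES (1, '2018-03-01 00:00:00'), (2, NULL);
    INSERT INTO observation VALUES
        (1, 4306655, '2018-01-01'),
        (1, 4306655, '2018-03-01');
    """
)

# Find deaths through OBSERVATION, then read the chosen date from PERSON.
rows = conn.execute(
    """
    SELECT DISTINCT o.person_id, p.death_datetime
    FROM observation AS o
    JOIN person AS p ON p.person_id = o.person_id
    WHERE o.observation_concept_id = ?
    ORDER BY o.person_id
    """,
    (DEATH_CONCEPT_ID,),
).fetchall()
print(rows)
```

Person 1 appears once in the result even though two death observations exist, because the single date used for analysis comes from PERSON rather than from the individual OBSERVATION rows.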

pbr6cornell commented 6 years ago

Thank you @clairblacketer for offering an explicit proposal to consider that provides a nice compromise from amongst the various perspectives that we've heard from across the community.

We can maintain provenance of 'death observations' in the OBSERVATION table (which could allow for storage of multiple records that may exist) while designating the one death date in the PERSON table, so it seems we satisfy everyone's concerns.

I support this proposal. I do recognize that we'll want to have some recommendations for analysts for the most effective way to create a 'cohort of persons who are dead', since this proposal will make for a couple different alternatives. But that seems like a problem that can be readily solved with conventions that the community can evolve.

cukarthik commented 6 years ago

I agree. Thank you @clairblacketer. This is a nice compromise and I think makes the data model approachable for new users of OMOP.

djb2188 commented 6 years ago

Thank you to all for listening. This works well.

Best,

David J Blatt

juh7007 commented 6 years ago

210 DEATH table removed and cause of death now stored in CONDITION_OCCURRENCE

Hi, @clairblacketer...

I thought the death date would be added to the PERSON table as below... But I don't see this change made for CDM V6. Is this change coming soon? Thanks!

*** A singular death date should be chosen by the ETL and stored in the PERSON table.

Field	Required	Type	Description
death_datetime	No	datetime	The date and time the person was deceased. If the precise date including day or month is not known or not allowed, December is used as the default month, the last day of the month the default day, and midnight as the default time.

clairblacketer commented 6 years ago

Hi @juh7007 I see that it was somehow missed in BigQuery but present in the other dialects. This is fixed now.

Thanks!

juh7007 commented 6 years ago

Thanks, @clairblacketer !!!

OHDSI / CommonDataModel

Store death causes in condition_occurrence table #210
