In the context of having too many visits (more than 1 per day on average), we have been merging visits divided by monthly bills. This ticket addresses some 1-to-many mappings that may cause too many visits, but we should re-evaluate the heuristic.
I ran into discrepancies between row_counts and counts of distinct natural keys in the visit_occurrence table, and I think this goes back to issues in the OMOP concept_ancestor table. So that needs entered against OMOP, and we need to consider solutions or tests until that's fixed.
COUNTS
When creating the visit_occurrence table by rolling up lt_visit_detail rows, I count resulting rows and compare that to the number of unique combinations of (person_id, care_site_id, provider_id, visit_start_date, visit_end_date, visit_concept_id). They are different by close to a factor of 2 here. The latter is what I consider the number of entities in that table. While we have a surrogate primary key, it contains more fields and can hide issues.
CAUSE
querying for count(distinct X) where X is a list of fields that should be unique for each entity, I found that some visits showed up as both in-patient and out-patient. Digging further, I looked that the join with the cms_medicaid_place_of_visit_xwalk had multiple entries for AO, Hospital. One to in-patient and one to out-patient. Still deeper, it looks like it comes from data in the OMOP concept_ancestor table.
Have a look:
verify that an ancestor concept is more-general and a descendant more specific. Like Fruit is general and apple is more specific. Fruit would be the ancestor and Apple the descendant.
Do an example, like Heart Disease (321588), Chronic Heart Disease (4134596) and Chronic Myocarditis (764141). These are listed in descending or more specific order. If you look them up in athena, you'll see the order, and there, the arrows point to more-specific concepts.
Now look at Hospital (38004515), In Patient (9201) and Out Patient (9202). Go slow if you're dyslexic. BTW, concepts 8717 and 8756 are steps between Hospital and the other two.
one is to use the first() function and a group-by on the natural primary key listed above to arbitrarily pick on of them. Arguably the concepts are wrong, but at least you're just picking one and not throwing counts off. (done)
next is to get this fixed in OMOP so you don't need the previous "fix"
In the context of having too many visits (more than 1 per day on average), we have been merging visits divided by monthly bills. This ticket addresses some 1-to-many mappings that may cause too many visits, but we should re-evaluate the heuristic.
I ran into discrepancies between row_counts and counts of distinct natural keys in the visit_occurrence table, and I think this goes back to issues in the OMOP concept_ancestor table. So that needs entered against OMOP, and we need to consider solutions or tests until that's fixed.
COUNTS
When creating the visit_occurrence table by rolling up lt_visit_detail rows, I count resulting rows and compare that to the number of unique combinations of (person_id, care_site_id, provider_id, visit_start_date, visit_end_date, visit_concept_id). They are different by close to a factor of 2 here. The latter is what I consider the number of entities in that table. While we have a surrogate primary key, it contains more fields and can hide issues.
CAUSE
querying for count(distinct X) where X is a list of fields that should be unique for each entity, I found that some visits showed up as both in-patient and out-patient. Digging further, I looked that the join with the cms_medicaid_place_of_visit_xwalk had multiple entries for AO, Hospital. One to in-patient and one to out-patient. Still deeper, it looks like it comes from data in the OMOP concept_ancestor table.
Have a look:
FIX