OHDSI / CohortConstructor

https://ohdsi.github.io/CohortConstructor/
Apache License 2.0

Tell user how many records are dropped for being out of observation in conceptCohort #188

Open edward-burn opened 4 weeks ago

edward-burn commented 4 weeks ago

So that the cohort table satisfies the OMOP CDM cohort requirements, we drop records that start outside of an observation period. It would be nice, I think, if we printed a CLI message telling the user how many records were dropped for this reason, ideally with both the count and the percentage of total records (the latter is likely a good indicator of the size of the impact).

We could also include this in the attrition attribute.
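
A minimal sketch of what such a message could look like, assuming the total and dropped record counts have already been computed from the concept records. The helper name `reportDroppedRecords()` and its arguments are hypothetical and not part of CohortConstructor; `cli::cli_inform()` is used when cli is installed, with a fallback to base `message()`.

```r
# Hypothetical helper (not part of CohortConstructor): report how many
# records were dropped for starting outside of an observation period.
reportDroppedRecords <- function(nTotal, nDropped) {
  stopifnot(nTotal >= 0, nDropped >= 0, nDropped <= nTotal)

  # Avoid dividing by zero when there are no records at all
  pct <- if (nTotal > 0) 100 * nDropped / nTotal else 0

  msg <- sprintf(
    "%d of %d records (%.1f%%) dropped because they start outside of an observation period",
    as.integer(nDropped), as.integer(nTotal), pct
  )

  if (requireNamespace("cli", quietly = TRUE)) {
    # "{msg}" is interpolated by cli from this function's environment
    cli::cli_inform(c("i" = "{msg}"))
  } else {
    message(msg)
  }

  # Return the text invisibly so callers (and tests) can inspect it
  invisible(msg)
}

# Illustrative counts only
msg <- reportDroppedRecords(nTotal = 200, nDropped = 25)
```

Reporting the percentage alongside the count follows the suggestion above: the count alone says little without knowing how many records there were to begin with.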

catalamarti commented 3 weeks ago

I would be hesitant to record it in the attrition, because you cannot create a cohort with records that are not in observation.

edward-burn commented 3 weeks ago

But for IncidencePrevalence it has been useful in the past to know more about how the initial cohort was created, to pick up potential ETL problems etc. on the data partner side. For me, this would be a nice way of prompting the data partner to realise the importance of having records within observation.

library(IncidencePrevalence)
cdm <- mockIncidencePrevalenceRef()
cdm <- generateDenominatorCohortSet(cdm, "denom")
#> Loading required namespace: testthat
#> ℹ Creating denominator cohorts
#> ✔ Cohorts created in 0 min and 2 sec
attrition(cdm$denom) |> 
  dplyr::glimpse() 
#> Rows: 8
#> Columns: 7
#> $ cohort_definition_id <int> 1, 1, 1, 1, 1, 1, 1, 1
#> $ number_records       <int> 1, 1, 1, 1, 1, 1, 1, 1
#> $ number_subjects      <int> 1, 1, 1, 1, 1, 1, 1, 1
#> $ reason_id            <int> 1, 2, 3, 4, 5, 6, 7, 10
#> $ reason               <chr> "Starting population", "Missing year of birth", "…
#> $ excluded_records     <int> NA, 0, 0, 0, 0, 0, 0, 0
#> $ excluded_subjects    <int> NA, 0, 0, 0, 0, 0, 0, 0

Created on 2024-06-03 with reprex v2.1.0
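
As a rough illustration of the attrition idea, the sketch below appends a row for records dropped as out of observation to an attrition table with the same columns as the output above. It uses a plain base R data frame rather than the real attrition attribute; the helper name `addAttritionReason()`, the reason text, and the counts are all hypothetical.

```r
# Hypothetical helper: append one attrition row to a plain data frame that
# has the same columns as the attrition output shown above.
addAttritionReason <- function(attrition, reason, nRecords, nSubjects) {
  last <- attrition[nrow(attrition), ]
  newRow <- data.frame(
    cohort_definition_id = last$cohort_definition_id,
    number_records       = as.integer(nRecords),
    number_subjects      = as.integer(nSubjects),
    reason_id            = last$reason_id + 1L,
    reason               = reason,
    # Exclusions are counted relative to the previous attrition step
    excluded_records     = last$number_records - as.integer(nRecords),
    excluded_subjects    = last$number_subjects - as.integer(nSubjects)
  )
  rbind(attrition, newRow)
}

# Illustrative starting row; the counts are made up
attrition <- data.frame(
  cohort_definition_id = 1L,
  number_records       = 200L,
  number_subjects      = 150L,
  reason_id            = 1L,
  reason               = "Starting population",
  excluded_records     = NA_integer_,
  excluded_subjects    = NA_integer_
)

updated <- addAttritionReason(
  attrition,
  reason    = "Record start in observation",
  nRecords  = 175,
  nSubjects = 140
)
```

Whether this row belongs in the attrition at all is the open question in this thread; the sketch only shows the shape such a row would take.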