ohdsi-studies / ScyllaEstimation

This OHDSI network study assesses the comparative effectiveness and safety among treatments administered during hospitalization and prior to intensive services. It also assesses the comparative effectiveness and safety among treatments administered after COVID-19 positive testing or diagnosis in the outpatient setting without prior hospitalization.

Error: Join columns must be present in data. #11

Open alabarga opened 3 years ago

alabarga commented 3 years ago

We are hitting this error when running the study. Any ideas? The generated errorReportR.txt is attached.

Running CohortMethod analyses
Error: Join columns must be present in data.
✖ Problem with `targetId`.
Backtrace:
     █
  1. ├─ScyllaEstimation::execute(...)
  2. │ └─ScyllaEstimation::runCohortMethod(...)
  3. │   └─`%>%`(...)
  4. ├─dplyr::inner_join(., analysisDescription, by = "analysisId")
  5. ├─dplyr::inner_join(...)
  6. ├─dplyr::inner_join(...)
  7. ├─dplyr::inner_join(...)
  8. └─dplyr:::inner_join.data.frame(...)
  9.   └─dplyr:::join_mutate(...)
 10.     └─dplyr:::join_cols(...)
 11.       └─dplyr:::standardise_join_by(by, x_names = x_names, y_names = y_names)
 12.         └─dplyr:::check_join_vars(by$x, x_names)
An error report has been created at  /data/ScyllaEstimation/errorReportR.txt

alabarga commented 3 years ago

The cohort table in the CDM looks like this:

cohort_definition_id subject_id cohort_start_date cohort_end_date
1009 4579522779730604891 2020-02-09 2020-02-09
1009 6731742195644174274 2020-02-08 2020-02-10
1009 3688281720704476065 2020-08-28 2020-08-28

schuemie commented 3 years ago

Perhaps the exposure cohorts are empty? Could you check your cohort_counts.csv file to see if any of the exposure cohorts (e.g. 'Hydroxychloroquine with Treatment administered on the date of admission of hospitalization and prior to intensive services and 365d prior observation') has a non-NA count?

Also, could you check whether any of these files exist in the same folder as cohort_counts.csv:

alabarga commented 3 years ago

Not all, but some have non-NA values:

cohort_id name cohort_entries cohort_subjects database_id
1001000011 Hydroxychloroquine with Treatment administered on the date of admission of hospitalization and prior to intensive services and 365d prior observation 313 313 hdm
1002000011 Hydroxychloroquine + Azithromycin with Treatment administered on the date of admission of hospitalization and prior to intensive services and 365d prior observation 18 18 hdm

However, no *.rds files are present.

schuemie commented 3 years ago

So the problem appears to be that most exposure cohorts are either empty or very small, leaving no comparisons with sufficient data.
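
A quick way to see how many exposure cohorts are empty or small is to summarize cohort_counts.csv directly. The sketch below is only an illustration in base R: `summarizeCohortCounts`, the `minSubjects` threshold of 100, and the `outputFolder` variable are hypothetical and not part of ScyllaEstimation. The column names are taken from the excerpt posted above.

```r
# Hypothetical helper (not part of ScyllaEstimation): label each cohort as
# "empty", "small", or "ok" based on its subject count.
# minSubjects is an arbitrary illustrative threshold, not a value used by the study.
summarizeCohortCounts <- function(cohortCounts, minSubjects = 100) {
  subjects <- cohortCounts$cohort_subjects
  cohortCounts$status <- ifelse(
    is.na(subjects) | subjects == 0, "empty",
    ifelse(subjects < minSubjects, "small", "ok")
  )
  cohortCounts[, c("cohort_id", "name", "cohort_subjects", "status")]
}

# Example using the two rows posted in this thread (names shortened here).
example <- data.frame(
  cohort_id = c(1001000011, 1002000011),
  name = c("Hydroxychloroquine ...", "Hydroxychloroquine + Azithromycin ..."),
  cohort_subjects = c(313, 18)
)
print(summarizeCohortCounts(example))

# On a real run, assuming outputFolder is the folder holding cohort_counts.csv:
# counts <- read.csv(file.path(outputFolder, "cohort_counts.csv"))
# print(table(summarizeCohortCounts(counts)$status))
# print(list.files(outputFolder, pattern = "\\.rds$"))  # any .rds files present?
```

If nearly every row comes back as "empty" or "small", that would be consistent with no target-comparator pair having enough data to analyze.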

This study package doesn't really allow you to find out why there are so few people meeting the cohort criteria. Did you run ScyllaCharacterization?