OHDSI / CohortGenerator

An R package for instantiating cohorts using data in the CDM.
https://ohdsi.github.io/CohortGenerator/

SQL cohorts as first class citizens - SQL templates for non-standard cohorts or large, bulk operations #133

Open azimov opened 3 months ago

azimov commented 3 months ago

Outline

The ideal cohort definition is a Circe-based cohort, and this package does a great job of standardizing these. However, relying on Circe alone is currently limiting in a number of situations:

  1. Generating cohorts in bulk is largely infeasible because of the inefficiency of generating essentially the same cohort over and over.
  2. We have cohorts that do not work well with the Circe approach (e.g. the pregnancy algorithm).

Naturally, this has led to other approaches or hacks in studies to generate these cohorts. For example, such workarounds are used in the comparator selection explorer and Reward packages. The same functionality could easily be used in other places, but that is currently difficult to support. There are also likely many studies kicking about that simply execute raw SQL and ignore this approach.

Proposed Solution

Allow "template" cohort definitions: SQL-based definitions of cohorts (e.g. all RxNorm ingredients, all SNOMED codes) that produce references, allowing them to be handled functionally the same as Circe cohorts.

To implement a cohort definition that uses templates, the user should define the references (which can come from the vocabulary or be predefined) as well as an SQL definition.
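
As a rough illustration of the idea, here is a minimal base R sketch of how a set of references plus one SQL template could be expanded into cohort entries. Everything in it is an assumption made for illustration: the names `templateReferences`, `templateSql`, `renderTemplate` and `buildTemplateCohorts`, the placeholder concept IDs, and the choice of `drug_era` as the source table are all hypothetical and are not part of CohortGenerator. The `@parameter` placeholders imitate the SqlRender convention, but the substitution is done with plain `gsub` so the example is self-contained.

```r
# Hypothetical references: one row per cohort the template should produce.
# In practice these might be queried from the vocabulary (e.g. all RxNorm ingredients).
templateReferences <- data.frame(
  cohortId = c(1001L, 1002L),
  cohortName = c("Exposure to ingredient A", "Exposure to ingredient B"),
  conceptId = c(111L, 222L), # placeholder IDs, not real vocabulary concepts
  stringsAsFactors = FALSE
)

# Hypothetical SQL template with @parameter placeholders (SqlRender style).
templateSql <- "
INSERT INTO @cohort_table (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date)
SELECT @cohort_id, person_id, drug_era_start_date, drug_era_end_date
FROM @cdm_schema.drug_era
WHERE drug_concept_id = @concept_id;"

# Minimal placeholder substitution. Longer names are replaced first so that
# one placeholder can never clobber another that shares its prefix.
renderTemplate <- function(sql, params) {
  paramNames <- names(params)
  paramNames <- paramNames[order(-nchar(paramNames))]
  for (name in paramNames) {
    sql <- gsub(paste0("@", name), as.character(params[[name]]), sql, fixed = TRUE)
  }
  sql
}

# Expand the template once per reference row. The returned columns
# (cohortId, cohortName, sql) are chosen to resemble a cohort definition table.
buildTemplateCohorts <- function(references, sql, cohortTable, cdmSchema) {
  references$sql <- vapply(seq_len(nrow(references)), function(i) {
    renderTemplate(sql, list(
      cohort_table = cohortTable,
      cdm_schema = cdmSchema,
      cohort_id = references$cohortId[i],
      concept_id = references$conceptId[i]
    ))
  }, character(1))
  references[, c("cohortId", "cohortName", "sql")]
}

templateCohorts <- buildTemplateCohorts(
  templateReferences, templateSql,
  cohortTable = "results.cohort", cdmSchema = "cdm"
)
```

Rendering one statement per reference is only the simplest reading of the proposal. For bulk use, a template could instead join against the whole reference table and populate all cohorts in a single statement, which is closer to the efficiency goal described in the outline.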