OHDSI / CohortDiagnostics

An R package for performing various cohort diagnostics.
https://ohdsi.github.io/CohortDiagnostics

Fit for use diagnostics - is a cohort definition implemented in a data source fit for use #480

Open gowthamrao opened 3 years ago

gowthamrao commented 3 years ago

In the presence of data source heterogeneity, can we empirically determine whether a data source is fit for use with a phenotype or cohort definition? Data source heterogeneity may be due to differences in the underlying populations that contribute to the data source, and to differences in data capture processes.

Data Source level determination - is a datasource fit for use for a study?

Researchers determine fitness for use for their study based on a judgment of whether a data source faithfully represents the underlying population being studied. The first step aiding the determination is a review of the source data documentation (e.g. documentation provided by the data partner) to answer questions like:

This is followed by understanding the source data capture processes that were used to record the clinical experience of the population: are they accurate and complete?

OHDSI tools allow further empirical determination: Achilles characterization and the Data Quality Dashboard present data source level characteristics. These tools provide empirical data that may be used in conjunction with data source documentation to make the determination.

Cohort level determination - are the cohort definition(s) being used in a study appropriate for the data source?

While data source level determination is a higher level decision based on whether an underlying population is fit for use for the study, cohort level determination is more granular: it helps determine whether a cohort definition faithfully extracts the right cohort from the population in the data source. A common example is a data source with different data capture processes (coding practices), where the definition may not yield the right set of persons in the cohort (because of orphaned codes).

Cohort Diagnostics is a new tool that provides diagnostics on cohorts instantiated on a data source. It enables both within and across data source comparison (i.e. a cohort instantiated on one data source may be compared to a cohort instantiated on another data source). Based on observations from the diagnostics, we may determine whether a cohort definition as instantiated on one data source is systematically different (and potentially different from expected) compared to other data sources.

Determination based on comparison to expected: Cohort Diagnostics allows us to check whether the attributes of a cohort as instantiated in a data source are comparable to what is expected. E.g. if we are building a cohort of persons who are pregnant, we have an a priori expectation that the age distribution falls within child bearing age. If we observe in the data source that the age distribution is not in the expected range, it may indicate that the cohort definition as applied to that data source is not fit for use.
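The comparison to expected described above could be sketched as a simple check on the age distribution. This is only an illustration in Python, not part of the CohortDiagnostics R package: the function names, the 12 to 55 age range, and the 5% tolerance are all assumptions chosen for the example, and a real study would set them from clinical judgment.

```python
from typing import Sequence


def fraction_outside_range(ages: Sequence[float], low: float, high: float) -> float:
    """Return the fraction of cohort members whose age is outside [low, high]."""
    if not ages:
        raise ValueError("ages must not be empty")
    outside = sum(1 for age in ages if age < low or age > high)
    return outside / len(ages)


def age_distribution_as_expected(
    ages: Sequence[float],
    low: float = 12,          # assumed lower bound of child bearing age
    high: float = 55,         # assumed upper bound of child bearing age
    tolerance: float = 0.05,  # assumed acceptable fraction outside the range
) -> bool:
    """True when at most `tolerance` of the cohort falls outside the expected range."""
    return fraction_outside_range(ages, low, high) <= tolerance
```

A cohort where most ages fall inside the expected range passes, while a cohort dominated by unexpected ages fails and would be a candidate for closer review of the definition on that data source.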

Determination based on comparison to other data sources: Similarly, it is possible to check whether one data source is similar to or different from a set of benchmark data sources.

However, these determinations are currently not systematic. The number of data points is overwhelming. We lack a rubric or decision rule that allows us to empirically accept or reject a data source for a study. One approach is to build an 'empirical metric' for fitness for use and base the decision on that metric. Approaches to computing such a metric include similarity metrics, distance metrics, or setting an expected benchmark based on a starter set of data sources and comparing new data sources to the benchmark.
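One possible shape for the benchmark approach is sketched below, as an illustration only. It assumes each cohort attribute is summarized as a single number per data source, and it uses a z-score against the starter set as the distance; the function names and the choice of the largest absolute z-score as the summary are assumptions for the example, not an agreed metric.

```python
from statistics import mean, stdev
from typing import Dict, List


def benchmark_z_scores(
    benchmark: Dict[str, List[float]],
    candidate: Dict[str, float],
) -> Dict[str, float]:
    """Z-score of each candidate attribute against the benchmark data sources.

    `benchmark` maps an attribute name to its values across the starter set
    (at least two data sources); `candidate` holds the new data source's values.
    """
    scores = {}
    for name, values in benchmark.items():
        difference = candidate[name] - mean(values)
        spread = stdev(values)  # sample standard deviation
        if spread == 0:
            # No variation in the benchmark: any deviation is infinitely far.
            scores[name] = 0.0 if difference == 0 else float("inf")
        else:
            scores[name] = difference / spread
    return scores


def fitness_distance(scores: Dict[str, float]) -> float:
    """Summarize the scores as the largest absolute deviation from the benchmark."""
    return max(abs(z) for z in scores.values())
```

A single summary number is easy to threshold, but it hides which attribute drove the deviation, so the per-attribute scores are probably worth keeping alongside it.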

gowthamrao commented 3 years ago

Examples of metrics that may suggest whether a cohort definition implemented in an 11th data source is similar to the previous 10 data sources

gowthamrao commented 3 years ago

Tagging @clairblacketer @jreps as this idea came from a recent discussion

gowthamrao commented 3 years ago

Candidate Cohort level attributes:

gowthamrao commented 3 years ago

Levels of diagnostics:

  1. Fitness of data source - which can be a collection of studies
  2. Fitness of study - which can be collection of cohort definitions
  3. Fitness of cohort definitions - which is 1 cohort definition

gowthamrao commented 2 years ago

the cohort level determination is a more granular determination that helps determine if a cohort definition faithfully extracts the right cohort from the population in the data source. A common example is that because a data source has different data capture processes (coding practice) - it may not be yielding the right set of persons in the cohort (because of orphaned codes).

A practical use case is a network study where more data sources are added incrementally (new databases) or an existing data source is updated (new version). The cohort definitions used in the study were NOT evaluated on these new/updated databases.

  1. What would we do in situations like these?
  2. Could we define a set of go / no go heuristics for cohort diagnostics, that can be used as a first best guess? This might be a series of acceptance checks that are run every time the underlying data changes - and we get an alert/report that has red/orange or green flags for each check.
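A minimal sketch of what such go / no go acceptance checks might look like follows. It is written in Python purely as an illustration and is not an existing CohortDiagnostics feature; the `Check` structure, the check names, the thresholds, and the rule that any red flag means "no go" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Check:
    """One acceptance check: a measured value and its (assumed) thresholds."""
    name: str
    value: float
    orange_above: float  # values above this raise an orange flag
    red_above: float     # values above this raise a red flag


def flag(check: Check) -> str:
    """Map a check to a red, orange, or green flag."""
    if check.value > check.red_above:
        return "red"
    if check.value > check.orange_above:
        return "orange"
    return "green"


def report(checks: List[Check]) -> Dict[str, str]:
    """Flag every check; this is the alert/report for one database version."""
    return {check.name: flag(check) for check in checks}


def decision(flags: Dict[str, str]) -> str:
    """First best guess: any red is 'no go', any orange needs review."""
    if "red" in flags.values():
        return "no go"
    if "orange" in flags.values():
        return "review"
    return "go"


# Hypothetical checks for one cohort on one database version.
checks = [
    Check("fraction_age_outside_expected", 0.02, orange_above=0.05, red_above=0.20),
    Check("fraction_orphan_codes", 0.30, orange_above=0.05, red_above=0.20),
]
flags = report(checks)
print(flags, decision(flags))
```

Rerunning the same checks whenever the underlying data changes would give a report per database version, which is what makes a previously accepted database's regression visible.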

This would help us identify scenarios where:

  1. A database was previously accepted, but now has a new version - which is not accepted.
  2. A new database wants to participate in a study - but because of several red flags - is a candidate for rejection from participation in the study.

The focus is on study cohort + database diagnostics, with the intent to automate them at least into an alerting system that flags issues for review - instead of having to parse through all diagnostics for all cohorts each time there is a new/updated database.

The initial set of diagnostic rules is documented here

azimov commented 1 month ago

@gowthamrao I would like to consider this as an additional "tag" on a cohort in the phenotype library, but how we do this is difficult because diagnostics are currently interpreted subjectively based on output. E.g. we don't have a binary pass or fail based on some numeric value, like CohortMethod or SCCS, so inclusion here may be difficult.