JuliaHealth / OMOPCDMCohortCreator.jl

Create cohorts from databases utilizing the OMOP CDM
https://juliahealth.org/OMOPCDMCohortCreator.jl/stable

[FEATURE] Create Queries That Return Cohort Information #58

Closed TheCedarPrince closed 7 months ago

TheCedarPrince commented 8 months ago

It dawned on me that, after running an OHDSI Cohort Expression, we would often just like to get the patient cohort directly from the output of that expression's execution. Usually, these cohorts are stored in an auxiliary table in a given database called cohort. We should have a few new getter functions that do the following:

  1. Get the names of created cohorts in the cohort table
  2. Get all patients associated with a particular cohort

Those are at least two things I can think of for now. I know cohorts will sometimes have more metadata associated with them, but I think this would be a good way to bring initial support to this package.
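For illustration, here is a rough sketch of what those two queries might look like in FunSQL. It assumes the COHORT table layout from OMOP CDM 5.4 (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date). The variable names and the cohort ID 1 are made up for the example and are not part of the package. Note that the COHORT table itself only stores cohort_definition_id values, so the first query returns IDs rather than human-readable names.

```julia
using FunSQL: SQLTable, From, Where, Select, Group, Get, render

# COHORT table as laid out in OMOP CDM 5.4 (assumed to exist in the database)
cohort = SQLTable(:cohort,
    columns = [:cohort_definition_id, :subject_id,
               :cohort_start_date, :cohort_end_date])

# 1. Distinct cohort identifiers present in the cohort table
#    (grouping without aggregates yields the unique key values)
cohort_ids_query =
    From(cohort) |>
    Group(Get.cohort_definition_id)

# 2. All subjects belonging to one cohort; the ID 1 is an arbitrary example
cohort_subjects_query =
    From(cohort) |>
    Where(Get.cohort_definition_id .== 1) |>
    Select(Get.subject_id)

# Render both queries to SQL text for the SQLite dialect
println(render(cohort_ids_query, dialect = :sqlite))
println(render(cohort_subjects_query, dialect = :sqlite))
```

Rendering to text first makes it easy to eyeball the generated SQL before running it against a real database.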

TheCedarPrince commented 8 months ago

Thanks @jay-sanjay for being interested in this issue! Here's a skeleton for the way to address this issue (i.e. how to create new getters):

The process for creating a family of getter functions is as follows:

  1. Look up the table we want to create getters for on the OMOP CDM 5.4 page. In this case, we'd want to look up the COHORT table which is here: https://ohdsi.github.io/CommonDataModel/cdm54.html#COHORT
  2. Once we know what fields we are working with, we can generally do the following:
    1. Using FunSQL, design, as best as possible, an ANSI SQL representation that "gets" the requested field
    2. Create a few dispatches of each new function:
      1. A dispatch which accepts any necessary kwargs but no connection object -- this should return the SQL representation of the query
      2. A dispatch that accepts kwargs and a connection object that returns a DataFrame with requested information
      3. A dispatch that accepts a DataFrame, which the function can then mutate and return (this is not always applicable and should be reviewed with me or another maintainer such as @Farreeda)
    3. For each function and dispatch of the function, there should be at least one test each.
    4. Add function documentation to the documentation website
  3. Open a PR and ping me or @Farreeda for review!
  4. Review PR together
  5. Merge and then throw a party :partying_face:
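To make step 2 a bit more concrete, here is a minimal sketch of the first two dispatches plus a test for each. The function name cohort_subjects is hypothetical and only for illustration, as is the in-memory SQLite database with made-up rows. It assumes FunSQL, DataFrames, DBInterface, and SQLite are installed. The DataFrame-mutating dispatch is left out since it is not always applicable.

```julia
using FunSQL: SQLTable, From, Where, Select, Fun, Get, render
using DataFrames
using DBInterface
using SQLite
using Test

# COHORT table layout from OMOP CDM 5.4
const cohort = SQLTable(:cohort,
    columns = [:cohort_definition_id, :subject_id,
               :cohort_start_date, :cohort_end_date])

# Dispatch 1 (hypothetical name): no connection object, returns the SQL text
function cohort_subjects(cohort_ids::Vector{<:Integer}; dialect = :sqlite)
    q = From(cohort) |>
        Where(Fun.in(Get.cohort_definition_id, cohort_ids...)) |>
        Select(Get.cohort_definition_id, Get.subject_id)
    return string(render(q, dialect = dialect))
end

# Dispatch 2: same arguments plus a connection, returns a DataFrame
function cohort_subjects(cohort_ids::Vector{<:Integer}, conn; dialect = :sqlite)
    sql = cohort_subjects(cohort_ids; dialect = dialect)
    return DataFrame(DBInterface.execute(conn, sql))
end

# Throwaway in-memory database with made-up rows, used only for the tests
conn = SQLite.DB()
DBInterface.execute(conn, """
    CREATE TABLE cohort (
        cohort_definition_id INTEGER,
        subject_id INTEGER,
        cohort_start_date TEXT,
        cohort_end_date TEXT
    )""")
DBInterface.execute(conn, """
    INSERT INTO cohort VALUES
        (1, 101, '2020-01-01', '2020-12-31'),
        (1, 102, '2020-02-01', '2020-12-31'),
        (2, 201, '2021-01-01', '2021-12-31')""")

# At least one test per dispatch, as the checklist asks
@testset "cohort_subjects" begin
    @test occursin("subject_id", cohort_subjects([1]))
    @test sort(cohort_subjects([1], conn).subject_id) == [101, 102]
end
```

Having the connection dispatch call the SQL-only one keeps the query defined in a single place, so both dispatches cannot drift apart.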

In this case, I can see at least a few functions we may want to create:

The names are a bit long but that is what I am imagining! I also imagine we should make a couple of additional functions for COHORT_DEFINITION, but we can worry about that in a separate issue! :D Thanks for the question on this and I hope that was helpful! I'd highly suggest looking at the source code of some of the existing getters for inspiration on how to tackle this. Also, the FunSQL documentation is fantastic: https://mechanicalrabbit.github.io/FunSQL.jl

Let me know if you have any questions! ~tcp :deciduous_tree: