clinicalml / omop-learn

Python package for machine learning for healthcare using an OMOP common data model
MIT License
103 stars, 26 forks

Questions - OMOP-LEARN #6

Closed by Ak784 3 years ago

Ak784 commented 3 years ago

Hello,

I went through the documentation and have a couple of questions.

Q1) Does omop-learn only support binary classification tasks as of now?

Q2) Is there a YouTube tutorial that walks through a simple example? I know there is a Python notebook for EOL.

justinlimkz commented 3 years ago

Hi there, thanks for your questions and for looking at the package!

  1. omop-learn is not limited to binary classification. In the end-of-life linear model notebook, the task is set up via the SQL query Cohorts/gen_EOL_cohort.sql, which defines the response variable 'y' as a binary yes/no outcome. You can modify this cohort query so that 'y' is anything you want, e.g. a continuous value. You would then run the code in the notebook to get feature_matrix_counts (i.e. the X) and outcomes_filt (i.e. the y), and use them as you wish, e.g. by plugging them into a LinearRegression model from sklearn.
  2. We currently do not have a YouTube tutorial, and for now the "tutorials" are the notebooks themselves. But we'll keep that in mind as a possible next step!

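
As a rough sketch of the last step in point 1, the snippet below fits a LinearRegression from sklearn on stand-ins for feature_matrix_counts and outcomes_filt. The data here is synthetic and purely illustrative: the matrix shape, the sparse format, and the continuous 'y' are assumptions, not what the notebook actually produces.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for the notebook outputs (assumptions, not real data).
rng = np.random.default_rng(0)
n_patients, n_features = 200, 50

# X: one row per patient, one column per feature, holding event counts.
feature_matrix_counts = csr_matrix(
    rng.poisson(0.3, size=(n_patients, n_features)).astype(float)
)

# y: a continuous outcome, as if the cohort query had been changed to return one.
true_coef = rng.normal(size=n_features)
outcomes_filt = feature_matrix_counts @ true_coef + rng.normal(scale=0.1, size=n_patients)

# scikit-learn's LinearRegression accepts dense arrays and SciPy sparse matrices.
model = LinearRegression()
model.fit(feature_matrix_counts, outcomes_filt)

predictions = model.predict(feature_matrix_counts)
r2 = model.score(feature_matrix_counts, outcomes_filt)  # R^2 on the training data
```
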
Ak784 commented 3 years ago

Hi @justinlimkz ,

Thanks. Additionally, may I check with you on a couple of things?

I see that omop-learn lets us generate cohorts using SQL, but is it possible to use a cohort that has already been created and stored in the results schema of an OMOP instance? Clinical data is usually stored in the CDM schema of the OMOP instance. However, we can create cohorts using Atlas, a web UI tool provided by OHDSI, and cohorts created this way are stored in the results schema. That schema gives us the dates when a subject entered and left the cohort. So, am I right in understanding the following?

a) We can skip writing SQL for cohort generation and instead point omop-learn at a specific table to get the subjects in a cohort?

b) We can then apply feature generation to this set of subjects.

justinlimkz commented 3 years ago

Hi there,

That's a great question. At the moment, I believe that cohort queries created via Atlas are not directly compatible with omop-learn, but you might be able to modify those queries to fit the template in /sql/Cohorts, so that they have the same output columns that omop-learn expects.
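
A minimal sketch of that kind of reshaping, using an in-memory SQLite database as a stand-in for the OMOP instance. The results.cohort layout (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date) follows the standard OHDSI cohort table. The output column names (person_id, start_date, end_date, y) and the cohort id 42 are placeholders only, so check the templates in /sql/Cohorts for the columns omop-learn actually expects.

```python
import sqlite3

# In-memory stand-in for an OMOP instance with a separate "results" schema.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS results")
conn.execute(
    """
    CREATE TABLE results.cohort (
        cohort_definition_id INTEGER,
        subject_id INTEGER,
        cohort_start_date TEXT,
        cohort_end_date TEXT
    )
    """
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO results.cohort VALUES (?, ?, ?, ?)",
    [
        (42, 1001, "2015-01-01", "2016-06-30"),
        (42, 1002, "2015-03-15", "2017-01-10"),
        (7, 1003, "2014-05-01", "2014-12-31"),
    ],
)

# Rename the Atlas columns to hypothetical target names.
# 'y' is left NULL here; a real query would have to define the outcome.
RESHAPE_SQL = """
    SELECT
        subject_id        AS person_id,
        cohort_start_date AS start_date,
        cohort_end_date   AS end_date,
        NULL              AS y
    FROM results.cohort
    WHERE cohort_definition_id = ?
    ORDER BY subject_id
"""

cursor = conn.execute(RESHAPE_SQL, (42,))  # 42 is a hypothetical Atlas cohort id
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
```
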

Best, Justin