darwin-eu-dev / omopgenerics

https://darwin-eu-dev.github.io/omopgenerics/
Apache License 2.0

Roadmap to version 1.0 #462

Open ablack3 opened 1 month ago

ablack3 commented 1 month ago

Hi @catalamarti,

Let's make a roadmap for what we would need in order to get omopgenerics to version 1.0. Are there any interface changes you are considering or any reason we could not release 1.0 in the next few months?

What do the developers think about the summarized result format?

Tagging @edward-burn, @ginberg, @mvankessel-EMC, @cebarboza

ablack3 commented 1 month ago

One thing I have a hard time with is the storage of all results as character strings. This format seems fine for saving and transporting results but not for analysis. I often see the warning "NAs introduced by coercion". I don't think the current summarized result format would be good to put in a database, because we would want to analyze the data in the database directly without having to cast each result/estimate to the correct data type. So I think there should be another representation of results that would sit behind analytic apps, where the estimate/result values are stored in the correct data type. Perhaps we need multiple tables, and possibly different fields for the different analysis types (e.g. treatment patterns, incidence and prevalence, etc.).
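To make the casting concern concrete, here is a minimal sketch in Python (used purely for illustration; it is not omopgenerics code). It assumes a long layout in which every estimate is stored as a string next to a column naming its intended type. The column names `estimate_name`, `estimate_type` and `estimate_value`, the sample rows, and the helper `cast_estimate` are assumptions made for this example.

```python
from typing import Union

# Hypothetical rows in a long, all-strings layout (column names are assumptions).
rows = [
    {"estimate_name": "count", "estimate_type": "integer", "estimate_value": "120"},
    {"estimate_name": "mean_age", "estimate_type": "numeric", "estimate_value": "54.3"},
    # A suppressed small count: it cannot be parsed as an integer.
    {"estimate_name": "count", "estimate_type": "integer", "estimate_value": "<5"},
]

Value = Union[int, float, str, None]


def cast_estimate(value: str, estimate_type: str) -> Value:
    """Cast one string estimate to its declared type; return None if it cannot be parsed."""
    casters = {"integer": int, "numeric": float, "character": str}
    caster = casters.get(estimate_type)
    if caster is None:
        raise ValueError(f"unknown estimate type: {estimate_type}")
    try:
        return caster(value)
    except ValueError:
        # Comparable in spirit to a coercion that yields a missing value.
        return None


typed = [cast_estimate(r["estimate_value"], r["estimate_type"]) for r in rows]
print(typed)  # [120, 54.3, None]
```

Every consumer of the string layout has to repeat a cast like this before doing arithmetic, and unparseable values only surface at that point. A typed representation would move that step to write time.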

The approach HADES takes is to allow each analytic package to define its own results data model and provide a clear specification of that model. Here is the model for SCCS, for example: https://github.com/OHDSI/SelfControlledCaseSeries/blob/main/inst/csv/resultsDataModelSpecification.csv
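As a rough sketch of the specification-driven idea, the Python snippet below reads a small CSV specification and uses it to parse raw string values into typed fields. The specification shown is hypothetical and is not copied from the SCCS file linked above. The table name, the columns `table_name`, `column_name` and `data_type`, and the helpers `load_spec` and `parse_row` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical specification for one results table (not the SCCS specification).
SPEC_CSV = """table_name,column_name,data_type
incidence_result,cohort_id,int
incidence_result,incidence_rate,float
incidence_result,database_id,str
"""

# Map the type names used in the hypothetical spec to Python parsers.
PARSERS = {"int": int, "float": float, "str": str}


def load_spec(text):
    """Return {table_name: {column_name: parser}} from the CSV specification text."""
    spec = {}
    for row in csv.DictReader(io.StringIO(text)):
        spec.setdefault(row["table_name"], {})[row["column_name"]] = PARSERS[row["data_type"]]
    return spec


def parse_row(spec, table, raw):
    """Parse a dict of raw strings into typed values, following the table's specification."""
    columns = spec[table]
    missing = set(columns) - set(raw)
    if missing:
        raise KeyError(f"missing columns for {table}: {sorted(missing)}")
    return {name: parse(raw[name]) for name, parse in columns.items()}


spec = load_spec(SPEC_CSV)
record = parse_row(
    spec,
    "incidence_result",
    {"cohort_id": "1", "incidence_rate": "0.012", "database_id": "db_a"},
)
print(record)  # {'cohort_id': 1, 'incidence_rate': 0.012, 'database_id': 'db_a'}
```

With a per-table specification like this, the types are declared once and every result row can be validated against them when it is loaded.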