OHDSI / GIS

https://ohdsi.github.io/GIS
Apache License 2.0

Integration with ATLAS/WebAPI #11

Open rtmill opened 5 years ago

rtmill commented 5 years ago

The current paradigm in OHDSI is to package cohort definitions as JSON objects that can be translated into single SQL statements that fully define the cohort. Some of these definitions require calculations (e.g. 'where value is between x and y'), but all of them are completed entirely in SQL. In our case, because GIS functionality is not supported consistently across all DB flavors, we cannot package everything into SQL statements unless every possible calculation is already precalculated and stored, which seems inadvisable if not impossible.

The question becomes: how could we expand the OHDSI cohort definition to include functionality outside of SQL?

Example use cases:

| id | Use case | Result |
|----|----------|--------|
| 1 | Patients who lived in area x for date y | list of person_id |
| 2 | Patients who live in areas with measurement of x | list of person_id |
| 3 | Visits where the patient traveled more than an hour for CT scan | list of visit_occurrence_id |
| 4 | Care sites that have 3 or more dialysis clinics in same county | list of care_site |

rtmill commented 5 years ago

Example cohort definition:

Text view: [image]

JSON: [image]

SQL (MS): [image]

rtmill commented 5 years ago

It appears everything in the cohort definition boils down to smaller data sets that are joined. One approach would be to have the spatial queries run before the SQL and populate tables, most likely temporary, that are then included in the SQL joins.

Using example 1 from above (patients who lived in area x): an R function would find all people who lived in area x and populate a temporary table with 'person_id'. That table is then referenced and joined in the SQL statement, then dropped after execution.
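To make the staged idea concrete, here is a minimal sketch in Python with SQLite rather than R, purely for illustration. The table name `spatial_person`, the bounding-box test standing in for a real spatial query, the sample coordinates, and the `year_of_birth` filter are all assumptions, not part of any existing OHDSI tooling.

```python
import sqlite3

# Hypothetical data: person_id -> (longitude, latitude) of residence.
RESIDENCES = {1: (-70.26, 43.66), 2: (-71.06, 42.36), 3: (-70.30, 43.70)}

# Hypothetical "area x" as a bounding box: (min_lon, min_lat, max_lon, max_lat).
AREA_X = (-70.50, 43.50, -70.00, 44.00)


def persons_in_area(residences, bbox):
    """Spatial step done outside SQL (stand-in for an R or GIS function)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return sorted(
        pid for pid, (lon, lat) in residences.items()
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    )


def build_cohort(conn, residences, bbox):
    """Populate a temp table, join it in plain SQL, then drop it."""
    ids = persons_in_area(residences, bbox)
    conn.execute("CREATE TEMP TABLE spatial_person (person_id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO spatial_person VALUES (?)", [(i,) for i in ids])
    try:
        # The rest of the cohort logic stays in ordinary, portable SQL.
        rows = conn.execute(
            "SELECT p.person_id FROM person p "
            "JOIN spatial_person s ON s.person_id = p.person_id "
            "WHERE p.year_of_birth < 1980 ORDER BY p.person_id"
        ).fetchall()
    finally:
        conn.execute("DROP TABLE spatial_person")
    return [r[0] for r in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER)"
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)", [(1, 1970), (2, 1965), (3, 1990)]
    )
    return build_cohort(conn, RESIDENCES, AREA_X)
```

The point of the sketch is only the ordering: the spatial result is materialized first, so the SQL that follows needs no GIS support from the database.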

How that would be implemented in practice, specifically with staggered execution and consistent naming, and how these functions could be represented in the JSON object, is unclear. Perhaps a conversation with someone from the WebAPI WG?
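One possible shape for this, offered only as a discussion aid: the sketch below uses Python, and the `SpatialCriteria` key, its fields, the `tmp_spatial_` prefix, and the `run_id` parameter are all hypothetical and not part of the existing cohort definition schema. It derives a deterministic temp table name from each spatial criterion and lays out the staggered order (spatial steps, then SQL, then cleanup).

```python
import hashlib
import json

# Hypothetical cohort definition fragment; "SpatialCriteria" is an invented key.
DEFINITION = {
    "name": "Patients who lived in area x",
    "SpatialCriteria": [
        {"function": "persons_in_area", "args": {"area": "x"}, "returns": "person_id"}
    ],
}


def temp_table_name(criterion, run_id):
    """Derive a stable table name from the criterion content and a run id."""
    canonical = json.dumps(criterion, sort_keys=True)
    digest = hashlib.sha1(f"{run_id}:{canonical}".encode("utf-8")).hexdigest()[:10]
    return f"tmp_spatial_{digest}"


def execution_plan(definition, run_id):
    """List the stages in order: spatial steps, the SQL step, then cleanup."""
    steps = []
    tables = []
    for criterion in definition.get("SpatialCriteria", []):
        table = temp_table_name(criterion, run_id)
        steps.append(
            {"stage": "spatial", "function": criterion["function"], "target_table": table}
        )
        tables.append(table)
    steps.append({"stage": "sql", "join_tables": tables})
    steps.append({"stage": "cleanup", "drop_tables": tables})
    return steps
```

Because the name depends only on the criterion and the run id, the spatial step and the generated SQL can agree on it without passing state between them.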

cgreich commented 5 years ago

@rtmill : Are you getting any input on these?

rtmill commented 5 years ago

@cgreich I had a great call with @anthonysena, and I believe the plan is to discuss this with the ATLAS/WebAPI WG and then involve the folks from Circe.

ablack3 commented 4 years ago

@rtmill - Updating this issue after the 2020 OHDSI Symposium, where I had the opportunity to demo work done on a geospatial Atlas component. This version allows cohorts to be built from geospatial concepts and uses the new geospatial vocabularies in Athena: OpenStreetMap and US Census. The architecture is database agnostic because all relationships between locations and regions are pre-computed in the concept relationship table. Custom regions would be added to the concept table, and location-region relationships would be pre-computed during ETL. It won't fit every use case but does provide a starting point.
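As a rough illustration of why pre-computed relationships keep the query database agnostic, here is a small Python/SQLite sketch. It is not the code from the linked repositories. The `concept_id_1`, `concept_id_2`, and `relationship_id` columns follow the OMOP `concept_relationship` table, but the `location_concept_id` column, the `'Is within'` relationship name, and all ids are assumptions made for this example.

```python
import sqlite3


def demo_region_cohort(region_concept_id):
    """Return person_ids whose location is pre-linked to the given region."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (person_id INTEGER PRIMARY KEY, location_id INTEGER);
        -- location_concept_id is a hypothetical column for this sketch.
        CREATE TABLE location (
            location_id INTEGER PRIMARY KEY, location_concept_id INTEGER);
        CREATE TABLE concept_relationship (
            concept_id_1 INTEGER, concept_id_2 INTEGER, relationship_id TEXT);
        INSERT INTO person VALUES (1, 10), (2, 20), (3, 10);
        INSERT INTO location VALUES (10, 1001), (20, 1002);
        -- Pre-computed during ETL: location concept 1001 lies within region 9001.
        INSERT INTO concept_relationship VALUES
            (1001, 9001, 'Is within'), (1002, 9002, 'Is within');
    """)
    # Only plain joins are needed at query time; no spatial functions.
    rows = conn.execute(
        """
        SELECT p.person_id
        FROM person p
        JOIN location l ON l.location_id = p.location_id
        JOIN concept_relationship cr ON cr.concept_id_1 = l.location_concept_id
        WHERE cr.concept_id_2 = ? AND cr.relationship_id = 'Is within'
        ORDER BY p.person_id
        """,
        (region_concept_id,),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]
```

The trade-off is the one noted above: any region not loaded and linked during ETL cannot be queried this way.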

Requirements for what was implemented: https://github.com/OHDSI/WebAPI/issues/649

Code for the new features: https://github.com/OHDSI/webapi-component-geospatial https://github.com/OHDSI/atlas-component-geospatial

Video demo: https://youtu.be/6OebK5CfYo0

After listening to the discussion in the GIS WG, I think the future of this work would align well with the work to modularize Atlas and integrate R components into Atlas. Arachne Execution Engine is a start at this concept and provides an R execution environment as an Atlas component.