EHDEN / ARACHNE

Issue tracking for ARACHNE

Manual Execution #1

Open · PRijnbeek opened 5 years ago

PRijnbeek commented 5 years ago

When I create a Cohort Count Analysis, I only get the SQL file and a zip file containing that SQL file translated into several different dialects.

Question 1: Why do I get all the dialects? Should this not go through SqlRender at execution time, so that the SQL is translated to my DBMS?

Question 2: I would expect something I can run against my database, instead of a SQL file that I have to execute myself to generate the cohort, then write code to extract the count, and finally upload the result back to ARACHNE.

I can see two different solutions:

1) We build an executable that can run the SQL and output a zip that I can upload back. This could be an R Shiny app that stores my connection details, so I can quickly execute the query against any of my data sources.

2) Data Node allows manual input of the code to be executed against my data sources, and generates output that I can then manually return to ARACHNE Central.

The second option would mean I have to install Data Node in my internal network (it should be able to run completely offline), and it should be able to execute R code or SQL depending on the study type. I know there are plans to upgrade Data Node. Is this part of it?
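The first option above (run the SQL locally, package the count, upload the zip) can be sketched in a few lines. This is an illustrative stand-in only: it uses an in-memory SQLite database instead of a real CDM data source, and the output file name and CSV layout are assumptions, not ARACHNE's actual upload format.

```python
import csv
import io
import sqlite3
import zipfile

def run_count_and_package(conn, count_sql, out_path):
    """Run a cohort-count query and package the result as a zip for upload.

    The zip layout (a single cohort_counts.csv) is hypothetical."""
    count = conn.execute(count_sql).fetchone()[0]
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["cohort_count"])
    writer.writerow([count])
    with zipfile.ZipFile(out_path, "w") as zf:
        zf.writestr("cohort_counts.csv", buf.getvalue())
    return count

# Stand-in data source: an in-memory SQLite database with a toy cohort table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cohort (subject_id INTEGER)")
conn.executemany("INSERT INTO cohort VALUES (?)", [(1,), (2,), (3,)])
count = run_count_and_package(conn, "SELECT COUNT(*) FROM cohort", "results.zip")
```

In practice the connection details would point at the site's actual DBMS (as the Shiny-app idea suggests), and the SQL would be the dialect-appropriate file from the downloaded package.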

gklebanov commented 5 years ago

Question 1: Why do I get all the dialects? Should this not go through SqlRender to translate to my DBMS?

Because with manual execution the mechanism of execution and the target database are not known, the user simply gets all possible combinations so that they can choose. This is a convenience feature, but it is also useful for the record and for reproducibility. Those combinations are generated by SqlRender. Also, keep in mind that not everyone has ATLAS installed, so they might not even be able to access SqlRender easily.
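The "all possible combinations so that they can choose" idea amounts to picking the one file in the zip that matches your DBMS. A minimal sketch, assuming a file-naming scheme of `<query>.<dialect>.sql` (the naming is illustrative, not ARACHNE's actual scheme, and the zip is built in memory here for self-containment):

```python
import io
import zipfile

# Hypothetical contents of the downloaded zip: one translated copy of the
# query per supported dialect.
dialect_files = {
    "cohort_count.postgresql.sql": "SELECT COUNT(*) FROM cohort;",
    "cohort_count.sql_server.sql": "SELECT COUNT(*) FROM cohort;",
    "cohort_count.oracle.sql": "SELECT COUNT(*) FROM cohort;",
}

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name, query in dialect_files.items():
        zf.writestr(name, query)

def pick_dialect(zip_bytes, dialect):
    """Return the SQL for the requested dialect from the multi-dialect zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if f".{dialect}." in name:
                return zf.read(name).decode("utf-8")
    raise KeyError(f"no file for dialect {dialect!r}")

sql = pick_dialect(buf.getvalue(), "postgresql")
```

In the OHDSI toolchain the per-dialect files themselves would be produced by SqlRender's translation step; the sketch only covers the user's side of choosing one.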

Question 2: I would expect something I can run against my database, instead of a SQL file that I have to execute myself to generate the cohort, then write code to extract the count that I have to upload back to ARACHNE. The second option would mean I have to install Data Node in my internal network (it should be able to run completely offline), and it should be able to execute R code or SQL depending on the study type. I know there are plans to upgrade Data Node. Is this part of it?

Yes, exactly - personally, I am very excited about adding this new capability. As part of our next release (see my comment on Data Node revamping), we are adding a new Data Node feature to better support manual execution. Data Node query stewards will be able to download the requested query from ARACHNE Central and use the Data Node UI to execute it against configured data sets. The results will come back packaged in the standard format, which then allows them to be uploaded into ARACHNE Central. There are multiple benefits to this:

  1. Guaranteed clean execution, since the Execution Engine is built to provide a clean environment and consistent standard and OHDSI R libraries for each query
  2. Submissions can be done using a standard JSON for all standard OHDSI analyses (or custom code if required)
  3. Results are guaranteed to come back packaged in a standardized exchange format, including metadata that systems like ARACHNE and a future results repository can read and interpret.
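The standardized exchange format mentioned in point 3 is not specified in this thread. As a rough sketch of what a results package with machine-readable metadata could look like (all field names and file paths are assumptions, not ARACHNE's actual format):

```python
import io
import json
import zipfile

def package_results(result_rows, metadata):
    """Bundle analysis results plus metadata into a single zip.

    The layout (metadata.json alongside a results/ folder) is illustrative."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("metadata.json", json.dumps(metadata, indent=2))
        lines = ["cohort_id,count"] + [f"{cid},{n}" for cid, n in result_rows]
        zf.writestr("results/cohort_counts.csv", "\n".join(lines))
    return buf.getvalue()

package = package_results(
    [(101, 42)],
    {
        "analysis_type": "COHORT_COUNT",   # hypothetical metadata fields
        "cdm_version": "5.3",
        "executed_at": "2020-01-01T00:00:00Z",
    },
)
```

The point of shipping metadata next to the results is that a consumer such as ARACHNE Central or a results repository can interpret the package without out-of-band context about which analysis produced it.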