OHDSI / CommonDataModel

Definition and DDLs for the OMOP Common Data Model (CDM)
https://ohdsi.github.io/CommonDataModel
896 stars 451 forks source link

Use of SqlRender to generate DDL #118

Closed vojtechhuser closed 6 years ago

vojtechhuser commented 7 years ago

We discussed with Clair and Martijn a scenario, where DDL is "aided/facilitated by" SqlRender.

That we maintain DDL in MS SQL and some dialects can be generated from that. char (1) vs. varchar (1) breaks this option.

Email discussion paste: I guess the problem is to:

take the MS SQL code for creating CDM tables and use SqlRender to create other flavors of it.

Something like this R code below

library(SqlRender);library(RCurl)

#specify URL of sql code written in parametized SQL        (see some examples below)
# url<-'https://raw.githubusercontent.com/OHDSI/CommonDataModel/master/Version4%20To%20Version5%20Conversion/OMOP%20CDMv4%20to%20CDMv5%20-%20OHDSI-SQL.sql'
# url<-'https://raw.githubusercontent.com/OHDSI/Achilles/master/inst/sql/sql_server/export_v5/drugera/sqlPrevalenceByGenderAgeYear.sql'
url<-'https://raw.githubusercontent.com/OHDSI/CommonDataModel/master/Sql%20Server/OMOP%20CDM%20ddl%20-%20SQL%20Server.sql'
#get the code
sql<-getURL(url)

#inspect what parameters are needed by searching for @

#decide your parameters
# results='myCDM'
# vocab='public'

#fill in parameters
# tta<-SqlRender::renderSql(sql,results_database_schema=results,vocab_database_schema=vocab)
tta<-SqlRender::renderSql(sql)
#translate into target dialect
ttb<-SqlRender::translateSql(tta$sql,targetDialect = 'postgresql')$sql
cat(ttb)
#and do it for all dialects

I inspected briefly the generated code and I think postgres create table statements can probably be derived from MS SQL version. (in fact maybe the author is already doing it….) I did not look at Oracle or Impala. Correction - In fact I just did look at Impala. And it needs additional items after creation of table.

e.g., distkey(person_id) interleaved sortkey(person_id, drug_concept_id, drug_era_start_date, drug_era_end_date);

But probably can also be partly generated….

Vojtech

From: Martijn Schuemie [mailto:schuemie@ohdsi.org] Sent: Thursday, August 10, 2017 2:42 PM To: Huser, Vojtech (NIH/NLM/LHC) [E] vojtech.huser@nih.gov; Blacketer, Margaret [JRDUS] mblacke@ITS.JNJ.com Subject: RE: DDL and SQL Render

I think simple create table statements should be fine (part of ansi sql) without translation. What problems specifically are you running into?

Cheers, Martijn

From: Huser, Vojtech (NIH/NLM/LHC) [E] [mailto:vojtech.huser@nih.gov] Sent: Tuesday, August 08, 2017 10:47 AM To: Blacketer, Margaret [JRDUS] Cc: Martijn Schuemie Subject: DDL and SQL Render

I don’t think the create table statements are covered by the SQLRender “translation patterns”. https://github.com/OHDSI/SqlRender/blob/master/inst/csv/replacementPatterns.csv

I had the same thought and did not pursue it due to this limitation.

I even looked at SQL alchemy to do it via that .

But maybe it could be added to SQL Render… (CCing martijn)

Vojtech

clairblacketer commented 7 years ago

I love this idea and I think it would be very useful to have the DDLs written in this way. Thank you @vojtechhuser for getting a head start!

vojtechhuser commented 7 years ago

I will create a subfolder in CodeExcerpts and do a PR with R code that does the generation of DDLs from master code in mssql.

clairblacketer commented 6 years ago

This was done with v5.3.0