scheah92 opened 1 week ago
Hi - you've got it right about the `generateCohortSet` function - under the covers, it is using SqlRender as you described. Can you confirm you are using the latest version of SqlRender? I think the behavior you have described will still be the same, but worth confirming just to be sure.
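For anyone following along, a quick way to check which version is installed locally:

```r
# Report the installed SqlRender package version:
packageVersion("SqlRender")
```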
I did a quick check of the query behavior using https://data.ohdsi.org/SqlDeveloper/ and noted the same query rendering. I'm not really familiar with Spark/Databricks so I'm tagging a few folks that might be able to chime in and help: @fdefalco @greshje
Yes, I have SqlRender version 1.18 installed.
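To double-check on my end, the two-step behavior can be reproduced with SqlRender alone; the SQL fragment and schema name below are illustrative, not my actual cohort query:

```r
# Illustrative reproduction of the render -> translate steps
# (the SQL fragment and schema name are made up for this example):
library(SqlRender)

# Step 1: render interpolates the @vocabulary_database_schema parameter.
sql <- render(
  "SELECT concept_id INTO #Codesets FROM @vocabulary_database_schema.concept;",
  vocabulary_database_schema = "my_schema"
)

# Step 2: translate rewrites the #Codesets temp table for the spark dialect.
translate(sql, targetDialect = "spark")
```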
As far as I'm aware, `generateCohortSet` uses SqlRender to translate the source query into the correct syntax. This includes interpolating schema names and table names where necessary. I am using Databricks (aka Spark) as my dbms, but I am unable to run `generateCohortSet` due to an SQL error.

Firstly, as a reminder, when using `spark` as the dbms with `DatabaseConnector`, it is necessary to specify the catalog name inside the connection string, because there aren't yet any parameters to pass in catalog names for Databricks connections. Once the catalog name is set in the connection string, `DatabaseConnector` can continue to use the supplied schema name parameters as required. For this example, assume that
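To make the catalog-in-connection-string setup concrete, here is a minimal sketch; the host, HTTP path, token, and catalog name are placeholders, and it assumes the Databricks JDBC driver's `ConnCatalog` property for setting the catalog:

```r
# Sketch: Databricks connection where the catalog is set inside the
# connection string rather than via a separate parameter.
# Host, httpPath, and catalog values below are placeholders.
library(DatabaseConnector)

connectionDetails <- createConnectionDetails(
  dbms = "spark",
  connectionString = paste0(
    "jdbc:databricks://my-workspace.cloud.databricks.com:443/default;",
    "transportMode=http;ssl=1;AuthMech=3;",
    "httpPath=/sql/1.0/warehouses/abc123;",
    "ConnCatalog=my_catalog"  # catalog supplied here, in the string itself
  ),
  user = "token",
  password = Sys.getenv("DATABRICKS_TOKEN")
)
```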
Take a look at the first bit of the source query for a cohort I'm creating. `generateCohortSet` first calls `SqlRender::render`, which produces SQL with interpolated schema names. As you can see, `@vocabulary_database_schema` was correctly interpolated, producing a valid table identifier with the `<schema>.<table>` naming convention, without the `<catalog>` name prepended, because it is already assumed from the connection string.

It then calls `SqlRender::translate` with `targetDialect = 'spark'`. The translated SQL now produces an error, because Spark tries to create the table `yz8flonqCodesets` using the bare `<table>` naming convention, which means Spark fills in the missing `<schema>` name with the value `default`, which does not exist in my catalog. This is the error message that occurs.

From what I can see, it seems all the `#` placeholders, like `#Codesets`, `#qualified_events`, and `#final_cohort`, should also be prepended by a schema placeholder, like `@cohortDatabaseSchema`.

Please let me know if I am using this function wrongly somehow; otherwise, could somebody point me to a suitable workaround for Spark? Should I just create a `default` schema for now?
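One idea I have not tried yet: SqlRender's `tempEmulationSchema` setting (also read from the `sqlRenderTempEmulationSchema` option) appears to qualify emulated temp tables with an explicit schema, which might avoid the `default` lookup. The schema name below is a placeholder I would have to create first, and I have not confirmed whether `generateCohortSet` honors this setting:

```r
# Untried sketch: point SqlRender's temp-table emulation at an existing
# schema in my catalog ("scratch" is a placeholder schema name):
options(sqlRenderTempEmulationSchema = "scratch")

# The same setting can also be passed directly when translating:
library(SqlRender)
translate(
  "SELECT * INTO #Codesets FROM my_schema.concept;",
  targetDialect = "spark",
  tempEmulationSchema = "scratch"
)
```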