OHDSI / Achilles

Automated Characterization of Health Information at Large-scale Longitudinal Evidence Systems (ACHILLES) - descriptive statistics about a OMOP CDM database
https://ohdsi.github.io/Achilles/
130 stars 122 forks source link

Column name case sensitivity error in duckdb #766

Open katy-sadowski opened 1 month ago

katy-sadowski commented 1 month ago

I'm getting the following error when I try to run Achilles on duckdb:

cdmDatabaseSchema <- "omop" 
cdmVersion <- "5.4" 
cdmSourceName <- "Synthea"
resultsDatabaseSchema <- "main"

dbms <- "duckdb"
server <- "C:\\Users\\sadowskk\\R Scripts\\synthea_500_omop.duckdb"

connectionDetails <- DatabaseConnector::createConnectionDetails(dbms = dbms, 
                                                                server = server)

Achilles::achilles(connectionDetails = connectionDetails, 
                     cdmDatabaseSchema = cdmDatabaseSchema, 
                     resultsDatabaseSchema = resultsDatabaseSchema, 
                     sourceName = cdmSourceName, 
                     createTable = TRUE, 
                     outputFolder = outputFolder,
                     cdmVersion = cdmVersion, 
                     numThreads = 1, 
                     optimizeAtlasCache = TRUE,
                     createIndices = FALSE,
                     defaultAnalysesOnly = TRUE)

CDM Version 5.4 passed as parameter.
CDM Version 5.4 passed as parameter.
Beginning single-threaded execution
Beginning single-threaded execution
Connecting using DuckDB driver
  |=========================================================================================| 100%
Executing SQL took 0.0122 secs
An error report has been created at  C:\Users\sadowskk\output\/errorReportR.txt
Error in dbAppendTable(conn, name, value) : 
  Column `ANALYSIS_ID` does not exist in target table.

Full traceback:

13: h(simpleError(msg, call))
12: .handleSimpleError(function (condition) 
{
    if (is(condition, "error")) {
        log("FATAL",
11: stop("Column `", setdiff(names(value), target_names)[[1]], "` does not exist in target table."
10: dbAppendTable(conn, name, value)
9: dbAppendTable(conn, name, value)
8: .local(conn, name, value, ...)
7: dbWriteTable(conn, dbQuoteIdentifier(conn, name), value, ...)
6: dbWriteTable(conn, dbQuoteIdentifier(conn, name), value, ...)
5: DBI::dbWriteTable(conn = connection@dbiConnection, name = tableName, value = data, overwrite =
4: DBI::dbWriteTable(conn = connection@dbiConnection, name = tableName, value = data, overwrite =
3: insertTable.DatabaseConnectorDbiConnection(connection = connection, databaseSchema = resultsDa
2: DatabaseConnector::insertTable(connection = connection, databaseSchema = resultsDatabaseSchema
1: Achilles::achilles(connectionDetails = connectionDetails, cdmDatabaseSchema = cdmDatabaseSchem

It seems that insertTable is adding quotes around the csv column names, which in the achilles analysis csv are in all-caps. However, the achilles_analysis table is created with lowercase column names. Quoted identifiers are case sensitive in duckdb: https://duckdb.org/docs/sql/dialect/keywords_and_identifiers.html#case-sensitivity-of-identifiers

I wasn't actually sure at what level to consider this a bug - Achilles, DatabaseConnector, or SqlRender :) let me know and I can move this report if needed.

fdefalco commented 1 month ago

This is getting "standardized" so that the columns will be lower case across the packages. I believe if you run off the development branch you should have success. Fix on its way.

katy-sadowski commented 1 month ago

Awesome! Thanks Frank!

ablack3 commented 3 weeks ago

I had the same issue with the current release and the develop branch worked. Thanks!