EHDEN / CdmInspection

R Package to support quality control inspection of an OMOP-CDM instance
Apache License 2.0
11 stars 16 forks source link

Error on running cdmInspection() #98

Closed apopassias closed 10 months ago

apopassias commented 10 months ago

Running cdmInspection() we get an error as follows:

results<-cdmInspection( connectionDetails = connectionDetails, cdmDatabaseSchema = "x", resultsDatabaseSchema = "x", vocabDatabaseSchema = "x", oracleTempSchema = "x", databaseId, databaseName="", databaseDescription = databaseDescription, runVocabularyChecks = TRUE, runDataTablesChecks = TRUE, runPerformanceChecks = TRUE, runWebAPIChecks = TRUE, smallCellCount = smallCellCount, baseUrl = baseUrl, sqlOnly = FALSE, outputFolder = "C:/CDM_OUTPUT", verboseMode = TRUE ) Connecting using Oracle driver using THIN to connect Connecting using Oracle driver using THIN to connect CDM Inspection of database started (cdm_version=5.4) Running Data Table Checks Connecting using Oracle driver using THIN to connect Data tables count query executed successfully in 1.74 secs Connecting using Oracle driver using THIN to connect Total number of records over time query executed successfully in 0.29 secs Connecting using Oracle driver using THIN to connect Number of records per person query executed successfully in 0.15 secs Connecting using Oracle driver using THIN to connect Number of records per person query executed successfully in 0.18 secs Connecting using Oracle driver using THIN to connect Vocabulary table successfully extracted Running Vocabulary Checks Connecting using Oracle driver using THIN to connect Failed see C:/CDM_OUTPUT/mapping_completenessErr.txt for more details Error in colnames<(*tmp*, value = c("Domain", "#Codes Source", "#Codes Mapped", : attempt to set 'colnames' on an object with less than two dimensions

As we can see in the file mapping_completenessErr.txt the error is:

DBMS: oracle

Error: java.sql.SQLSyntaxErrorException: ORA-00904: "SOURCE_VALUE": invalid identifier

SQL: SELECT 'Condition' as domain, COUNT(source_value) as n_codes_source, sum(is_mapped) as n_codes_mapped, 100.0sum(is_mapped) / COUNT() as p_codes_mapped, sum(num_records) as n_records_source, sum(case when is_mapped > 0 then num_records else 0 end) as n_records_mapped, 100.0sum(case when is_mapped > 0 then num_records else 0 end)/sum(num_records) as p_records_mapped FROM (SELECT condition_source_value, CASE when condition_concept_id > 0 then 1 else 0 end as is_mapped, COUNT() as num_records FROM CDMEHDEN.condition_occurrence GROUP BY condition_source_value, case when condition_concept_id > 0 then 1 else 0 END ) T UNION SELECT 'Procedure', COUNT(*),..........

The auto-generated script probably contains a syntax error, as it refers to the source_value field instead of the correct condition_source_value. If we fix the bug and run the corrected script on SQL developer then we get the expected result as follows:

image

Any suggestion?

MaximMoinat commented 10 months ago

Thanks for reporting, fixed in v1.2.1