OHDSI / Achilles

Automated Characterization of Health Information at Large-scale Longitudinal Evidence Systems (ACHILLES) - descriptive statistics about a OMOP CDM database
https://ohdsi.github.io/Achilles/
130 stars 122 forks source link

Division by zero error when running on Postgres #570

Closed aasiyahrashan closed 3 years ago

aasiyahrashan commented 3 years ago

I tried running Achilles on v5.3.1 Postgres CDM.

My original command was

achilles(connectionDetails, 
         cdmDatabaseSchema = "our_omop", 
         resultsDatabaseSchema="our_omop.achilles_results",
         numThreads = 1,
         sourceName = "our_data", 
         cdmVersion = "5.3",
         runHeel = FALSE,
         runCostAnalysis = FALSE,
         outputFolder = "achilles_output")

Actual behavior

I get this error. Error executing SQL: org.postgresql.util.PSQLException: ERROR: relation "achilles_results.achilles_results" does not exist An error report has been created at

The log file shows this error:

Analysis 431 -- ERROR Error: Error executing SQL: org.postgresql.util.PSQLException: ERROR: division by zero An error report has been created at

The same error is repeated for the following analysis IDs: 432, 631, 632, 731, 732, 831, 832, 931, 932, 1031, 1032. They correspond to tables which are currently empty in our schema.

I'd appreciate any help in resolving this. Thanks.

AnthonyMolinaro commented 3 years ago

@aasiyahrashan Hi, thanks for bringing this to our attention. This is happening because several of your event tables are empty. Specifically, condition_occurrence, procedure_occurrence, drug_exposure, observation, drug_era, and condition_era are all empty. In this case, the best practice is to apply a category filter and run Achilles only for those tables that are non-empty. Here's an example given the tables that are currently empty in your schema:

library(Achilles)
allAnalyses <- Achilles::getAnalysisDetails()
unique(allAnalyses$CATEGORY)  # use this to see all the possible categories

categoriesToExclude <- c(
  "Condition Occurrence","Procedure Occurrence","Drug Exposure",
  "Observation","Drug Era","Condition Era"
  )

excludeIndices <- which(allAnalyses$CATEGORY %in% categoriesToExclude)

analysesToInclude <- allAnalyses[-excludeIndices,]$ANALYSIS_ID

# call achilles() with the analysisIds parameter set
Achilles::achilles(connectionDetails, 
         cdmDatabaseSchema = "our_omop", 
         resultsDatabaseSchema="our_omop.achilles_results",
         numThreads = 1,
         analysisIds = analysesToInclude,
         sourceName = "our_data", 
         cdmVersion = "5.3",
         runHeel = FALSE,
         runCostAnalysis = FALSE,
         outputFolder = "achilles_output")

Let us know if this works for you.

aasiyahrashan commented 3 years ago

Hi,

Thanks for your help. It didn't work because the analyses with IDs *31 and *32 fall under the observation period category even though they reference the empty tables. I don't want to exclude the observation period category from the checks since I have filled it in.

AnthonyMolinaro commented 3 years ago

@aasiyahrashan That's fine, you just need to specify which analyses to run. For example, if you want to see the Observation analyses and OP analyses side by side, just pull them using code similar to the above. For example:

observationAnalyses       <- allAnalyses[allAnalyses$CATEGORY == "Observation","ANALYSIS_ID"]
observationPeriodAnalyses <- allAnalyses[allAnalyses$CATEGORY == "Observation Period","ANALYSIS_ID"]
padding                   <- rep(NA,length(observationPeriodAnalyses)-length(observationAnalyses))

cbind(
  OBSERVATION_ANALYSES = c(observationAnalyses,padding),
  OBSERVATION_PERIOD_ANALYSES = observationPeriodAnalyses
)

      OBSERVATION_ANALYSES OBSERVATION_PERIOD_ANALYSES
 [1,]                  800                         103
 [2,]                  801                         104
 [3,]                  802                         105
 [4,]                  803                         106
 [5,]                  804                         107
 [6,]                  805                         108
 [7,]                  806                         109
 [8,]                  807                         110
 [9,]                  809                         111
[10,]                  812                         112
[11,]                  813                         113
[12,]                  814                         114
[13,]                  815                         115
[14,]                  820                         116
[15,]                  822                         117
[16,]                  823                         118
[17,]                  824                         119
[18,]                  825                         210
[19,]                  826                         230
[20,]                  827                         231
[21,]                  891                         232
[22,]                   NA                         410
[23,]                   NA                         430
[24,]                   NA                         431
[25,]                   NA                         432
[26,]                   NA                         510
[27,]                   NA                         530
[28,]                   NA                         531
[29,]                   NA                         532
[30,]                   NA                         610
[31,]                   NA                         630
[32,]                   NA                         631
[33,]                   NA                         632
[34,]                   NA                         710
[35,]                   NA                         730
[36,]                   NA                         731
[37,]                   NA                         732
[38,]                   NA                         810
[39,]                   NA                         830
[40,]                   NA                         831
[41,]                   NA                         832
[42,]                   NA                         910
[43,]                   NA                         930
[44,]                   NA                         931
[45,]                   NA                         932
[46,]                   NA                        1010
[47,]                   NA                        1030
[48,]                   NA                        1031
[49,]                   NA                        1032
[50,]                   NA                        1310
[51,]                   NA                        1330
[52,]                   NA                        1331
[53,]                   NA                        1332
[54,]                   NA                        1810
[55,]                   NA                        1830
[56,]                   NA                        1831
[57,]                   NA                        1832
[58,]                   NA                        2110
[59,]                   NA                        2130
[60,]                   NA                        2131
[61,]                   NA                        2132

There aren't that many analyses. Just eyeball the lists and keep what you want to run.

aasiyahrashan commented 3 years ago

Ok thanks. Will do.

Is there any chance you can add a warning to the documentation saying that some analyses will fail if all tables are not filled? I think the tables I listed are allowed to be empty in v5. Apologies if I've missed any existing docs.

aasiyahrashan commented 3 years ago

I added this filter before running achilles, and now get this error.

analysesToInclude <- allAnalyses[!allAnalyses$ANALYSIS_ID
                                 %in% c(431, 432, 631, 632,
                                        731, 732, 831, 832, 931,
                                        932, 1031, 1032),]

Error executing SQL: org.postgresql.util.PSQLException: ERROR: syntax error at or near ")" Position: 89 An error report has been created at ../achillesErrorCreateAnalysis.txt

The error report says

DBMS: postgresql

Error: org.postgresql.util.PSQLException: ERROR: syntax error at or near ")" Position: 89

SQL: CREATE TABLE cca_omop.achilles_results.ACHILLES_analysis

AS WITH cte_analyses AS (

) SELECT analysis_id, analysis_name, stratum_1_name, stratum_2_name, stratum_3_name, stratum_4_name, stratum_5_name, is_default, category

FROM cte_analyses

R version: R version 3.6.1 (2019-07-05)

Platform: x86_64-apple-darwin15.6.0

Attached base packages:

  • stats
  • graphics
  • grDevices
  • utils
  • datasets
  • methods
  • base

Other attached packages:

  • Achilles (1.6.7)
  • DatabaseConnector (2.4.1)
AnthonyMolinaro commented 3 years ago

The analysisIds parameter is expecting a vector, not a data frame. Try this:

excludeThese <- c(431, 432, 631, 632,731, 732, 831, 832, 931,932, 1031, 1032)
analysesToInclude <- allAnalyses[which(!allAnalyses$ANALYSIS_ID %in% excludeThese),]$ANALYSIS_ID
aasiyahrashan commented 3 years ago

It worked! Thanks.