OHDSI / DbDiagnostics

Package to profile a database and execute data diagnostics based on individual analysis settings
https://ohdsi.github.io/DbDiagnostics/
Apache License 2.0

Error seen when DbDiagnostics::executeDbProfile is run more than once #10

Closed: NACHC-CAD closed this issue 3 months ago

NACHC-CAD commented 1 year ago

When I run DbDiagnostics::executeDbProfile a second time I'm seeing this error:

org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: 
Table demo_cdm_ach_res.h6tneh6js_tmpach_2125 already exists

My scripts and full error log are attached.

achillesError_2125.txt 00-databricks-keyring.R.txt 01-db-profile-config.R.txt 02-db-profile-execute.R.txt

NACHC-CAD commented 1 year ago

Not sure if this is a valid workaround, but I was able to get the script to execute by doing the following:
1.) Delete the contents of the output folder
2.) Execute the following in the database (Databricks in this case):


drop table if exists demo_cdm_ach_res.dl274vzls_tmpach_dist_1815;
drop table if exists demo_cdm_ach_res.dl274vzlstatsView_1815;
drop table if exists demo_cdm_ach_res.dl274vzltempResults_1815;

drop table if exists demo_cdm_ach_res.dl274vzls_tmpach_2100;
drop table if exists demo_cdm_ach_res.dl274vzlstatsView_2100;
drop table if exists demo_cdm_ach_res.dl274vzltempResults_2100;

drop table if exists demo_cdm_ach_res.dl274vzls_tmpach_2101;
drop table if exists demo_cdm_ach_res.dl274vzlstatsView_2101;
drop table if exists demo_cdm_ach_res.dl274vzltempResults_2101;

drop table if exists demo_cdm_ach_res.dl274vzls_tmpach_2125;
drop table if exists demo_cdm_ach_res.dl274vzlstatsView_2125;
drop table if exists demo_cdm_ach_res.dl274vzltempResults_2125;
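
The statements above follow a regular pattern (schema, a run-specific prefix, then a table suffix ending in the analysis ID), so they can be generated rather than typed by hand. The following is a minimal Python sketch of that idea, not part of DbDiagnostics or Achilles. The function name `drop_statements` is hypothetical, and the schema, prefix, and suffixes are copied from this thread. Note that the prefix differs between runs (`h6tneh6j` in the original error, `dl274vzl` here), so it has to be read from the error message or from a table listing each time.

```python
def drop_statements(schema, prefix, suffixes):
    """Build one 'drop table if exists' statement per leftover table (hypothetical helper)."""
    return [f"drop table if exists {schema}.{prefix}{suffix};" for suffix in suffixes]


# Analysis IDs and table stems as they appear in the statements above.
# Analysis 1815 used the 's_tmpach_dist' stem; the others used 's_tmpach'.
leftovers = [
    (1815, "s_tmpach_dist"),
    (2100, "s_tmpach"),
    (2101, "s_tmpach"),
    (2125, "s_tmpach"),
]

suffixes = []
for analysis_id, stem in leftovers:
    suffixes += [
        f"{stem}_{analysis_id}",
        f"statsView_{analysis_id}",
        f"tempResults_{analysis_id}",
    ]

# Schema and prefix are the values from this thread; replace them with your own.
statements = drop_statements("demo_cdm_ach_res", "dl274vzl", suffixes)
print("\n".join(statements))
```

The printed statements can then be pasted into the database client. Review them before running, since `drop table` cannot be undone.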

I'm not sure if the run was totally successful. I got the following error at the very end (full output is attached): r-output.txt

Temporary Achilles tables removed from schema #
Temporary Achilles tables removed from schema #
[Total Runtime] 2.162684 mins
[Total Runtime] 2.162684 mins
[Total Runtime] 2.162684 mins
An error occurred while the 'DatabaseConnector' package was updating the RStudio Connections pane:
Error in NULL: host must be a single element of type 'character'
If necessary, these warnings can be squelched by setting `options(rstudio.connectionObserver.errorsSuppressed = TRUE)`.
Connecting using Spark JDBC driver
Final results are now available in: D:\_YES_2023-05-28\workspace\SosExamples\_COVID\02-data-diagnostics\output/NACHC_DEMO_DB_OHDSI/20190525/DbProfileResults_NACHC_DEMO_DB_OHDSI_20190525_20230809201539.zip
> 

A zip file was NOT created but the following 3 files were created (attached).
db_profile_results.csv db_profile_results_dist.csv NACHC_DEMO_DB_OHDSI_20190525_cdm_source.csv

clairblacketer commented 3 months ago

Hi, I think this is related to Achilles. This package calls Achilles to run the needed characterizations. Achilles creates temp tables to facilitate the characterizations, and I suspect something happened where the temp tables were not cleaned up.
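
To check whether temp tables were left behind, one option is to list the tables in the results schema and filter for the names seen in this thread. The following is a rough Python sketch under stated assumptions, not an Achilles feature. The function name `leftover_temp_tables` is hypothetical. The pattern assumes an 8-character alphanumeric prefix followed by `s_tmpach`, `s_tmpach_dist`, `statsView`, or `tempResults` and a numeric analysis ID, which is inferred only from the two prefixes reported above. Matching ignores case because some platforms report table names in lowercase.

```python
import re

# Assumed naming pattern, inferred from the table names in this thread:
# <8-char prefix><stem>_<analysis id>
TEMP_TABLE = re.compile(
    r"[a-z0-9]{8}(s_tmpach(_dist)?|statsView|tempResults)_\d+",
    re.IGNORECASE,
)


def leftover_temp_tables(table_names):
    """Return the names that look like leftover Achilles temp tables (hypothetical helper)."""
    return [name for name in table_names if TEMP_TABLE.fullmatch(name)]


# Example input: names as they might come back from listing the results schema.
tables = [
    "achilles_results",
    "h6tneh6js_tmpach_2125",
    "dl274vzlstatsview_1815",
    "dl274vzls_tmpach_dist_1815",
    "person",
]
print(leftover_temp_tables(tables))
```

Any names this returns are candidates for `drop table if exists` before rerunning the profile. Confirm they are not in use by another run first.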