ohdsi-studies / SvvEhden

Aggregation of data for the SVVEHDEN Study-A-Thon.

Error running on IPCI #1

Open sdotter opened 2 years ago

sdotter commented 2 years ago

I tried to run this on IPCI, but after installing all packages and setting up the configuration (database etc.), it ran for a while and then, after a long time, gave this error:


`Error: 'D:/SvvEhden-master/export/chronograph_data.csv' does not exist.

stop("'", path, "' does not exist", if (!is_absolute_path(path)) { paste0(" in current working directory ('", getwd(), "')") }, ".", call. = FALSE)
check_path(path)
standardise_path(file)
datasource(file, skip = skip, skip_empty_rows = FALSE, skip_quote = FALSE)
read_lines_raw(file, n_max = n_max)
unlist(read_lines_raw(file, n_max = n_max))
readr::guess_encoding(file = fileName, n_max = min(1e+07))
CohortDiagnostics:::checkInputFileEncoding(ChronographCsv)
launchDiagnosticsExplorerOutsidePackage(dataFolder = file.path(path_to_project_root, exportFolder_after_slash), DecList = decListFile, ChronographCsv = file.path(path_to_project_root, exportFolder_after_slash, "chronograph_data.csv"), TarOptions = fixed_TARs, connectionDetails = NULL, verbose = FALSE)`

OskarGauffin commented 2 years ago

Thank you for testing sdotter, we really appreciate it. Sorry for the late response, we've been on summer holiday.

We just started trying to reproduce this bug, and we'll get back to you when we know more.

OskarGauffin commented 2 years ago

Hello again,

We've tried a few setups but we can't seem to reproduce this error.

Would you be able to provide us with some additional information?

For instance, if you look in the package folders, such as the export folder and perhaps the package root folder, do you find a "chronograph_data.csv" file anywhere?
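A quick way to check this is to search the project folder recursively from R. This is only a minimal sketch using base R; the helper name `find_chronograph_csv` is made up for illustration and is not part of the package, and the root path is copied from the error message, so it may need adjusting.

```r
# Hypothetical helper (not part of the SvvEhden package): recursively search a
# folder for files with the given name and return their full paths.
find_chronograph_csv <- function(root, file_name = "chronograph_data.csv") {
  hits <- list.files(root, recursive = TRUE, full.names = TRUE)
  hits[basename(hits) == file_name]
}

# Root path copied from the error message; adjust to the local install.
project_root <- "D:/SvvEhden-master"
if (dir.exists(project_root)) {
  print(find_chronograph_csv(project_root))
}
```

An empty result would suggest the file was never written, rather than written to an unexpected location.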

Does the problem persist if you change "run_cohort_diagnostics_shiny_interface = TRUE" to FALSE (i.e. does the error only occur when opening the Shiny interface)?

For faster debugging, you can set the "testset" variable to "smalltestset" instead of "mediumtestset", to save yourself some time.

Hope this is helpful

OskarGauffin commented 2 years ago

Given that other testers seem to have made it further down the pipeline, would it be possible to check that this ("Error: 'D:/SvvEhden-master/export/chronograph_data.csv' ") is indeed the first error you encounter?

sdotter commented 2 years ago

Hi Oskar! I'm gonna test it again tomorrow when I'm at the office... Thanks!

OskarGauffin commented 2 years ago

Thanks Sicco. We've updated the repository with the latest bug fix; please test again when you have time.

sdotter commented 2 years ago

Testing in progress... I'll report back when I've got some results ;)

sdotter commented 2 years ago

Hi Oskar!

First I had to change https://github.com/ohdsi-studies/SvvEhden/blob/17c53c8674d5e73c8a2b2d62f07c98d39885e570/extras/RunCohortDiagnosticsAndViewResult.R#L56

Removing the type="win.binary" argument fixed it... Then it continued, but after a long time I got another error:

DBMS: postgresql

Error: java.lang.NoClassDefFoundError: org/postgresql/core/v3/QueryExecutorImpl$1

SQL: -- return all the drug cohorts to be used to construct the json file
select distinct * from (
  SELECT distinct C.concept_id, C.concept_code, C.concept_name
  FROM cdm.drug_exposure DE
  INNER JOIN cdm.concept C ON C.concept_id = DE.drug_concept_id
  where C.concept_id != 0 and C.vocabulary_id = 'RxNorm' and C.concept_class_id = 'Ingredient' and C.standard_concept ='S' and C.invalid_reason is null
  union
  SELECT distinct C.concept_id, C.concept_code, C.concept_name
  FROM cdm.drug_exposure DE
  INNER JOIN cdm.CONCEPT_ANCESTOR CA ON CA.descendant_concept_id = DE.drug_concept_id
  INNER JOIN cdm.concept C ON C.concept_id = CA.ancestor_concept_id
  where C.concept_id != 0 and C.vocabulary_id = 'RxNorm' and C.concept_class_id = 'Ingredient' and C.standard_concept ='S' and C.invalid_reason is null
)U

R version: R version 4.0.5 (2021-03-31)

Platform: x86_64-w64-mingw32

Attached base packages:

Other attached packages:

OskarGauffin commented 2 years ago

Hi Sicco!

This is a completely new bug for us, probably somehow related to postgresql. We'll have to look into it, thanks!
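A `NoClassDefFoundError` is raised by the JVM when a class cannot be loaded at run time, so one thing that may be worth ruling out is a missing, duplicated or incomplete PostgreSQL JDBC driver jar. The sketch below only lists candidate jar files. It assumes the drivers live in the folder named by the `DATABASECONNECTOR_JAR_FOLDER` environment variable, which may not hold for this setup, and `list_driver_jars` is a made-up helper name.

```r
# Hypothetical helper: list JDBC driver jars whose file name contains a pattern.
# Assumption: drivers are stored in the folder named by the
# DATABASECONNECTOR_JAR_FOLDER environment variable; pass another folder if not.
list_driver_jars <- function(jar_folder = Sys.getenv("DATABASECONNECTOR_JAR_FOLDER"),
                             pattern = "postgresql") {
  # An unset variable or a missing folder yields an empty result, not an error.
  if (!nzchar(jar_folder) || !dir.exists(jar_folder)) {
    return(character(0))
  }
  jars <- list.files(jar_folder, pattern = "\\.jar$", full.names = TRUE)
  jars[grepl(pattern, basename(jars), fixed = TRUE)]
}

jars <- list_driver_jars()
message("Found ", length(jars), " PostgreSQL driver jar(s)")
print(jars)
```

Finding no jar, or several different versions, would not prove anything by itself, but it would be a reasonable place to look next.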

OskarGauffin commented 2 years ago

Dear Sicco,

I've examined the error log. The error seems to be caused by a relatively short query (20 lines). The script is located at inst/sql server/generate_all_drug_concepts.sql.

The intended output is a three-column table with concept ID, concept code and concept name:

[image: example of the expected output table]

I've included some R code that should run that exact query. Could you try running it (after updating the first variable to your own schema name) and see whether it gives you any further insight into what the issue might be? For instance, does this isolated query run? If not, do you see anything in the translated SQL that you think might be the problem?

Thanks for your time and help,

`# Replace with the name of your own CDM schema
cdm_database_schema <- "Sicco_IPCI_schema"

# The query that fails, with the schema name as a parameter
sql <- "select distinct * from ( SELECT distinct C.concept_id, C.concept_code, C.concept_name FROM @cdm_database_schema.drug_exposure DE INNER JOIN @cdm_database_schema.concept C ON C.concept_id = DE.drug_concept_id where C.concept_id != 0 and C.vocabulary_id = 'RxNorm' and C.concept_class_id = 'Ingredient' and C.standard_concept ='S' and C.invalid_reason is null

union

SELECT distinct C.concept_id, C.concept_code, C.concept_name FROM @cdm_database_schema.drug_exposure DE INNER JOIN @cdm_database_schema.CONCEPT_ANCESTOR CA ON CA.descendant_concept_id = DE.drug_concept_id INNER JOIN @cdm_database_schema.concept C ON C.concept_id = CA.ancestor_concept_id where C.concept_id != 0 and C.vocabulary_id = 'RxNorm' and C.concept_class_id = 'Ingredient' and C.standard_concept ='S' and C.invalid_reason is null ) U"

# Fill in the schema name, then translate to the PostgreSQL dialect
sql <- SqlRender::render(sql, cdm_database_schema = cdm_database_schema)
sql <- SqlRender::translate(sql, targetDialect = "postgresql")

# connection is assumed to be an open connection created earlier,
# e.g. with DatabaseConnector::connect(connectionDetails)
DatabaseConnector::querySql(connection, sql)`
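To look at the SQL before it is sent to the database (for example, to paste it into another client), the schema placeholder can also be filled in by hand. The following is a rough base-R stand-in for the simplest case of `SqlRender::render`, plain `@name` substitution, and not a replacement for it. The helper name `fill_placeholder` and the short example query are made up for illustration; the schema name "cdm" is taken from the error log.

```r
# Hypothetical stand-in for the simplest case of SqlRender::render():
# replace every literal "@name" placeholder with the supplied value.
# Caveat: a name that is a prefix of another placeholder would also match it.
fill_placeholder <- function(sql, name, value) {
  gsub(paste0("@", name), value, sql, fixed = TRUE)
}

# Illustrative query only, not taken from the package.
sql_template <- "SELECT COUNT(*) FROM @cdm_database_schema.drug_exposure"
cat(fill_placeholder(sql_template, "cdm_database_schema", "cdm"), "\n")
```

Printing the filled-in statement makes it easy to confirm that the schema name ended up where expected before blaming the query itself.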

sdotter commented 2 years ago

Hi Oskar! Awesome! Thanks... I'm going to run it tomorrow when I'm at the office... I'll report back again as soon as I've successfully run the package ;) By the way, it was indeed better to use the small test set for testing...
You'll hear from me tomorrow!

OskarGauffin commented 2 years ago

Great! Please note that we've updated the list of drug-event combinations to the actual one to be used for the study-a-thon, which is called "20" in the script. In other words, "smalltestset" is no longer the one we really care about; it's the "20".

Still, if you start it and it executes successfully on smalltestset, that would be great news as well.

sdotter commented 2 years ago

Hi Oskar,

I've been doing some testing... It seems that the query runs fine (at least when I execute the SQL in DataGrip or pgAdmin).... But when running with the 20, it is still doing a lot of work, and after a few hours I get the same error again.

I also checked the cohors_username_cd table with count(*), and there are many rows in there: 23246478 rows....

So I thought I'd make a set of 3 instead of 20, plus the necessary files in the settings dir (CohortsToCreate_3_1.csv + CohortsToCreate_3_2.csv + CohortsToCreate_3_3.csv + DecList_3.csv), because I thought it would be less data(?) But that is still busy....