oxford-pharmacoepi / MegaStudy

4 stars 2 forks source link

columns not found in cdm table cdm_source #6

Closed yichienlee2 closed 8 months ago

yichienlee2 commented 9 months ago

I am trying to run the feasibility code but got the error message saying lots of columns are not found in cdm table cdm_source. I checked my connection is working and those columns are in the CDM_SOURCE table. Do you know is it because my tables are in upper case?

Below is the relevant log - any help is appreciated!

parameters to connect to create cdm object

cdmSchema <- "IQVIA_OMOP_EHR_FRANCE_202311" # schema where cdm tables are located writeSchema <- "TEAM_RWDPROGEM" # schema with writing permission writePrefix <- "france" # combination of at least 5 letters + (eg. "abcde") that will lead any table written in the cdm dbName <- "IQVIA_OMOP_EHR_FRANCE" # name of the database, use acronym in capital letters (eg. "CPRD GOLD")

Run the study

source("E:/doc/internal/OMOP/Megastudy/Feasibility/study_package/MegaStudy/Feasibility Code/RunFeasibility.R") Error in assert_tables(): ! - cdm_source_name, cdm_source_abbreviation, cdm_holder, source_description, source_documentation_reference, cdm_etl_reference, source_release_date, cdm_release_date, cdm_version, vocabulary_version columns not found in cdm table cdm_source Run rlang::last_trace() to see where the error occurred. rlang::last_trace() <error/rlang_error> Error in assert_tables(): ! - cdm_source_name, cdm_source_abbreviation, cdm_holder, source_description, source_documentation_reference, cdm_etl_reference, source_release_date, cdm_release_date, cdm_version, vocabulary_version columns not found in cdm table cdm_source

Backtrace: ▆

  1. ├─base::source("E:/doc/internal/OMOP/Megastudy/Feasibility/study_package/MegaStudy/Feasibility Code/RunFeasibility.R")
  2. │ ├─base::withVisible(eval(ei, envir))
  3. │ └─base::eval(ei, envir)
  4. │ └─base::eval(ei, envir)
  5. ├─utils::write.csv(...) at MegaStudy/Feasibility Code/RunFeasibility.R:27:1
  6. │ └─base::eval.parent(Call)
  7. │ └─base::eval(expr, p)
  8. │ └─base::eval(expr, p)
  9. ├─utils::write.table(...)
  10. │ └─base::is.data.frame(x)
  11. └─CDMConnector::snapshot(cdm)
  12. └─CDMConnector::assert_tables(cdm, tables = c("cdm_source", "vocabulary"), empty.ok = TRUE) Run rlang::last_trace(drop = FALSE) to see 1 hidden frame.

Check connection and make sure the columnd are there

codes <- dbGetQuery(db,"select cdm_source_name, cdm_source_abbreviation, cdm_holder, source_description, source_documentation_reference, cdm_etl_reference, source_release_date, cdm_release_date, cdm_version, vocabulary_version

  • FROM IQVIA_OMOP_EHR_FRANCE_202311.cdm_source") codes CDM_SOURCE_NAME CDM_SOURCE_ABBREVIATION CDM_HOLDER 1 France Disease Analyzer France DA IQVIA SOURCE_DESCRIPTION SOURCE_DOCUMENTATION_REFERENCE 1 France DA dataset contains office visits, prescribed medications and vaccinations, test results and diagnosis (ICD10) CDM_ETL_REFERENCE SOURCE_RELEASE_DATE CDM_RELEASE_DATE CDM_VERSION VOCABULARY_VERSION 1 https://github.com/OHDSI/CommonDataModel/releases/tag/v5.3.1 2023-09-30 2023-11-16 v5.3.1 v5.0 31-AUG-23
tiozab commented 9 months ago

Hi @yichienlee2 , after putting all the parameters and connecting to the database, can you run the following and show me the output? cdm <- cdmFromCon( con = db, cdmSchema = c(schema = cdmSchema), writeSchema = c(schema = writeSchema, prefix = writePrefix), cdmName = dbName ) cdm$cdm_source

yichienlee2 commented 9 months ago

Thank you for your help! I got the output below:

cdm <- cdmFromCon(

  • con = db,
  • cdmSchema = c(schema = cdmSchema),
  • writeSchema = c(schema = writeSchema, prefix = writePrefix),
  • cdmName = dbName
  • ) cdm$cdm_source

    Source: SQL [1 x 10]

    Database: Snowflake 7.43.0[@Snowflake/RWD_PROD]

    CDM_SOURCE_NAME CDM_SOURCE_ABBREVIATION CDM_HOLDER SOURCE_DESCRIPTION SOURCE_DOCUMENTATION…¹ CDM_ETL_REFERENCE SOURCE_RELEASE_DATE CDM_RELEASE_DATE CDM_VERSION

    1 France Disease Analyzer France DA IQVIA France DA dataset contains… NA https://github.c… 2023-09-30 2023-11-16 v5.3.1 # ℹ abbreviated name: ¹​SOURCE_DOCUMENTATION_REFERENCE # ℹ 1 more variable: VOCABULARY_VERSION
yichienlee2 commented 9 months ago

This is the screenshot - seems the text output is hard to read

image

tiozab commented 9 months ago

@ablack3 all upper case should not be an issue? because CDMConnector allows all lower or all upper case?

ablack3 commented 9 months ago

As far as I know column names are supposed to be lowercase. Otherwise things get very tricky because all queries need to be modified based on case of the column names. I thought we decided that in Darwin column names would be lowercase. Is this right @edward-burn ?

Table names can be upper or lower case though.

ablack3 commented 9 months ago

I understand that the goal is for the code to work and it is good you found a workaround. But it would help if we know what to expect from the database. Are we supporting mixed casing in tables or not? These checks and errors exist to ensure that downstream code works properly. We just need a clear specification of what our packages can expect from the CDM.

I could easily delete that check but then we open ourselves up to other errors on some database systems.

tiozab commented 9 months ago

@ablack3 I deleted my comment, it was about table names not column names, we have not found a solution yet

ablack3 commented 9 months ago

Yea I think in Darwin we are requiring column names to be lowercase. You can make the database case insensitive. But if you the database requires column names to be uppercase I think the code will fail. The only other alternative is to selectively cast all the column names to either upper or lowercase for each database which could possibly be done but would take some work to implement and add complexity to the code I think.

Is there any chance you can use lowercase column names in your CDM or make the database case insensitive?

yichienlee2 commented 9 months ago

Thanks for all the help - I will talk to our database team to see if they can modify the column names to lower case.

ablack3 commented 9 months ago

Thanks @yichienlee2! let us know how it goes. I know it is a pain but if we set all the CDMs up the same way it will make execution much easier. Otherwise the software has to work around lots of different potential cases.

yichienlee2 commented 9 months ago

I talked to our colleague and we are trying to make database case-insensitive as the first attempt. I used the following code:

codes <- dbSendQuery(db, "ALTER SESSION SET QUOTED_IDENTIFIERS_IGNORE_CASE =TRUE;") dbFetch(codes)

It seems working because I am getting a different error now! Pasted below in case it's obvious for you - but will keep debugging from my end. Thanks.

source("E:/doc/internal/OMOP/Megastudy/Feasibility/study_package/MegaStudy/Feasibility Code/RunFeasibility.R") Error in purrr::map_dbl(): ℹ In index: 1. Caused by error: ! Result must be length 1, not 0. Run rlang::last_trace() to see where the error occurred. rlang::last_trace() <error/purrr_error_indexed> Error in purrr::map_dbl(): ℹ In index: 1. Caused by error: ! Result must be length 1, not 0.

Backtrace: ▆

  1. ├─base::source("E:/doc/internal/OMOP/Megastudy/Feasibility/study_package/MegaStudy/Feasibility Code/RunFeasibility.R")
  2. │ ├─base::withVisible(eval(ei, envir))
  3. │ └─base::eval(ei, envir)
  4. │ └─base::eval(ei, envir)
  5. ├─utils::write.csv(...) at MegaStudy/Feasibility Code/RunFeasibility.R:27:1
  6. │ └─base::eval.parent(Call)
  7. │ └─base::eval(expr, p)
  8. │ └─base::eval(expr, p)
  9. ├─utils::write.table(...)
  10. │ └─base::is.data.frame(x)
  11. ├─CDMConnector::snapshot(cdm)
  12. │ └─CDMConnector::assert_tables(cdm, tables = c("person", "observation_period"))
  13. │ ├─... %>% rlang::set_names(existingTables)
  14. │ └─purrr::map_dbl(...)
  15. │ └─purrr:::map_("double", .x, .f, ..., .progress = .progress)
  16. │ ├─purrr:::with_indexed_errors(...)
  17. │ │ └─base::withCallingHandlers(...)
  18. │ └─purrr:::call_with_cleanup(...)
  19. └─rlang::set_names(., existingTables) Run rlang::last_trace(drop = FALSE) to see 4 hidden frames.
yichienlee2 commented 9 months ago

This issue seems related to upper case of schema name. If I change cdmschema to lower case, it gives me a different error. I am checking with our database team to see if there are other command to take care of this issue.

cdmSchema <- "iqvia_omop_ehr_france_202311" # schema where cdm tables are located

Run the study

source("E:/doc/internal/OMOP/Megastudy/Feasibility/study_package/MegaStudy/Feasibility Code/RunFeasibility.R") Error in cdm_from_con() at CDMConnector/R/cdm.R:224:2: ! There were no cdm tables found in the cdm_schema! Run rlang::last_trace() to see where the error occurred.

tiozab commented 9 months ago

@ablack3 any comments? to the two mentioned "new" errors above

tiozab commented 8 months ago

updated_r_code_files.zip from @carina-dk with the following information: Using CDMConnector 0.5.1 [most likely] resolved the lower/upper case issue.

Two changes were required for the setup of our Snowflake database: • the date format needed to be changed to match that used by the Snowflake database • the code needed to be adapted to account for our setup where the scratch schema with writing rights is outside the CDM schema.

Hopefully this will help you @yichienlee2

yichienlee2 commented 8 months ago

Thank you and @Katrin-MLK for sharing - I tried to switched to CDMConnector 0.5.1 but encountered a new issue. We do not have the DEATH table in our database. Is this a critical violation to run this feasibility?

Below is the full error message:

Error in purrr::map() at rlang/R/attr.R:194:2: ℹ In index: 9. Caused by error in db_query_fields.DBIConnection() at rlang/R/eval.R:96:2: ! Can't query fields. Caused by error: ! nanodbc/nanodbc.cpp:1526: 42S02: SQL compilation error: Object 'RWD_PROD.IQVIA_OMOP_EHR_FRANCE_202311.DEATH' does not exist or not authorized. Run rlang::last_trace() to see where the error occurred.

tiozab commented 8 months ago

@yichienlee2 you do not need the death table to run the feasibility.

yichienlee2 commented 8 months ago

thank you! I made a few tweaks and was able to workaround this error message. I can now run the package smoothly. Will work with our colleagues to upload the results this week.