OHDSI / CohortGenerator

An R package for instantiating cohorts using data in the CDM.
https://ohdsi.github.io/CohortGenerator/

Verbose Toggle for ParallelLogger in CohortGenerator #97

Closed mdlavallee92 closed 1 month ago

mdlavallee92 commented 1 year ago

First off, love CohortGenerator, it is super handy in studies! Nice work @anthonysena

I am trying to use CohortGenerator in R Markdown chunks to run interactive OHDSI pipelines. However, ParallelLogger conflicts with the globalCallingHandlers of the markdown chunk. This is a known problem for any logger package in markdown and is not a bug in either package. To avoid this error when running markdown chunks, I currently wrap the CohortGenerator functions in purrr::quietly, which captures the output and messages instead of emitting them.
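A minimal sketch of that workaround, for readers who have not used purrr::quietly. The function `noisyStep` is a hypothetical stand-in for a CohortGenerator call (it uses `message()` rather than ParallelLogger, so it runs without a database); only `purrr::quietly` itself is a real API here.

```r
# Hypothetical stand-in for a CohortGenerator function: it emits a message
# and returns a value. This is an assumption made for illustration only.
noisyStep <- function(x) {
  message("Creating cohort tables")
  x * 2
}

# purrr::quietly() wraps a function so that printed output, messages, and
# warnings are captured and returned alongside the result.
quietStep <- purrr::quietly(noisyStep)
res <- quietStep(21)

res$result    # value returned by the wrapped function
res$messages  # character vector holding the captured messages
```

The wrapped call returns a list with `result`, `output`, `warnings`, and `messages`, so the log text is still available for inspection after the chunk has run.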

Is it within the development roadmap to add a toggle to the CohortGenerator functions, such as verbose = TRUE, that turns the ParallelLogger output on or off within the function?

For example:

# Example of CohortGenerator with verbose toggle
createCohortTables <- function (connectionDetails = NULL, connection = NULL, cohortDatabaseSchema, 
                                cohortTableNames = getCohortTableNames(), incremental = FALSE, verbose = TRUE) 
{
  if (is.null(connection) && is.null(connectionDetails)) {
    stop("You must provide either a database connection or the connection details.")
  }
  start <- Sys.time()
  if (is.null(connection)) {
    connection <- DatabaseConnector::connect(connectionDetails)
    on.exit(DatabaseConnector::disconnect(connection), add = TRUE)
  }
  createTableFlagList <- lapply(cohortTableNames, FUN = function(x) {
    x <- TRUE
  })
  if (incremental) {
    tables <- DatabaseConnector::getTableNames(connection, 
                                               cohortDatabaseSchema)
    # seq_along() is safe for empty lists; [[i]] extracts the table name itself
    for (i in seq_along(cohortTableNames)) {
      if (toupper(cohortTableNames[[i]]) %in% toupper(tables)) {
        createTableFlagList[i] <- FALSE
        if (verbose) {
          ParallelLogger::logInfo("Table \"", cohortTableNames[[i]],
                                  "\" already exists and in incremental mode, so not recreating it.")
        }

      }
    }
  }
  if (any(unlist(createTableFlagList, use.names = FALSE))) {
    if (verbose) {
      ParallelLogger::logInfo("Creating cohort tables")
    }
    sql <- SqlRender::readSql(system.file("sql/sql_server/CreateCohortTables.sql", 
                                          package = "CohortGenerator", mustWork = TRUE))
    sql <- SqlRender::render(sql = sql, cohort_database_schema = cohortDatabaseSchema, 
                             create_cohort_table = createTableFlagList$cohortTable, 
                             create_cohort_inclusion_table = createTableFlagList$cohortInclusionTable, 
                             create_cohort_inclusion_result_table = createTableFlagList$cohortInclusionResultTable, 
                             create_cohort_inclusion_stats_table = createTableFlagList$cohortInclusionStatsTable, 
                             create_cohort_summary_stats_table = createTableFlagList$cohortSummaryStatsTable, 
                             create_cohort_censor_stats_table = createTableFlagList$cohortCensorStatsTable, 
                             cohort_table = cohortTableNames$cohortTable, cohort_inclusion_table = cohortTableNames$cohortInclusionTable, 
                             cohort_inclusion_result_table = cohortTableNames$cohortInclusionResultTable, 
                             cohort_inclusion_stats_table = cohortTableNames$cohortInclusionStatsTable, 
                             cohort_summary_stats_table = cohortTableNames$cohortSummaryStatsTable, 
                             cohort_censor_stats_table = cohortTableNames$cohortCensorStatsTable, 
                             warnOnMissingParameters = TRUE)
    sql <- SqlRender::translate(sql = sql, targetDialect = connection@dbms)
    DatabaseConnector::executeSql(connection, sql, progressBar = FALSE, 
                                  reportOverallTime = FALSE)
    logCreateTableMessage <- function(schema, tableName) {
      ParallelLogger::logInfo("- Created table ", 
                              schema, ".", tableName)
    }
    if (verbose) {
      for (i in seq_along(createTableFlagList)) {
        if (createTableFlagList[[i]]) {
          logCreateTableMessage(schema = cohortDatabaseSchema,
                                tableName = cohortTableNames[[i]])
        }
      }
    }
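    delta <- Sys.time() - start
    if (verbose) {
      ParallelLogger::logInfo("Creating cohort tables took ",
                              round(delta, 2), " ", attr(delta, "units"))
    }
  }
  # Return the elapsed time even when no tables needed to be created;
  # `delta` above only exists inside the branch that creates tables.
  invisible(Sys.time() - start)
}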
anthonysena commented 1 year ago

Thanks @mdlavallee92 - appreciate the feedback! I've faced the same issues regarding ParallelLogger and RMarkdown.

I like the idea of having more control over the messaging as you have proposed. I've also considered removing the use of ParallelLogger for messaging and just using the native message functions. I'll put this into ideas for the next CG release.
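As a rough illustration of the native-message idea (not the actual changes made in CohortGenerator), a function can route its status text through `message()` and let callers silence it either with a `verbose` argument or with base R's `suppressMessages()`. The names `logStep` and `createTablesSketch` are hypothetical.

```r
# Hypothetical helper, not part of CohortGenerator: emit a message only
# when verbose is TRUE.
logStep <- function(..., verbose = TRUE) {
  if (verbose) {
    message(...)
  }
  invisible(NULL)
}

# Hypothetical stand-in for a package function that reports its progress.
createTablesSketch <- function(tableNames, verbose = TRUE) {
  for (tableName in tableNames) {
    logStep("- Created table ", tableName, verbose = verbose)
  }
  invisible(length(tableNames))
}

# Messages are emitted by default.
createTablesSketch(c("cohort", "cohort_inclusion"))

# Silenced through the toggle.
createTablesSketch(c("cohort", "cohort_inclusion"), verbose = FALSE)

# Silenced by the caller, without any toggle in the function.
suppressMessages(createTablesSketch("cohort"))
```

Because `message()` signals an ordinary R condition, callers keep control over the output even when a function offers no toggle of its own.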

mdlavallee92 commented 1 year ago

Cool! Let me know how I can help!

anthonysena commented 1 month ago

@mdlavallee92 - I've made some changes on the develop branch that should address this and will be part of the v0.10 release. I'll close this for now but if you face any problems let me know and I'll re-open. Thanks!