epiverse-trace / cleanepi

R package to clean and standardize epidemiological data
https://epiverse-trace.github.io/cleanepi/
Other

clean_col_names() factor comparison errors #11

Closed bahadzie closed 1 year ago

bahadzie commented 1 year ago

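Running clean_col_names() in R 3.6.3 fails with an error because the original and new column names are compared as factors. Before R 4.0.0, data.frame() converts strings to factors by default, and comparing two factors with different level sets raises "level sets of factors are different". The error is resolved if the column names are compared as characters instead of factors, as in the version below.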

clean_col_names <- function(x, report = list()) {
  original_names <- col_names <- colnames(x)
  # standardize the column names
  col_names <- snakecase::to_snake_case(col_names)
  cleaned_names <- epitrix::clean_labels(col_names)

  # make column name unique
  unique_names <- make.unique(cleaned_names, sep = "_")

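  # detect modified column names from the previous command;
  # both columns are converted to character because data.frame() returns
  # factors by default before R 4.0.0, and `!=` fails on factors whose
  # level sets differ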
  xx <- data.frame(cbind(
    original_name = original_names,
    new_name = unique_names
  )) %>%
    dplyr::mutate(
      original_name = as.character(original_name),
      new_name = as.character(new_name)
    )
  idx <- which(xx$original_name != xx$new_name)
  if (length(idx) > 0) {
    report[["modified_column_names"]] <- xx[idx, ]
  }

  list(
    data = x,
    report = report
  )
}
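
For anyone who wants to see the failure in isolation, here is a minimal sketch using only base R. The column names are hypothetical and not taken from the package's test data. `stringsAsFactors = TRUE` is set explicitly so that the error also reproduces on R 4.0.0 and later, where the default changed to `FALSE`. The second half shows `stringsAsFactors = FALSE` as a possible alternative to the `as.character()` calls above; it is only an illustration, not necessarily the fix adopted in the package.

```r
# Hypothetical column names, for illustration only.
original_names <- c("Patient ID", "date_onset")
new_names <- c("patient_id", "date_onset")

# stringsAsFactors = TRUE mimics the data.frame() default before R 4.0.0.
as_factors <- data.frame(
  original_name = original_names,
  new_name = new_names,
  stringsAsFactors = TRUE
)

# The two factors have different level sets, so `!=` raises an error.
# tryCatch() captures the error message instead of stopping the script.
factor_result <- tryCatch(
  as_factors$original_name != as_factors$new_name,
  error = function(e) conditionMessage(e)
)

# stringsAsFactors = FALSE keeps both columns as character vectors,
# so the comparison behaves the same on every R version.
as_chars <- data.frame(
  original_name = original_names,
  new_name = new_names,
  stringsAsFactors = FALSE
)
idx <- which(as_chars$original_name != as_chars$new_name)
modified <- as_chars[idx, ]
```

Here `factor_result` holds the captured error message rather than a logical vector, while `modified` contains only the row whose name changed.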

Karim-Mane commented 1 year ago

Thank you @bahadzie for raising this.

I have accounted for this.