MSKCC-Epi-Bio / gnomeR

Package to wrangle and visualize genomic data in R
https://mskcc-epi-bio.github.io/gnomeR/

allow custom alias tables to be provided #268

Closed hfuchs5 closed 1 year ago

hfuchs5 commented 1 year ago

What changes are proposed in this pull request? I needed to provide my own list of aliases, so I slightly changed the structure of resolve_alias().

If there is a GitHub issue associated with this pull request, please provide a link. Details in PR https://github.com/MSKCC-Epi-Bio/gnomeR/pull/221

225


Reviewer Checklist (if item does not apply, mark it as complete)

When the branch is ready to be merged into master:

karissawhiting commented 1 year ago

@hfuchs5 I think this part:

  # make tibbles into data.frames
  if ("tbl" %in% class(alias_table)) {
    alias_table <- as.data.frame(alias_table)
  }

  alias_table <- switch(
    class(alias_table),
    "character" = {
      choices_arg <- c("impact", "IMPACT")
      # normalize case so that "IMPACT" also resolves to the built-in table
      lc <- tolower(match.arg(alias_table, choices = choices_arg))
      switch(lc, "impact" = gnomeR::impact_alias_table)
    },

    "data.frame" = {
      .check_required_cols(alias_table, "hugo_symbol", "alias")
      alias_table
    })

  # make sure there is one gene per row
  if (is.character(alias_table$alias)) {
    if (any(stringr::str_detect(alias_table$alias, ","))) {
      cli::cli_abort(
        c(
          "Error with {.code alias_table}. Are there multiple genes per row? You must provide a data frame with one gene-alias pair per row.",
          "i" = "See {.code gnomeR::impact_alias_table} for an example of how to format data."
        )
      )
    }
  } else {
    cli::cli_abort("Error with {.code alias_table}. Did you provide a data frame that has columns {.code hugo_symbol} and {.code alias}?")
  }

  # select only needed cols
  alias_table <- alias_table %>%
    dplyr::select("hugo_symbol", "alias")

This could be separated into its own argument-checking function if we want to simplify (but it's not a priority).
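
As a rough illustration of that separation, the checks could be pulled into one helper along these lines. This is only a sketch, not gnomeR code: the function name `check_alias_table()`, the `builtin_tables` argument, and the error wording are all hypothetical, and base R (`stop()`, `grepl()`) stands in for the `cli` and `stringr` calls so the example runs without extra packages.

```r
# Hypothetical helper: the name, the `builtin_tables` argument, and the messages
# are assumptions made for illustration, not part of gnomeR.
check_alias_table <- function(alias_table, builtin_tables = list()) {
  if (is.data.frame(alias_table)) {
    # Tibbles and other data.frame subclasses become plain data frames.
    alias_table <- as.data.frame(alias_table)
  } else if (is.character(alias_table) && length(alias_table) == 1) {
    # A single string names a built-in table, matched case-insensitively.
    key <- tolower(alias_table)
    if (!key %in% names(builtin_tables)) {
      stop("Unknown alias table name: ", alias_table, call. = FALSE)
    }
    alias_table <- builtin_tables[[key]]
  } else {
    stop("`alias_table` must be a table name or a data frame.", call. = FALSE)
  }

  # Both required columns must be present.
  missing_cols <- setdiff(c("hugo_symbol", "alias"), names(alias_table))
  if (length(missing_cols) > 0) {
    stop("`alias_table` is missing column(s): ",
         paste(missing_cols, collapse = ", "), call. = FALSE)
  }

  # Each row must hold exactly one gene-alias pair.
  if (!is.character(alias_table$alias)) {
    stop("`alias_table$alias` must be a character column.", call. = FALSE)
  }
  if (any(grepl(",", alias_table$alias, fixed = TRUE))) {
    stop("`alias_table` must have one gene-alias pair per row.", call. = FALSE)
  }

  # Keep only the columns that are needed downstream.
  alias_table[, c("hugo_symbol", "alias"), drop = FALSE]
}

# Illustrative custom table with an extra column that gets dropped.
custom <- data.frame(
  hugo_symbol = c("KMT2D", "NSD3"),
  alias = c("MLL2", "WHSC1L1"),
  note = c("a", "b")
)
checked <- check_alias_table(custom)
```

With something like this in place, the calling function would only need one line to validate its `alias_table` argument, and the helper could be unit-tested on its own.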

Also, I think we could add a few tests for this within the create_gene_binary() context. In that context you can use "no" as well; in recode_alias() you can't use "no".