morinlab / GAMBLR

Set of standardized functions to operate with genomic data
https://morinlab.github.io/GAMBLR/
MIT License

Bug in ashm_rainbow_plot #140

Closed mattssca closed 1 year ago

mattssca commented 1 year ago

While writing new examples for the SSM vignette, I discovered that the classification_column parameter of this function does not work as advertised. More specifically, this line in the function does not do what it is supposed to do:

meta_arranged$classification = factor(meta_arranged[,classification_column], levels = unique(meta_arranged[,classification_column]))

I assume the goal here is to create a new column called classification in the meta_arranged data frame, which is later mapped to the colour aesthetic in the ggplot call. Instead, this code creates a new column filled with NAs (perhaps because it looks for a factor with the same name as the value of classification_column?). The result is the following plot, with no colouring based on the selected classification column:

[Image: fl_dlbcl_rainbowplot]
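One possible explanation, offered as an assumption and not confirmed against the function source: if meta_arranged is a tibble, then meta_arranged[, classification_column] returns a one-column tibble, not a vector, and factor() may not handle that as intended. A minimal sketch of a more robust form uses `[[`, which returns the column as a plain vector for both data frames and tibbles. The toy meta_arranged below is a stand-in invented for illustration, not real GAMBL metadata.

```r
# Stand-in for meta_arranged (hypothetical values, for illustration only).
meta_arranged <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4"),
  pathology = c("FL", "DLBCL", "FL", "DLBCL"),
  stringsAsFactors = FALSE
)
classification_column <- "pathology"

# `[[` extracts the column as a plain vector, whether meta_arranged is a
# base data frame or a tibble.
class_values <- meta_arranged[[classification_column]]

# Build the factor from the vector, keeping levels in order of appearance.
meta_arranged$classification <- factor(class_values,
                                       levels = unique(class_values))

print(levels(meta_arranged$classification))
print(anyNA(meta_arranged$classification))
```

If this assumption holds, swapping the two `[, classification_column]` lookups for `[[classification_column]]` in the original line would be enough.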

The example plot here was generated with the following code:

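library(GAMBLR)
library(dplyr)  # provides %>% and filter()

# Region of interest to annotate on the plot
mybed = data.frame(start = 128747680,
                   end = 128753674,
                   name = "MYC")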

region = "chr8:128737680-128763674"

fl_dlbcl_metadata = get_gambl_metadata() %>%
  dplyr::filter(pathology %in% c("FL", "DLBCL"))

my_mutations = get_ssm_by_region(region = region)

ashm_rainbow_plot(mutations_maf = my_mutations,
                  drop_unmutated = TRUE,
                  metadata = fl_dlbcl_metadata,
                  hide_ids = FALSE,
                  bed = mybed,
                  region = region,
                  classification_column = "pathology",
                  custom_colours = get_gambl_colours("pathology"))

In addition, it would be nice if the function automatically subset the retrieved colour palette to the levels actually present in classification_column (here FL and DLBCL for pathology).
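A rough sketch of what that subsetting could look like, assuming the palette is a named character vector keyed by classification value. The helper subset_colours() and the hex codes below are hypothetical placeholders, not part of GAMBLR and not its real colours.

```r
# Hypothetical palette: placeholder hex codes, not GAMBLR's actual colours.
custom_colours <- c(FL = "#AAAAAA", DLBCL = "#BBBBBB",
                    BL = "#CCCCCC", MCL = "#DDDDDD")

# Values found in the classification column (toy data).
classification_values <- c("FL", "DLBCL", "FL", "DLBCL")

# Hypothetical helper: keep only the colours whose names occur in `values`.
subset_colours <- function(palette, values) {
  present <- unique(as.character(values))
  # Warn when a value has no matching colour in the palette.
  missing_cols <- setdiff(present, names(palette))
  if (length(missing_cols) > 0) {
    warning("No colour defined for: ", paste(missing_cols, collapse = ", "))
  }
  # intersect() keeps the order in which the values first appear.
  palette[intersect(present, names(palette))]
}

plot_colours <- subset_colours(custom_colours, classification_values)
print(names(plot_colours))
```

The subsetted vector could then be passed to ggplot2::scale_colour_manual(values = plot_colours), so the legend lists only the groups that appear in the plot.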