morinlab / GAMBLR

Set of standardized functions to operate with genomic data
https://morinlab.github.io/GAMBLR/
MIT License

plot updates, vignette & bug fixes, etc. #85

Closed mattssca closed 2 years ago

mattssca commented 2 years ago

Changes in this PR include the following updates:

  1. fancy_cnlohbar - Updated colours to match the GAMBL palette for CN states; added an optional parameter to include CN state = 2; added a second y-axis that annotates the number of nucleotides affected by each CN state. Both y-axes are now on a log10 scale (since CN = 2 is now included). The number of nucleotides in each CN state is plotted as points (geom_point) over the corresponding CN count bar.

  2. fancy_sv_chrdistplot - Now uses the newly added palette for DEL and INS.

  3. fancy_snv_chrdistplot - Name and description of this plot have now been adjusted for SNVs (previously stated SNPs).

  4. fancy_svbar - Now uses the newly added palette for DEL and INS.

  5. fancy_vplot - Now uses the newly added palette for DEL and INS.

  6. fancy_ideogram - Added in this PR. Colours have also been updated to match the defined palette for CN states. Previously, the plot annotated CN states up to 10+; since the defined colour palette only goes up to CN state = 6, all CN states >= 6 are now annotated the same.

  7. New palette added to get_gambl_colours, all_colours[["indels"]] = c("DEL" = "#53B1FC", "INS" = "#FC9C6D").

  8. Updated get_coding_ssm to accept a these_samples_metadata parameter.

  9. Fixed a bug in get_ssm_by_region that caused an error in ashm_rainbow_plot in the vignette. streamlined = TRUE was wrongly hard-coded inside the function, so it always returned only two columns. This has been removed, and streamlined and basic_columns now both default to FALSE.

  10. The lollipopPlot example in the vignette was updated to use get_coding_ssm (instead of get_ssm_by_gene) and works as intended again.

  11. Regenerated package documentation.

  12. A lot of changes related to removing trailing white space in all scripts (no new code in portal.R, preprocessing_io.R and web.R).
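As a rough illustration of item 7, the new palette can be used as a named lookup vector. This is a minimal sketch in base R: only the `indels` entry and its two hex codes come from this PR, while the stand-alone `all_colours` list, the toy `variants` data frame and its `type` column are assumptions made for the example.

```r
# Hypothetical stand-in for the list built inside get_gambl_colours();
# only the "indels" entry is taken from this PR.
all_colours <- list()
all_colours[["indels"]] <- c("DEL" = "#53B1FC", "INS" = "#FC9C6D")

# Toy variant calls; the column name `type` is an assumption for illustration.
variants <- data.frame(type = c("DEL", "INS", "DEL"), stringsAsFactors = FALSE)

# Look up one colour per row by variant type (names dropped for a clean column).
variants$colour <- unname(all_colours[["indels"]][variants$type])
print(variants)
```

In a ggplot2-based plot, the same named vector could presumably be passed to `scale_fill_manual(values = ...)` so that DEL and INS are coloured consistently across plots.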

Outstanding items to be addressed in my next PR: vignette examples that still rely on the database for reading data. Examples of the newly incorporated plotting functions will also be added.
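The bug described in item 9 of the list above can be sketched with a toy function. This is not the GAMBLR implementation of get_ssm_by_region: `toy_get_ssm`, the toy MAF-like data frame, its values, and the choice of which two columns the streamlined output keeps are all assumptions for illustration. The point is only that an argument defaulting to FALSE must not be overridden inside the function body.

```r
# Hypothetical toy function, NOT the real get_ssm_by_region.
toy_get_ssm <- function(maf, streamlined = FALSE) {
  # Before the fix, the equivalent of `streamlined <- TRUE` sat here and
  # silently overrode whatever the caller passed in.
  if (streamlined) {
    # Assumed two-column output; column names chosen for illustration.
    return(maf[, c("Start_Position", "Tumor_Sample_Barcode"), drop = FALSE])
  }
  maf
}

# Made-up toy data, not real calls.
toy_maf <- data.frame(
  Hugo_Symbol = c("GENE_A", "GENE_B"),
  Chromosome = c("8", "18"),
  Start_Position = c(100L, 200L),
  Tumor_Sample_Barcode = c("S1", "S2"),
  stringsAsFactors = FALSE
)

full <- toy_get_ssm(toy_maf)                      # all columns by default
slim <- toy_get_ssm(toy_maf, streamlined = TRUE)  # two columns on request
```

With the default in place, downstream code that needs the full set of columns receives them unless the caller explicitly asks for the streamlined output.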

Pull Request Checklists

Important: When opening a pull request, keep only the applicable checklist and delete all other sections.

Checklist for all PRs

Required

This can be checked and addressed by running check_functions.pl and responding to the prompts. Test your code after you do this.

Optional but preferred with PRs

Checklist for New Functions

Required

Example:

#' Use GISTIC2.0 scores output to reproduce maftools::chromoplot with more flexibility
#'
#' @param scores output file scores.gistic from the run of GISTIC2.0
#' @param genes_to_label optional. Provide a data frame of genes to label (if mutated). The first 3 columns must contain chromosome, start, and end coordinates. Another required column must contain gene names and be named `gene`. (truncated for example)
#' @param cutoff optional. Used to determine which regions to color as aberrant. Must be float in the range [0-1]. (truncated for example)

Example:

#' @return nothing
#' @export
#' @import tidyverse ggrepel

Checklist for changes to existing code

mattssca commented 2 years ago

In this commit the following updates have been made:

mattssca commented 2 years ago

Changes in this commit include fixes to issues/suggestions raised in the recent PR review.

  1. Use global colours - The following plotting functions have been updated to read their palettes from get_gambl_colours(): fancy_sv_chrdistplot, fancy_sv_bar, fancy_vplot, fancy_ideogram, fancy_multisamp_ideogram.

  2. Renamed the variable that holds the return value of get_sample_cn_segments in fancy_cnbar (the function does not return a MAF).

  3. A new parameter was added to fancy_cnbar, allowing the user to specify the cut-off value for the maximum CN state to be retrieved.

  4. CN0 is now retained in fancy_cnbar.

  5. Chromosome tables for both ideograms are updated, now using GAMBLR::chromosome_arms_grch37 to retrieve this information.

  6. fancy_ideogram now stores the plot inside a variable (p) allowing the user to combine multiple plots when arranging a multiplot figure.

  7. fancy_ideogram is now also using the newly added helper function (subset_cnstates).

  8. The above-mentioned helper function has also been simplified to a single unified call.

  9. The overwriting line (all_meta) in database.R has been removed.

  10. Package documentation has been regenerated to reflect added/updated parameters.
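To make the CN-state handling in items 3, 4 and 7 more concrete, here is a minimal base R sketch. It is not the subset_cnstates helper from this PR: `cap_cn_states`, its `max_cn` and `include_cn2` arguments, and the toy segment table are hypothetical, and the sketch assumes that the cut-off collapses every state at or above the maximum into a single top state (as described for the ideogram colours) rather than dropping those segments.

```r
# Hypothetical helper, NOT the subset_cnstates function added in this PR.
cap_cn_states <- function(segments, max_cn = 6, include_cn2 = TRUE) {
  # Collapse every CN state above the cut-off into the cut-off state.
  segments$CN <- pmin(segments$CN, max_cn)
  # Optionally drop the diploid state; CN = 0 is always retained.
  if (!include_cn2) {
    segments <- segments[segments$CN != 2, , drop = FALSE]
  }
  segments
}

# Made-up toy segments for illustration only.
segs <- data.frame(
  chrom = c("1", "1", "2", "3", "3"),
  CN = c(0, 2, 3, 8, 12),
  stringsAsFactors = FALSE
)

capped <- cap_cn_states(segs)                        # CN 8 and 12 become 6
no_cn2 <- cap_cn_states(segs, include_cn2 = FALSE)   # diploid row removed
```

Keeping the capping in one helper means the bar plot and both ideograms can share the same definition of the top CN state.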