morinlab / GAMBLR

Set of standardized functions to operate with genomic data
https://morinlab.github.io/GAMBLR/
MIT License

get_manta_sv, id_ease, website & documentation improvements + minor hot-fixes #217

Closed mattssca closed 1 year ago

mattssca commented 1 year ago

This PR includes the following updates:

  1. Where applicable, ORCIDs have been added to the DESCRIPTION, allowing GAMBLR website visitors to click on any author and browse that author's associated publications.

  2. A new parameter was added to cnvKompare, allowing the user to toggle the appearance of the x-axis labels. This is useful when you want to hide x-axis labels that overlap. (issue #209)

  3. Added a new parameter to pretty_lollipop_plot that lets the user dictate the name of the file for the exported plot. This seems to be the intended behavior for this function but was never fully implemented. (issue #209)

  4. A new safety check was added to pretty_lollipop_plot that returns a useful error message if no genes are provided.

  5. A verbose parameter was added to ashm_multi_rainbow_plot; when set to FALSE, it prevents the full region_bed file from being printed. (issue #209)

  6. Updated the internal calls to the now-relocated bundled data sets in ashm_multi_rainbow_plot.

  7. Updated the examples for fancy_circos_plot, as well as the variable used in the message returned to the user when calling this plotting function.

  8. Vignettes have been updated to produce tidier output.

  9. Logo on the main README has been updated to the correct size (based on best practices for R package logos).

  10. Package documentation has been regenerated.

  11. Added missing data sets for the pretty_lollipop_plot function. The absence of these two data sets seems to be the main reason this function failed for certain GAMBLRs. I recommend reinstalling the g3viz package to see if this resolves the problem; if not, the added bundled data should allow such users to call this function successfully.

  12. Extensive overhaul of get_manta_sv that allows for using cached results (and compiling the cached results). For more info, see the updated function documentation.

  13. Added a new helper function, id_ease, for dealing with sample IDs and/or metadata. See the function docs for more information.

  14. The commented-out code for returning normals (get_gambl_metadata) has been re-enabled, and the vignettes have been re-knitted to check whether the problem that originally called for commenting out this code persists. The vignettes knitted just fine (issue #190).

  15. This PR also includes a hot-fix for liftover_bedppe that converts coordinates to strings manually, to avoid scientific notation in the rare cases when R coerces them to strings.

  16. Lastly, this PR also resolves mode = "strelka2" support in get_ssm_by_region (issue #202). Note that since this output only has the standard BED columns, streamlined is forced to TRUE for this mode. Streamlined is a Boolean parameter that, if set to TRUE, returns only two columns in the MAF (Start_Position and Tumour_Sample_Barcode).
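To illustrate the scientific-notation issue mentioned in item 15, here is a minimal base R sketch. It is not the actual code in liftover_bedppe; the `coords` vector and its values are hypothetical, and `format()` with `scientific = FALSE` is just one possible way to convert coordinates to strings explicitly.

```r
# Hypothetical genomic coordinates (illustration only, not taken from GAMBLR).
coords <- c(100000000, 128806578)

# Implicit coercion: R may choose scientific notation for round values,
# which breaks downstream tools that expect plain integer coordinates.
implicit <- as.character(coords)

# Explicit conversion: force fixed notation and drop padding whitespace.
fixed <- format(coords, scientific = FALSE, trim = TRUE)

print(implicit)
print(fixed)
```

Doing the conversion explicitly keeps the coordinates in plain integer form regardless of how R would otherwise choose to print them.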

Pull Request Checklists

Important: When opening a pull request, keep only the applicable checklist and delete all other sections.

Checklist for all PRs

Required

This can be checked and addressed by running check_functions.pl and responding to the prompts. Test your code after you do this.

Optional but preferred with PRs

Checklist for New Functions

Required

Example:

#' @title ASHM Rainbow Plot
#'
#' @description Make a rainbow plot of all mutations in a region, ordered and coloured by metadata.
#'
#' @details This function creates a rainbow plot for all mutations in a region. The region can either be specified with the `region` parameter,
#' or the user can provide a maf that has already been subset to the region(s) of interest with `mutations_maf`.
#' As a third alternative, the regions can also be specified as a bed file with `bed`.
#' Lastly, this function has a variety of parameters that can be used to further customize the returned plot in many different ways.
#' Refer to the parameter descriptions, examples as well as the vignettes for more demonstrations of how this function can be called.
#'
#' @param mutations_maf A data frame containing mutations (MAF format) within a region of interest (e.g. the output of get_ssm_by_region).
#' @param metadata A data frame with sample_id as a column.
#' @param exclude_classifications Optional argument for excluding specific classifications from a metadata file.
#' @param drop_unmutated Boolean argument for removing unmutated sample ids in mutated cases.
#' @param classification_column The name of the metadata column to use for ordering and colouring samples.
#' @param bed Optional data frame specifying the regions to annotate (required columns: start, end, name).
#' @param region Genomic region for plotting in bed format.
#' @param custom_colours Provide a named vector (or named list of vectors) containing custom annotation colours if you do not want to use the standardized palette.
#' @param hide_ids Boolean argument, if TRUE, ids will be removed.
#'
#' @return ggplot2 object.
#'
#' @import dplyr ggplot2
#' @export
#'
#' @examples
#' #basic usage
#' region = "chr6:90975034-91066134"
#' metadata = get_gambl_metadata()
#' plot = ashm_rainbow_plot(metadata = metadata, region = region)
#'
#' #advanced usages
#' mybed = data.frame(start = c(128806578,
#'                              128805652,
#'                              128748315),
#'                    end = c(128806992,
#'                            128809822,
#'                            128748880),
#'                    name = c("TSS",
#'                             "enhancer",
#'                             "MYC-e1"))
#'
#' ashm_rainbow_plot(mutations_maf = my_mutations,
#'                   metadata = my_metadata,
#'                   bed = mybed)
#'

Example:

#' @return nothing
#' @export
#' @import tidyverse ggrepel

Checklist for changes to existing code

vladimirsouza commented 1 year ago

As requested by Adam, I checked only function get_manta_sv and it is working well. I'm going to approve this PR.