morinlab / GAMBLR

Set of standardized functions to operate with genomic data
https://morinlab.github.io/GAMBLR/
MIT License

collate_results fix #133

Closed: mattssca closed this pull request 1 year ago

mattssca commented 1 year ago

Pull Request Checklists

Important: When opening a pull request, keep only the applicable checklist and delete all other sections.

Checklist for all PRs

Required

This can be checked and addressed by running check_functions.pl and responding to the prompts. Test your code after you do this.

Optional but preferred with PRs

Checklist for New Functions

Required

Example:

#' Use GISTIC2.0 scores output to reproduce maftools::chromoplot with more flexibility
#'
#' @param scores output file scores.gistic from the run of GISTIC2.0
#' @param genes_to_label optional. Provide a data frame of genes to label (if mutated). The first 3 columns must contain chromosome, start, and end coordinates. Another required column must contain gene names and be named `gene`. (truncated for example)
#' @param cutoff optional. Used to determine which regions to color as aberrant. Must be float in the range [0-1]. (truncated for example)

Example:

#' @return nothing
#' @export
#' @import tidyverse ggrepel

Checklist for changes to existing code

rdmorin commented 1 year ago

Please don't submit a PR until the problem is fully fixed. If you want feedback on a partial fix, use a draft PR instead. I would also like to see working examples in the PR that test both modes of from_cache.

mattssca commented 1 year ago

I have now marked this PR as a draft. I will update it and provide the information you requested once it's ready for review. Thanks!

mattssca commented 1 year ago

This function has now been updated with appropriate file paths for reading and exporting cached results. It has also received a facelift: expanded inline comments, fuller parameter descriptions, and extended examples.

When testing this function, I was unable to write to file. The call failed with: Error: Cannot open file for writing

However, I successfully tested the function with from_cache = FALSE and write_to_file = FALSE for both genome and capture, as follows:

genome_collated = collate_results(seq_type_filter = "genome", from_cache = FALSE, write_to_file = FALSE)
capture_collated = collate_results(seq_type_filter = "capture", from_cache = FALSE, write_to_file = FALSE)
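Regarding the write error mentioned above, a pre-flight permission check might help narrow down the cause before calling the function with write_to_file = TRUE. The following is a minimal sketch using base R only. The helper name can_write_to and the output path are hypothetical and not part of GAMBLR, and file.access can be unreliable on some file systems (notably on Windows), so treat the result as a hint rather than a guarantee.

```r
# Hypothetical helper (not part of GAMBLR): report whether a file could
# plausibly be written at `path` before attempting an export.
can_write_to = function(path) {
  if (file.exists(path)) {
    # Existing file: check write permission on the file itself.
    return(unname(file.access(path, mode = 2)) == 0)
  }
  # New file: the parent directory must exist and be writable.
  target_dir = dirname(path)
  dir.exists(target_dir) && unname(file.access(target_dir, mode = 2)) == 0
}

# Illustrative path in the session's temporary directory (an assumption,
# not the location GAMBLR uses for cached results).
out_file = file.path(tempdir(), "collated_results.tsv")

if (can_write_to(out_file)) {
  # Toy table standing in for real collated results.
  write.table(data.frame(sample_id = "example"), out_file,
              sep = "\t", quote = FALSE, row.names = FALSE)
} else {
  message("Skipping export: no write permission for ", out_file)
}
```

If the check fails for the real cache location, the error is more likely a permissions issue on the shared directory than a problem in the function itself.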

I also tested using these_samples_metadata and join_with_full_metadata:

fl_metadata = get_gambl_metadata(seq_type_filter = "genome") %>% dplyr::filter(pathology == "FL")
fl_collated = collate_results(seq_type_filter = "genome", join_with_full_metadata = TRUE, these_samples_metadata = fl_metadata, write_to_file = FALSE, from_cache = FALSE)

Lastly, I tested the function using the sample_table parameter:

fl_samples = get_gambl_metadata(seq_type_filter = "genome") %>% dplyr::filter(pathology == "FL") %>% dplyr::select(sample_id, patient_id, biopsy_id)
fl_collated = collate_results(sample_table = fl_samples, seq_type_filter = "genome", from_cache = FALSE, write_to_file = FALSE)
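The examples above cover only from_cache = FALSE. For the other mode, one way to check the round trip would be to write a freshly generated result and confirm that reading it back from the cache returns the same table. The sketch below illustrates the idea with a hypothetical stand-in function, collate_stub, and toy data. It is not GAMBLR's implementation, and the cache file location and RDS format are assumptions made only for the example.

```r
# Hypothetical stand-in for a cached collation step; NOT GAMBLR's code.
collate_stub = function(from_cache = TRUE, write_to_file = FALSE,
                        cache_file = file.path(tempdir(), "collated_stub.rds")) {
  if (from_cache) {
    # Cached mode: return the stored result, or fail clearly if absent.
    if (!file.exists(cache_file)) {
      stop("No cached result found at ", cache_file)
    }
    return(readRDS(cache_file))
  }
  # Fresh mode: toy table standing in for newly collated results.
  result = data.frame(sample_id = c("S1", "S2"), n_variants = c(10L, 20L))
  if (write_to_file) {
    saveRDS(result, cache_file)
  }
  result
}

# Exercise both modes: regenerate and write, then read back from the cache.
fresh = collate_stub(from_cache = FALSE, write_to_file = TRUE)
cached = collate_stub(from_cache = TRUE)

# The cached copy should match the freshly generated one.
stopifnot(identical(fresh, cached))
```

The same pattern (generate with from_cache = FALSE and write_to_file = TRUE, then compare against from_cache = TRUE) could be applied to collate_results once writing to the cache location works.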