morinlab / GAMBLR

Set of standardized functions to operate with genomic data
https://morinlab.github.io/GAMBLR/
MIT License

`collate_results` has unexpected behaviour on join_with_full_metadata #112

Closed lkhilton closed 1 year ago

lkhilton commented 2 years ago

The documentation for this function implies that the `join_with_full_metadata` argument can be toggled to join the results with all metadata columns. However, the function always takes those columns from `get_gambl_metadata()` rather than from the user-supplied metadata table, so setting `join_with_full_metadata = TRUE` causes the user-supplied metadata table to be ignored.

if (join_with_full_metadata) {
  # INSERT ANOTHER IF STATEMENT HERE TO USE THE USER_PROVIDED METADATA TABLE IF IT EXISTS
  full_meta = get_gambl_metadata(seq_type_filter = seq_type_filter)
  full_table = left_join(full_meta, sample_table)
  full_table = full_table %>%
    mutate(MYC_SV_any = case_when(
      ashm_MYC > 3 ~ "POS",
      manta_MYC_sv == "POS" ~ "POS",
      ICGC_MYC_sv == "POS" ~ "POS",
      myc_ba == "POS" ~ "POS",
      TRUE ~ "NEG"
    ))
  full_table = full_table %>%
    mutate(BCL2_SV_any = case_when(
      ashm_BCL2 > 3 ~ "POS",
      manta_BCL2_sv == "POS" ~ "POS",
      ICGC_BCL2_sv == "POS" ~ "POS",
      bcl2_ba == "POS" ~ "POS",
      TRUE ~ "NEG"
    ))
  full_table = full_table %>%
    mutate(DoubleHitBCL2 = ifelse(BCL2_SV_any == "POS" & MYC_SV_any == "POS", "Yes", "No"))
  return(full_table)
}
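
One possible shape for the missing check, as a minimal sketch rather than the actual fix: prefer the user-supplied table when it exists and fall back to the default metadata otherwise. The helper `choose_full_metadata()`, its arguments `user_metadata` and `fetch_default`, the toy data, and the `sample_id` join key are all hypothetical and are not taken from GAMBLR. Base R `merge()` stands in for `left_join()` so the example runs without dplyr.

```r
# Hypothetical helper (not part of GAMBLR): pick the metadata table to join against.
# `user_metadata` stands in for the user-supplied metadata table (NULL if absent).
# `fetch_default` stands in for a call such as get_gambl_metadata().
choose_full_metadata <- function(user_metadata = NULL, fetch_default) {
  if (!is.null(user_metadata)) {
    return(user_metadata)
  }
  fetch_default()
}

# Toy inputs; the column names and the `sample_id` key are assumptions.
sample_table <- data.frame(
  sample_id = c("S1", "S2", "S3"),
  ashm_MYC = c(0, 5, 1)
)
user_meta <- data.frame(
  sample_id = c("S1", "S2"),
  cohort = c("A", "B")
)
default_meta <- function() {
  data.frame(
    sample_id = c("S1", "S2", "S3"),
    cohort = c("X", "Y", "Z")
  )
}

# The user-supplied table takes priority over the default one.
full_meta <- choose_full_metadata(user_meta, default_meta)

# Left join on sample_id: keep every metadata row, attach matching results.
full_table <- merge(full_meta, sample_table, by = "sample_id", all.x = TRUE)

# With no user table, the default metadata is used instead.
fallback_meta <- choose_full_metadata(NULL, default_meta)
```

With this arrangement, `full_table` contains only the samples present in the user-supplied table, which is the behaviour the documentation seems to promise.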