Open juferban opened 1 month ago
Hi @juferban, could you post some of the data that triggers the bug so that we can reproduce it easily?
Hi, thanks for your quick response. I will generate a couple of example files and upload them soon so you can use them for testing.
Thanks a lot.
Hi @biosunsci
I am attaching example files that reproduce my problem. The code I used for testing is as follows:
## Load MAF files
maf_object = read.maf(maf = "mutations_filtered.maf",
clinicalData = "sample_annot_for_maf.txt", isTCGA = FALSE)
## Make sure the continuous variables are treated as continuous
maf_object@clinical.data$Response = as.numeric(maf_object@clinical.data$Response)
maf_object@clinical.data$Volume_Change = as.numeric(maf_object@clinical.data$Volume_Change)
maf_object@clinical.data$Treatment_Duration = as.numeric(maf_object@clinical.data$Treatment_Duration)
# Sort the clinical data by multiple variables, as I want to make sure I use my predefined sample sorting
sorted_clinical_data <- maf_object@clinical.data[order(
maf_object@clinical.data$Gender,
maf_object@clinical.data$Treatment_Group,
# Handle NAs: NA values are set to 1000 so they appear first
dplyr::desc(ifelse(is.na(maf_object@clinical.data$Response ), 1000, as.numeric(maf_object@clinical.data$Response ))),
# Handle NAs: NA values are set to 1000 so they appear first
dplyr::desc(ifelse(is.na(maf_object@clinical.data$Volume_Change ), 1000, as.numeric(maf_object@clinical.data$Volume_Change ))),
# Handle NAs: NA values are set to 1000 so they sort last (this key is ascending; assumes real values stay below 1000)
ifelse(is.na(maf_object@clinical.data$Treatment_Duration), 1000, as.numeric(maf_object@clinical.data$Treatment_Duration))
), ]
maf_object@clinical.data <- sorted_clinical_data
# Extract the sorted sample names
sorted_samples <- sorted_clinical_data$Tumor_Sample_Barcode
## Create the oncoplot
oncoplot(maf = maf_object,
removeNonMutated = FALSE,
fill = TRUE,
clinicalFeatures = c('Gender','Treatment_Group','Response','Volume_Change','Treatment_Duration'),
sampleOrder = sorted_samples,
showTitle = TRUE,
titleFontSize = 1.5,
legendFontSize = 1,
annotationFontSize = 1,
SampleNamefontSize = 0.5,
fontSize = 0.7,
showTumorSampleBarcodes = TRUE,
barcode_mar = 3,
gene_mar = 5,
legend_height = 4,
anno_height = 1.5,
annoBorderCol = "white",
drawRowBar = TRUE,
genesToIgnore = 'KRAS',
numericAnnoCol = TRUE,
showPct = TRUE,
rightBarLims = c(0, 100),
leftBarLims = c(0, 100),
)
If I only use the clinical variables 'Gender', 'Treatment_Group' and 'Response' with Response being the only continuous variable, the coloring is correctly applied. As soon as I incorporate the other two continuous variables the coloring gets mixed up.
Thanks a lot,
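For readers who want to try the sorting step without the attached files, here is a minimal sketch on toy data. The data frame and its values are made up for illustration; only the column names come from the code above. It uses base R only and negates the key instead of calling dplyr::desc(), which sorts numeric values the same way.

```r
# Toy clinical table (hypothetical values, not the attached data)
clin <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2", "S3", "S4", "S5"),
  Gender               = c("F", "M", "F", "M", "F"),
  Response             = c(0.2, NA, 0.9, 0.5, NA),
  stringsAsFactors     = FALSE
)

# Replace NA with a large sentinel so NA samples sort first
# when the key is sorted in descending order
key <- ifelse(is.na(clin$Response), 1000, clin$Response)

# Sort by Gender (ascending), then Response (descending, via negation)
sorted_clin    <- clin[order(clin$Gender, -key), ]
sorted_samples <- sorted_clin$Tumor_Sample_Barcode

# Look the values up by sample name to confirm they follow the new order
response_in_order <- clin$Response[match(sorted_samples, clin$Tumor_Sample_Barcode)]
print(data.frame(sample = sorted_samples, Response = response_in_order))
```

Within each Gender group the NA sample comes first, followed by the remaining samples in decreasing Response.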
This is my session info:
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.4 LTS
Matrix products: default
BLAS: /mnt/disks/monoceros_nfs/software/R-4.3.2/lib/R/lib/libRblas.so
LAPACK: /mnt/disks/monoceros_nfs/software/R-4.3.2/lib/R/lib/libRlapack.so; LAPACK version 3.11.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] pROC_1.18.5 trackViewer_1.38.2 GenomicRanges_1.54.1
[4] GenomeInfoDb_1.38.8 IRanges_2.36.0 S4Vectors_0.40.2
[7] BiocGenerics_0.48.1 maftools_2.18.0 pheatmap_1.0.12
[10] survminer_0.4.9 ggpubr_0.6.0 survival_3.5-8
[13] RColorBrewer_1.1-3 statmod_1.5.0 ggrepel_0.9.5
[16] edgeR_4.0.16 limma_3.58.1 reshape2_1.4.4
[19] openxlsx_4.2.5.2 lubridate_1.9.3 forcats_1.0.0
[22] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
[25] readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
[28] ggplot2_3.5.0 tidyverse_2.0.0 data.table_1.15.4
[31] optparse_1.7.5 monoceRos_1.0.5
loaded via a namespace (and not attached):
[1] splines_4.3.2 pbdZMQ_0.3-11
[3] BiocIO_1.12.0 bitops_1.0-7
[5] filelock_1.0.3 graph_1.80.0
[7] XML_3.99-0.16.1 rpart_4.1.23
[9] lifecycle_1.0.4 rstatix_0.7.2
[11] ensembldb_2.26.0 lattice_0.22-6
[13] backports_1.4.1 magrittr_2.0.3
[15] Hmisc_5.1-2 rmarkdown_2.26
[17] plotrix_3.8-4 yaml_2.3.8
[19] zip_2.3.1 Gviz_1.46.1
[21] DBI_1.2.2 abind_1.4-5
[23] zlibbioc_1.48.2 AnnotationFilter_1.26.0
[25] biovizBase_1.50.0 RCurl_1.98-1.14
[27] nnet_7.3-19 VariantAnnotation_1.48.1
[29] rappdirs_0.3.3 GenomeInfoDbData_1.2.11
[31] KMsurv_0.1-5 grImport_0.9-7
[33] codetools_0.2-20 getopt_1.20.4
[35] DelayedArray_0.28.0 xml2_1.3.6
[37] DNAcopy_1.76.0 tidyselect_1.2.1
[39] matrixStats_1.3.0 BiocFileCache_2.10.2
[41] base64enc_0.1-3 GenomicAlignments_1.38.2
[43] jsonlite_1.8.8 Formula_1.2-5
[45] tools_4.3.2 progress_1.2.3
[47] strawr_0.0.91 Rcpp_1.0.13
[49] glue_1.7.0 gridExtra_2.3
[51] SparseArray_1.2.4 xfun_0.43
[53] MatrixGenerics_1.14.0 IRdisplay_1.1
[55] withr_3.0.0 fastmap_1.1.1
[57] rhdf5filters_1.14.1 latticeExtra_0.6-30
[59] fansi_1.0.6 digest_0.6.35
[61] timechange_0.3.0 R6_2.5.1
[63] colorspace_2.1-0 Cairo_1.6-2
[65] jpeg_0.1-10 dichromat_2.0-0.1
[67] biomaRt_2.58.2 RSQLite_2.3.6
[69] utf8_1.2.4 generics_0.1.3
[71] rtracklayer_1.62.0 InteractionSet_1.30.0
[73] prettyunits_1.2.0 httr_1.4.7
[75] htmlwidgets_1.6.4 S4Arrays_1.2.1
[77] pkgconfig_2.0.3 gtable_0.3.4
[79] blob_1.2.4 XVector_0.42.0
[81] survMisc_0.5.6 htmltools_0.5.8.1
[83] carData_3.0-5 ProtGenerics_1.34.0
[85] scales_1.3.0 Biobase_2.62.0
[87] png_0.1-8 knitr_1.46
[89] km.ci_0.5-6 rstudioapi_0.16.0
[91] tzdb_0.4.0 rjson_0.2.21
[93] uuid_1.2-0 checkmate_2.3.1
[95] curl_5.2.1 rhdf5_2.46.1
[97] repr_1.1.7 cachem_1.0.8
[99] zoo_1.8-12 parallel_4.3.2
[101] foreign_0.8-86 AnnotationDbi_1.64.1
[103] restfulr_0.0.15 pillar_1.9.0
[105] vctrs_0.6.5 car_3.1-2
[107] dbplyr_2.5.0 xtable_1.8-4
[109] cluster_2.1.6 htmlTable_2.4.2
[111] Rgraphviz_2.46.0 evaluate_0.23
[113] GenomicFeatures_1.54.4 cli_3.6.2
[115] locfit_1.5-9.9 compiler_4.3.2
[117] Rsamtools_2.18.0 rlang_1.1.3
[119] crayon_1.5.2 ggsignif_0.6.4
[121] interp_1.1-6 getPass_0.2-4
[123] plyr_1.8.9 stringi_1.8.3
[125] deldir_2.0-4 BiocParallel_1.36.0
[127] munsell_0.5.1 Biostrings_2.70.3
[129] lazyeval_0.2.2 Matrix_1.6-5
[131] IRkernel_1.3.2 BSgenome_1.70.2
[133] hms_1.1.3 bit64_4.0.5
[135] Rhdf5lib_1.24.2 KEGGREST_1.42.0
[137] SummarizedExperiment_1.32.0 broom_1.0.5
[139] memoise_2.0.1 bit_4.0.5
Hi,
Thank you for the files. I have fixed the issue. You should now also be able to define your own color codes for each continuous variable.
Just specify the name of any sequential color palette from the RColorBrewer package and it should do the trick.
oncoplot(
maf = maf_object,
removeNonMutated = FALSE,
fill = TRUE,
clinicalFeatures = c('Treatment_Duration', 'Treatment_Group', 'Response', 'Volume_Change', 'Gender'),
sortByAnnotation = TRUE,
anno_height = 3,
annotationColor = list(Gender = c("M" = "black", 'F' = "pink"),
Treatment_Group = c("Treatment1" = "royalblue", "Treatment2" = "maroon"),
Treatment_Duration = "Blues", Response = "Reds",Volume_Change = "Purples"),
annoBorderCol = 'black')
If not provided, it will randomly select from the available palettes.
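To see which palette names count as sequential, the RColorBrewer package can be queried directly. This is a small sketch that assumes RColorBrewer is installed; it only lists the palettes and does not show how maftools uses them internally.

```r
library(RColorBrewer)

# brewer.pal.info has one row per palette; the category column is
# "seq" (sequential), "div" (diverging) or "qual" (qualitative)
seq_palettes <- rownames(brewer.pal.info[brewer.pal.info$category == "seq", ])
print(seq_palettes)

# Preview the nine shades of one sequential palette as hex codes
blues <- brewer.pal(n = 9, name = "Blues")
print(blues)
```

Any of the printed names (for example "Blues", "Reds" or "Purples", as used above) can then be given as the value for a continuous variable in annotationColor.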
Please let me know if this fixes the issue.
Thanks a lot for the quick fix. Really appreciate it.
I will give it a try in my analysis and will report back if I still have any issues.
Thanks again,
Julio
I had a similar issue. I think it happens when sampleOrder is applied: the continuous clinical features then no longer match the reordered samples.
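The kind of mismatch described here can be illustrated with a toy example. This is only a sketch of the general failure mode (colors computed in table order, samples drawn in a different order), not the actual maftools code; all names and values are hypothetical.

```r
# Toy clinical table with one continuous feature
clin <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2", "S3", "S4"),
  Response             = c(10, 40, 20, 30),
  stringsAsFactors     = FALSE
)

# Four shades from light (lowest value) to dark (highest value)
shades     <- grDevices::colorRampPalette(c("white", "darkblue"))(4)
clin$color <- shades[rank(clin$Response)]

# A user-defined sample order (here: ascending Response)
sample_order <- c("S1", "S3", "S4", "S2")

# Mismatch: samples are reordered but colors stay in table order
mismatched <- setNames(clin$color, sample_order)

# Aligned: colors are looked up by sample name, so they follow the reordering
idx     <- match(sample_order, clin$Tumor_Sample_Barcode)
aligned <- setNames(clin$color[idx], sample_order)

print(rbind(mismatched, aligned))
```

In the aligned row the shades run from light to dark along the sample order, while in the mismatched row S3 receives the darkest shade even though it has the second-lowest value.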
Hi @Zhongqige ,
This is fixed in the recent commit. Could you please try a fresh installation from GitHub and let me know if it works?
BiocManager::install("PoisonAlien/maftools")
Attachments: tcga_test_w_sampleOrder.pdf, tcga_test_wo_sampleOrder.pdf
Hi, thanks for the quick response! However, I just tested using @biosunsci tcga data and attached the results with and without the parameter sampleOrder = sorted_samples. It seems the same sample still gets a different Response value in the two plots.
Hi @Zhongqige ,
I am having trouble reproducing the issue. The function respects the sample order and the corresponding variables. Could you post the complete set of commands that you used? Please make sure that you have updated the package from GitHub and restarted your R session so that the changes take effect.
@PoisonAlien I did install the latest version, 2.21.1, and restarted my R session. Below are my commands (basically @juferban's code):
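One way to confirm which build is picked up after reinstalling and restarting is to print the installed package version. The helper below, installed_version(), is a hypothetical convenience function written for this sketch, not part of maftools.

```r
# Return the installed version of a package as a string,
# or NA if the package is not available in this R library
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

print(installed_version("maftools"))
```

Running this in a fresh R session before and after the reinstall shows whether the version actually changed.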
## Load MAF files
maf_object = read.maf(maf = "./oncoplots_examples/mutations_filtered.maf",
clinicalData = "./oncoplots_examples/sample_annot_for_maf.txt", isTCGA = FALSE)
## Make sure the continuous variables are treated as continuous
maf_object@clinical.data$Response = as.numeric(maf_object@clinical.data$Response)
maf_object@clinical.data$Volume_Change = as.numeric(maf_object@clinical.data$Volume_Change)
maf_object@clinical.data$Treatment_Duration = as.numeric(maf_object@clinical.data$Treatment_Duration)
# Sort the clinical data by multiple variables, as I want to make sure I use my predefined sample sorting
sorted_clinical_data <- maf_object@clinical.data[order(
maf_object@clinical.data$Gender,
maf_object@clinical.data$Treatment_Group,
# Handle NAs: NA values are set to 1000 so they appear first
dplyr::desc(ifelse(is.na(maf_object@clinical.data$Response ), 1000, as.numeric(maf_object@clinical.data$Response ))),
# Handle NAs: NA values are set to 1000 so they appear first
dplyr::desc(ifelse(is.na(maf_object@clinical.data$Volume_Change ), 1000, as.numeric(maf_object@clinical.data$Volume_Change ))),
# Handle NAs: NA values are set to 1000 so they sort last (this key is ascending; assumes real values stay below 1000)
ifelse(is.na(maf_object@clinical.data$Treatment_Duration), 1000, as.numeric(maf_object@clinical.data$Treatment_Duration))
), ]
maf_object@clinical.data <- sorted_clinical_data
# Extract the sorted sample names
sorted_samples <- sorted_clinical_data$Tumor_Sample_Barcode
pdf("./tcga_test_wo_sampleOrder.pdf", 12, 8)
oncoplot(maf = maf_object,
removeNonMutated = FALSE,
fill = TRUE,
clinicalFeatures = c('Gender','Treatment_Group','Response'), #, 'Volume_Change','Treatment_Duration'
#sampleOrder = sorted_samples,
annotationColor = list(Gender = c("F" = "deeppink", "M" = "dodgerblue"),
Treatment_Group = c("Treatment1" = "salmon", "Treatment2" = "yellowgreen"),
Response = "Blues"
),
showTitle = TRUE,
titleFontSize = 1.5,
legendFontSize = 1,
annotationFontSize = 1,
SampleNamefontSize = 0.5,
fontSize = 0.7,
showTumorSampleBarcodes = TRUE,
barcode_mar = 3,
gene_mar = 5,
legend_height = 4,
anno_height = 1.5,
annoBorderCol = "white",
drawRowBar = TRUE,
genesToIgnore = 'KRAS',
numericAnnoCol = TRUE,
showPct = TRUE,
rightBarLims = c(0, 100),
leftBarLims = c(0, 100),
)
dev.off()
pdf("./tcga_test_w_sampleOrder.pdf", 12, 8)
oncoplot(maf = maf_object,
removeNonMutated = FALSE,
fill = TRUE,
clinicalFeatures = c('Gender','Treatment_Group','Response'), #, 'Volume_Change','Treatment_Duration'
sampleOrder = sorted_samples,
annotationColor = list(Gender = c("F" = "deeppink", "M" = "dodgerblue"),
Treatment_Group = c("Treatment1" = "salmon", "Treatment2" = "yellowgreen"),
Response = "Blues"
),
showTitle = TRUE,
titleFontSize = 1.5,
legendFontSize = 1,
annotationFontSize = 1,
SampleNamefontSize = 0.5,
fontSize = 0.7,
showTumorSampleBarcodes = TRUE,
barcode_mar = 3,
gene_mar = 5,
legend_height = 4,
anno_height = 1.5,
annoBorderCol = "white",
drawRowBar = TRUE,
genesToIgnore = 'KRAS',
numericAnnoCol = TRUE,
showPct = TRUE,
rightBarLims = c(0, 100),
leftBarLims = c(0, 100),
)
dev.off()
> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.7
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] maftools_2.21.1
loaded via a namespace (and not attached):
[1] DNAcopy_1.72.3 rstudioapi_0.14 magrittr_2.0.3 splines_4.2.2 tidyselect_1.2.0
[6] lattice_0.20-45 R6_2.5.1 rlang_1.0.6 fansi_1.0.4 dplyr_1.1.0
[11] tools_4.2.2 grid_4.2.2 pkgbuild_1.4.0 data.table_1.14.6 utf8_1.2.3
[16] cli_3.6.0 withr_2.5.0 remotes_2.5.0 survival_3.4-0 rprojroot_2.0.3
[21] tibble_3.1.8 lifecycle_1.0.3 crayon_1.5.2 Matrix_1.5-3 processx_3.8.0
[26] BiocManager_1.30.19 RColorBrewer_1.1-3 callr_3.7.3 vctrs_0.5.2 ps_1.7.2
[31] curl_5.0.0 glue_1.6.2 compiler_4.2.2 pillar_1.8.1 desc_1.4.2
[36] generics_0.1.3 prettyunits_1.1.1 pkgconfig_2.0.3
@PoisonAlien
Hi, sorry for the delay with additional testing. I am seeing the same issue reported by @Zhongqige when testing the code after updating with BiocManager::install("PoisonAlien/maftools"). The samples are still assigned colors in a seemingly random way even though the sample order is correct.
Hello all!
Sorry for the delay. It took a while to figure out the issue. It turns out that just the colors were flipped. I have fixed it. Please reinstall from GitHub to get the changes.
Describe the issue
Hello,
I am having an issue trying to add continuous clinical features to my oncoplot.
If I add more than one clinicalFeature with continuous values, the colors applied to the values seem to be mixed up and do not match the values they are supposed to represent.
More specifically, I had an oncoplot where I wanted to add two clinical features that represent two different ways of measuring response. If I add only one of the features to the plot, the color gradient is applied correctly, but if I add both clinical features, most samples show the correct colors while random samples show colors that don't match. In my test, I specified the sample order using the sampleOrder argument of the oncoplot command, and the sample order corresponded to the first clinical feature, so the gradient should run from lowest to highest (which it correctly does when only that first clinical feature is added to the oncoplot). As soon as I add the second clinical feature, some samples get a random color assigned.
The command does not throw any error.
Thanks for a great package!