YoannPa / methview.qc

HM450.QCView allows you to generate quality control plots from your Human Methylation 450K dataset.
GNU General Public License v3.0

plot.all.qc(include.gp = TRUE) - display issue with legends and sample labels. #6

Closed AnjaCaRa closed 3 years ago

AnjaCaRa commented 3 years ago

Hi Yoann, currently the legend of the SNP heatmap does not fit completely into the plot image, in either the PDF or the PNG output. The sample labels also overlap and are no longer readable. Do you think you could fix this? Thank you!

YoannPa commented 3 years ago

Hi Anja,

Could you describe the issue again? Uploading the PNG plot you obtained would help.

Best, Yoann.

AnjaCaRa commented 3 years ago

Hi Yoann, here is a more detailed description of the problem.

Session information

sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /usr/lib64/libblas.so.3.4.2
LAPACK: /usr/lib64/liblapack.so.3.4.2

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] readr_2.0.1 RnBeads.hg19_1.20.0 colorspace_2.0-2
[4] ggsci_2.9 methview.qc_0.0.17 BiocompR_0.0.128
[7] data.table_1.14.0 RnBeads_2.6.0 plyr_1.8.6
[10] methylumi_2.34.0 minfi_1.34.0 bumphunter_1.30.0
[13] locfit_1.5-9.4 iterators_1.0.13 foreach_1.5.1
[16] Biostrings_2.56.0 XVector_0.28.0 SummarizedExperiment_1.18.2
[19] DelayedArray_0.14.1 FDb.InfiniumMethylation.hg19_2.2.0 org.Hs.eg.db_3.11.4
[22] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.40.1 AnnotationDbi_1.50.3
[25] reshape2_1.4.4 scales_1.1.1 Biobase_2.48.0
[28] illuminaio_0.30.0 matrixStats_0.60.0 limma_3.44.3
[31] gridExtra_2.3 gplots_3.1.1 ggplot2_3.3.5
[34] fields_12.5 viridis_0.6.1 viridisLite_0.4.0
[37] spam_2.7-0 dotCall64_1.0-1 ff_4.0.4
[40] bit_4.0.4 cluster_2.1.0 MASS_7.3-51.5
[43] GenomicRanges_1.40.0 GenomeInfoDb_1.24.2 IRanges_2.22.2
[46] S4Vectors_0.26.1 BiocGenerics_0.34.0

loaded via a namespace (and not attached):
[1] parallelDist_0.2.4 BiocFileCache_1.12.1 splines_4.0.0 BiocParallel_1.22.0 digest_0.6.27
[6] fansi_0.5.0 magrittr_2.0.1 memoise_2.0.0 tzdb_0.1.2 fastcluster_1.2.3
[11] annotate_1.66.0 RcppParallel_5.1.4 askpass_1.1 siggenes_1.62.0 prettyunits_1.1.1
[16] blob_1.2.2 rappdirs_0.3.3 ggrepel_0.9.1 dplyr_1.0.7 crayon_1.4.1
[21] RCurl_1.98-1.3 genefilter_1.70.0 GEOquery_2.56.0 survival_3.1-12 glue_1.4.2
[26] gtable_0.3.0 zlibbioc_1.34.0 Rhdf5lib_1.10.1 maps_3.3.0 HDF5Array_1.16.1
[31] DBI_1.1.1 rngtools_1.5 Rcpp_1.0.7 xtable_1.8-4 progress_1.2.2
[36] mclust_5.4.7 preprocessCore_1.50.0 httr_1.4.2 RColorBrewer_1.1-2 ellipsis_0.3.2
[41] farver_2.1.0 pkgconfig_2.0.3 reshape_0.8.8 XML_3.99-0.6 dbplyr_2.1.1
[46] utf8_1.2.2 labeling_0.4.2 tidyselect_1.1.1 rlang_0.4.11 munsell_0.5.0
[51] tools_4.0.0 cachem_1.0.5 cli_3.0.1 generics_0.1.0 RSQLite_2.2.7
[56] ggdendro_0.1.22 stringr_1.4.0 fastmap_1.1.0 bit64_4.0.5 beanplot_1.2
[61] caTools_1.18.2 scrime_1.3.5 purrr_0.3.4 nlme_3.1-147 doRNG_1.8.2
[66] nor1mix_1.3-0 xml2_1.3.2 biomaRt_2.44.4 compiler_4.0.0 rstudioapi_0.13
[71] curl_4.3.2 tibble_3.1.3 stringi_1.7.3 lattice_0.20-41 Matrix_1.2-18
[76] multtest_2.44.0 vctrs_0.3.8 pillar_1.6.2 lifecycle_1.0.0 bitops_1.0-7
[81] rtracklayer_1.48.0 R6_2.5.0 KernSmooth_2.23-16 codetools_0.2-16 gtools_3.9.2
[86] assertthat_0.2.1 rhdf5_2.32.4 openssl_1.4.4 withr_2.4.2 GenomicAlignments_1.24.0
[91] Rsamtools_2.4.0 GenomeInfoDbData_1.2.3 hms_1.1.0 quadprog_1.5-8 tidyr_1.1.3
[96] base64_2.0 DelayedMatrixStats_1.10.1

Bug description

What I am trying to report is that the patient IDs on the x-axis are squished together so that they can no longer be read, and that, depending on how many group categories you have, the group legend on the right no longer fits within the margins of the plot.

Expected behaviour / possible fix

Increasing the width of the plotting area as well as the right margin could fix both problems.
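As a rough illustration of that idea, the sketch below widens a toy ggplot2 heatmap with the number of samples, rotates the sample labels, and enlarges the right margin. It is not methview.qc's code: the helper `plot_width_in()`, its constants, and the toy data are assumptions made up for this example, and it presumes the heatmap is an ordinary ggplot object.

```r
library(ggplot2)

# Hypothetical helper: scale the figure width with the number of samples.
# The constants are illustrative assumptions, not methview.qc defaults.
plot_width_in <- function(n_samples, per_sample = 0.15,
                          min_width = 7, legend_space = 2) {
  max(min_width, n_samples * per_sample) + legend_space
}

# Toy data standing in for genotyping-probe beta values.
set.seed(1)
n_samples <- 120
toy <- expand.grid(
  sample = sprintf("PID_%03d", seq_len(n_samples)),
  probe  = sprintf("rs%02d", 1:20)
)
toy$beta <- runif(nrow(toy))

p <- ggplot(toy, aes(x = sample, y = probe, fill = beta)) +
  geom_tile() +
  theme(
    # Rotate the sample labels so they stop overlapping.
    axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 6),
    # Margins are given as top, right, bottom, left: widen the right one.
    plot.margin = margin(5, 40, 5, 5, unit = "pt")
  )

# Save with a width that grows with the sample count.
out_file <- file.path(tempdir(), "toy_snp_heatmap.png")
ggsave(out_file, p, width = plot_width_in(n_samples), height = 6,
       units = "in", dpi = 100)
```

With very large cohorts the computed width can exceed the 50-inch limit that `ggsave()` enforces by default, so a cap (or `limitsize = FALSE`) may be needed.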

To reproduce

```r
#!/usr/bin/env Rscript

#_Quality control methview.qc

Version = '0.0.1'
Date = '2021-08-17'
Author = 'Anja Rathgeber'
Maintainer = 'Anja Rathgeber (anja.rathgeber@mail.de)'
Dependencies = c('R version 4.0.0 (2021-03-31)', 'RStudio Version 1.4.1106-5 - © 2009-2021', 'Rnbeads', 'methview.qc', 'ggsci', 'colorspace')
Description = 'Plot enhanced quality control graphs for methylation data'
################################################################################

# IMPORTS

setwd("/omics/groups/OE0436/data/rathgeber")
Imports = c("methview.qc", "RnBeads", "ggsci", "colorspace")
invisible(lapply(Imports, library, character.only = TRUE))

# PARAMETERS

data.dir <- "/omics/groups/OE0436/data/rathgeber/data/neuroblastoma/RnBeads"
idat.dir <- file.path(data.dir, "idat")
sample.annotation <- file.path(data.dir, "sample_annotation_renamed.csv")
analysis.dir <- "omics/groups/OE0436/data/rathgeber/output/neuroblastoma/RnBeads"
# disk dump takes care of not deleting data once you reload a rnbeads dataset using load.rnb.set
rnb.options(identifiers.column = "PID", disk.dump.big.matrices = FALSE)

# ANALYSIS

# manually import data
data.source <- c(idat.dir, sample.annotation)
result <- rnb.execute.import(data.source = data.source, data.type = "infinium.idat.dir")

# Loads qc data as data.table
metharrayQC <- load.metharray.QC.meta("controls450")

# Draws and saves all quality control plots available in methview.qc
plot.all.qc(RnBSet = result, save.dir = "/omics/groups/OE0436/data/rathgeber/output/neuroblastoma/plots/methview.qc", ncores = 4, include.gp = TRUE)
```

Current output

As the data is not anonymized yet, I will not be able to post the plots publicly.

/omics/groups/OE0436/data/rathgeber/output/neuroblastoma/plots/methview.qc/Heatmap_genotyping_probes_HM450K.png
/omics/groups/OE0436/data/rathgeber/output/neuroblastoma/plots/methview.qc/Heatmap_genotyping_probes_HM450K.pdf

Let me know if you require additional information!

YoannPa commented 3 years ago

I have relabeled the issue because this is not a bug. There are 5 different options you can play with to fix your display issue on the genotyping probes heatmap. Use the function snp.heatmap() to customize the genotyping probe heatmap yourself, outside of the plot.all.qc() function, which provides an automatic, non-customizable heatmap. The latter will certainly satisfy the needs of most datasets, but for the biggest ones you should use snp.heatmap().

One feature request could be for the legend space on the right side to grow automatically with the size of the legends it holds. I will try to address this but cannot guarantee its feasibility. I will let you know!
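One possible way to size the legend space automatically is to measure the longest legend label and reserve that much room. The base-R sketch below only illustrates the idea; `legend_width_in()`, its default key width and padding, and the example labels are hypothetical and are not part of methview.qc.

```r
# Hypothetical helper: estimate how many inches a legend needs, based on
# its longest label. key_width and padding are illustrative assumptions.
legend_width_in <- function(labels, key_width = 0.25, padding = 0.3, cex = 1) {
  # A null PDF device lets text be measured without writing any file.
  grDevices::pdf(NULL)
  on.exit(grDevices::dev.off())
  text_width <- max(graphics::strwidth(labels, units = "inches", cex = cex))
  key_width + text_width + padding
}

# Toy group labels (made up for this example).
short_groups <- c("A", "B")
long_groups  <- c("group_with_a_rather_long_name", "another_long_group_label")

legend_width_in(short_groups)
legend_width_in(long_groups)
```

The returned value could then be added to the figure width, or converted into a right-hand margin, before the plot is saved.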

AnjaCaRa commented 3 years ago

I think what we were originally talking about is that once you actually use the fully automated plot.all.qc() function, you did not want the user to have to adapt any parameters manually. I am aware of the other options for the manual generation of the snp.heatmap() :)

YoannPa commented 3 years ago

Yes, exactly, that's the point: automatic also means that it should be the simplest result producible here. And again, the simplest will fit most datasets, which are not necessarily as big as yours. Still, I can try to make the legend space more flexible so that the legends are not cut off from the plot.

AnjaCaRa commented 3 years ago

Ah okay, I lost sight of the fact that my dataset was again uncommonly large. I assumed this might also occur with smaller datasets, but if not then that's all perfect :) I guess this is not an issue anymore then.

YoannPa commented 3 years ago

The legend issue will probably also happen with smaller datasets, I guess (I will try a few I have on hand to confirm this).

YoannPa commented 3 years ago

Actually, I think this feature may take too much time to develop. So, by default, when executing plot.all.qc(), snp.heatmap() will not show any annotation sidebar at the top of the plot. The latest commit to methview.qc already contains this option, so the next commit will set the show.annot option to FALSE for snp.heatmap() in plot.all.qc().