rlbarter / superheat

An R package for generating beautiful and customizable heatmaps
https://rlbarter.github.io/superheat/

empty plot (background grid only) for high-dim data #34

Closed by gunnarschulze 6 years ago

gunnarschulze commented 6 years ago

Hi!

First of all thanks for a great package!

I was wondering whether there are inherent limitations on the input matrix dimensions when plotting heatmaps. I have a matrix with 400k rows and 7 columns that I would like to visualize, but plotting it (at least on the standard R devices) yields an empty plot: only the gray background grid appears (reproducible example below).

Can you offer any advice on parameters (also any preferred devices) for plotting such high-dimensional data?

As a side note: I ran into a similar issue with a custom ggplot using geom_tile() and could fix it by setting geom_tile(size=0). However, it does not look nearly as neat as superheat() :)

Thanks a lot!

```r
library(superheat)

# 400,000 rows x 7 columns of standard normal noise
data <- matrix(rnorm(7 * 4 * 10^5), ncol = 7)

superheat(data)
```

My session info:

```
R version 3.4.3 (2017-11-30)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] bindrcpp_0.2 superheat_0.1.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15 digest_0.6.15 assertthat_0.2.0 dplyr_0.7.4 grid_3.4.3 plyr_1.8.4
 [7] R6_2.2.2 gtable_0.2.0 magrittr_1.5 scales_0.5.0 ggplot2_2.2.1 pillar_1.2.1
[13] rlang_0.2.0 lazyeval_0.2.1 labeling_0.3 tools_3.4.3 glue_1.2.0 munsell_0.4.3
[19] compiler_3.4.3 pkgconfig_2.0.1 colorspace_1.3-2 bindr_0.1 tibble_1.4.2
```

rlbarter commented 6 years ago

Hi!

Unfortunately, superheat is built on top of ggplot2 functions such as geom_tile/geom_raster, so the limitations on matrix size are the same for superheat as for ggplot2. I have two suggestions:

  1. use clustering to produce a smoothed heatmap (https://rlbarter.github.io/superheat/smoothing-in-high-dimensions.html)

  2. plot only a subset of the data.
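As a rough sketch of both suggestions (not taken from the superheat documentation): the snippet below builds a matrix of the same shape as in the report, draws a random subset of rows, and separately collapses all rows into k-means cluster centers. The names `n_keep` and `k` and their values are arbitrary assumptions, and the hand-rolled cluster centers only approximate what the linked smoothing vignette does inside superheat.

```r
set.seed(1)

# Same shape as the matrix in the report: 400,000 rows x 7 columns.
data <- matrix(rnorm(7 * 4 * 10^5), ncol = 7)

# Suggestion 2: plot only a random subset of rows.
n_keep <- 2000  # assumed value; pick what the device can resolve
data_subset <- data[sample(nrow(data), n_keep), , drop = FALSE]

# Suggestion 1, done by hand: cluster the rows and keep one
# representative row (the cluster mean) per cluster.
# Lloyd's algorithm with a capped iteration count keeps the run time
# predictable; convergence warnings are suppressed because only a
# rough summary is needed here.
k <- 20  # assumed number of clusters
km <- suppressWarnings(
  kmeans(data, centers = k, iter.max = 10, algorithm = "Lloyd")
)
data_smoothed <- km$centers

# Draw the reduced matrices only if superheat is installed.
if (requireNamespace("superheat", quietly = TRUE)) {
  superheat::superheat(data_subset)
  superheat::superheat(data_smoothed)
}
```

Either reduced matrix has few enough rows that each tile gets at least a pixel or so of height on an ordinary device.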

Keep in mind that there are only so many pixels on a screen, so it may not be a good idea to try to plot such a large matrix in its raw form anyway... though I understand the desire to do so.
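To make the pixel argument concrete: a plot panel that is, say, 800 pixels tall cannot show more than 800 distinct rows, so one option is to order the rows and average them in consecutive bins before plotting. This is a minimal sketch; `n_bins = 800` and ordering by row mean are assumptions for illustration, not something superheat does for you.

```r
set.seed(1)

# Same shape as the matrix in the report: 400,000 rows x 7 columns.
data <- matrix(rnorm(7 * 4 * 10^5), ncol = 7)

n_bins <- 800  # assumed panel height in pixels

# Order rows so that neighbouring rows are similar (here: by row mean).
ordered <- data[order(rowMeans(data)), , drop = FALSE]

# Assign each row to one of n_bins consecutive, equal-sized bins.
bin <- ceiling(seq_len(nrow(ordered)) * n_bins / nrow(ordered))

# Average the rows within each bin: rowsum() gives per-bin column sums,
# which are then divided by the number of rows in each bin.
binned <- rowsum(ordered, group = bin) / as.vector(table(bin))

dim(binned)  # 800 rows, 7 columns

# Draw the binned matrix only if superheat is installed.
if (requireNamespace("superheat", quietly = TRUE)) {
  superheat::superheat(binned)
}
```

Each of the 800 rows in `binned` summarizes 500 original rows, so no information is lost that the screen could have displayed in the first place.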

gunnarschulze commented 6 years ago

Thanks for the quick reply!

I had my suspicions regarding the dimensionality (and questioned the sanity of this :)). The smoothing option seems to be a nice alternative. I will try it out!