raivokolde / pheatmap

Pretty heatmaps

Unwanted behavior (BUG?) for `drop_levels=TRUE` when giving `annotation_colors` #69

Open epurdom opened 4 years ago

epurdom commented 4 years ago

I just discovered while using pheatmap that if I pass a list of color assignments via the annotation_colors argument, and that list contains colors for values that do not occur in the data, those extra colors are shown in the annotation legend regardless of whether drop_levels=TRUE or FALSE.

Here's an example setup:

x<-matrix(rnorm(100),nrow=5)
colAnnotation<-data.frame(
    factor1=factor(rep(c(1,2,3,4),length=ncol(x))),
    factor2=factor(rep(c(1,2,3),length=ncol(x)))
)
rownames(colAnnotation)<-colnames(x)<-as.character(1:ncol(x))
colorLegend<-list(factor1=c("1"="red","2"="green","3"="blue","4"="black"),
factor2=c("1"="red","2"="green","3"="blue","4"="black"))

If I don't give colors to pheatmap, it acts as I expect: the annotation legend shows only the colors that are actually used in the data, including when I subset to a portion of the data (drop_levels=TRUE is the default):

pheatmap(x,annotation_col=colAnnotation)
wh<-colAnnotation$factor1=="1"
pheatmap(x[,wh],annotation_col=colAnnotation[wh,],drop_levels=TRUE)

However, if I pass my colorLegend object above as annotation_colors (it assigns colors to values that are not levels of the corresponding factors in my colAnnotation), the legend shows all of the colors in colorLegend, regardless of the value of drop_levels:

pheatmap(x,annotation_col=colAnnotation,annotation_colors=colorLegend, drop_levels=TRUE)
pheatmap(x,annotation_col=colAnnotation,annotation_colors=colorLegend, drop_levels=FALSE)
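To make explicit which entries count as "extra" here, the following sketch compares the names in each color vector with the levels of the matching annotation column. It uses only base R; the `extra` variable is introduced for illustration, and the objects are rebuilt with a fixed length of 20 (the number of columns of x in my setup) so the snippet runs on its own.

```r
# Rebuild the annotation objects from the setup above (x has 20 columns).
colAnnotation <- data.frame(
  factor1 = factor(rep(c(1, 2, 3, 4), length.out = 20)),
  factor2 = factor(rep(c(1, 2, 3), length.out = 20))
)
colorLegend <- list(
  factor1 = c("1" = "red", "2" = "green", "3" = "blue", "4" = "black"),
  factor2 = c("1" = "red", "2" = "green", "3" = "blue", "4" = "black")
)

# For each annotation, list the color names that are not levels of the factor.
extra <- sapply(names(colorLegend), function(nm) {
  setdiff(names(colorLegend[[nm]]), levels(colAnnotation[[nm]]))
}, simplify = FALSE)

print(extra)
```

On the full data only factor2 has an extra entry (the color named "4"), which is the one that shows up in the legend without being used.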

I'm not sure whether this is intended or a bug, but it is particularly annoying in practice: if you want to subset your data, as in my example above, you also have to subset the more complicated list structure that is given to annotation_colors. You have to figure out which levels (if any) have been lost by the subsetting for each column of annotation_col and then remove them from the corresponding element of the list. Moreover, that seems to be exactly the purpose of drop_levels, so to me it looks like a bug.
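For reference, the manual workaround I mean can be sketched roughly as below. `trim_annotation_colors` is a hypothetical helper of my own, not a pheatmap function; it assumes discrete annotations are factor or character columns and that their colors are given as named vectors, and it leaves everything else (e.g. unnamed gradients for continuous annotations) untouched.

```r
# Hypothetical helper (not part of pheatmap): for each discrete annotation,
# keep only the colors whose names match values present in the data.
trim_annotation_colors <- function(annotation, colors) {
  for (nm in intersect(names(colors), colnames(annotation))) {
    values <- annotation[[nm]]
    is_discrete <- is.factor(values) || is.character(values)
    # Skip continuous annotations and unnamed color vectors.
    if (is_discrete && !is.null(names(colors[[nm]]))) {
      present <- unique(as.character(values))
      colors[[nm]] <- colors[[nm]][names(colors[[nm]]) %in% present]
    }
  }
  colors
}

# Same objects as in the setup above (x has 20 columns).
colAnnotation <- data.frame(
  factor1 = factor(rep(c(1, 2, 3, 4), length.out = 20)),
  factor2 = factor(rep(c(1, 2, 3), length.out = 20))
)
colorLegend <- list(
  factor1 = c("1" = "red", "2" = "green", "3" = "blue", "4" = "black"),
  factor2 = c("1" = "red", "2" = "green", "3" = "blue", "4" = "black")
)

# Subset to the columns where factor1 is "1", then trim the color list.
wh <- colAnnotation$factor1 == "1"
trimmed <- trim_annotation_colors(colAnnotation[wh, ], colorLegend)
print(trimmed)
```

The trimmed list would then be passed as annotation_colors=trimmed together with x[,wh] and colAnnotation[wh,]. Having to write something like this for every subset is the extra work I would expect drop_levels=TRUE to make unnecessary.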

Edited to add specs of my computer etc:

R version 4.0.0 (2020-04-24)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6
....

And package version:

> packageVersion("pheatmap")
[1] ‘1.0.12’