talgalili / gplots

Reordering by custom dendrogram no longer working with heatmap.2 #8

Open dr-joe-wirth opened 2 years ago

dr-joe-wirth commented 2 years ago

I have included my code below. It defines a function capable of taking in a tree in newick format alongside a matrix of values, and then plotting a heatmap with the tree on the rows and the columns.

The issue appears to be with heatmap.2

If I plot the dendrogram by itself, it is in the correct order.

Within the call to heatmap.2, if I change Rowv=tree to Rowv=FALSE and Colv=tree to Colv=FALSE, then I get the correct order.

The code below used to work and produced the graphics I wanted. Now it still plots the correct dendrogram, but it reorders the matrix, so the dendrogram no longer corresponds to the data.

This last worked in mid 2020. I'm not sure what happened since then to cause it to break.

I'd be happy to provide more information and/or files. This is my first issue report. Please let me know if you'd like additional details.

meanSquareMatrix <- function(squareMat){
    # for the first row to the second-to-last row
    for(i in 1:(ncol(squareMat)-1)){

        # for the i+1th column to the last column
        for(j in (i+1):ncol(squareMat)){
            # extract the two values for the forward and reverse comparison
            valuesV <- c(squareMat[i,j], squareMat[j,i])

            # get the mean of the values and replace the original data
            squareMat[i,j] <- squareMat[j,i] <- mean(valuesV)
        }
    }

    return(squareMat)
}

generateHeatmap <- function(treeFN=NULL, aaiFN=NULL, pdfOutFN=NULL, numDecimals=0, numColors=16, height=32, width=32){
    # dependencies
    require(ape)
    require(DECIPHER)
    require(gplots)
    require(dendextend)

    # import files
    tree <- ReadDendrogram(treeFN, internalLabels=FALSE)
    aai.df <- read.delim(aaiFN)

    # convert all underscores to spaces in aai.df
    rownames(aai.df) <- gsub("_", " ", rownames(aai.df))
    colnames(aai.df) <- gsub("_", " ", colnames(aai.df))

    # remove double spaces from aai.df and tip names
    rownames(aai.df) <- gsub("  ", " ", rownames(aai.df))
    colnames(aai.df) <- gsub("  ", " ", colnames(aai.df))
    labels(tree) <- gsub("  ", " ", labels(tree))

    # make an AAI matrix with only the taxa present in the tree
    aai.mx <- as.matrix(aai.df[which(rownames(aai.df) %in% labels(tree)), which(colnames(aai.df) %in% labels(tree))])

    # order the matrix so that the rows and columns match the order of the tips in the tree
    aai.mx <- aai.mx[order(match(rownames(aai.mx), labels(tree))), order(match(colnames(aai.mx), labels(tree)))]

    # get the mean of all forward/reverse comparisons
    aai.mx <- meanSquareMatrix(aai.mx)

    # remove all self-self comparisons from the table
    diag(aai.mx) <- NA

    # get the cell values for the matrix (rounded)
    aai.cells <- round(aai.mx, digits=numDecimals)

    # get the heat map colors
    colors <- colorRampPalette(colors=c("purple", "red", "yellow", "white"))(numColors)

    # generate the plot and write to file
    pdf(file=pdfOutFN, height=height, width=width)
    heatmap.2(aai.mx, Rowv=tree, Colv=tree, col=colors, cellnote=aai.cells, trace="none", notecol="black", notecex=1, margins=c(12,20), cexRow=1, cexCol=1, lhei=c(1,8), lwid=c(1,8))
    dev.off()

    # note:
    # Rowv=FALSE and Colv=FALSE resolves the ordering problem, but then there is no accompanying dendrogram.
}
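
A possible explanation, offered as a sketch rather than a confirmed diagnosis: as far as the gplots source shows, when Rowv or Colv is a dendrogram, heatmap.2 reorders the matrix itself with order.dendrogram(), in the same way as stats::heatmap. A matrix that has already been sorted into leaf-label order would then be permuted a second time. The example below uses only base R. The toy matrix sim and the hclust dendrogram are hypothetical stand-ins for the AAI table and for the tree returned by ReadDendrogram, whose stored leaf indices may behave differently.

```r
# Toy similarity matrix (hypothetical values): A pairs with D, B pairs with C.
sim <- matrix(c(100,  90,  89,  99,
                 90, 100,  98,  88,
                 89,  98, 100,  87,
                 99,  88,  87, 100),
              nrow = 4, byrow = TRUE,
              dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# Stand-in for the imported tree (assumption): a dendrogram from base R clustering.
dend <- as.dendrogram(hclust(as.dist(100 - sim)))

# Row names in the order they end up after indexing the matrix by the
# dendrogram's stored leaf indices, i.e. mat[order.dendrogram(dend), ].
rows_after_reorder <- function(mat, dend) {
  rownames(mat)[order.dendrogram(dend)]
}

# Matrix left in its original order: rows line up with the leaf labels.
original_matches <- all(rows_after_reorder(sim, dend) == labels(dend))

# Matrix pre-sorted into leaf-label order (as generateHeatmap does) and then
# indexed a second time: rows no longer line up with the leaf labels.
presorted <- sim[labels(dend), labels(dend)]
presorted_matches <- all(rows_after_reorder(presorted, dend) == labels(dend))

print(c(original = original_matches, presorted = presorted_matches))
```

Here original should come out TRUE and presorted FALSE. If the same mechanism applies to the imported tree, comparing order.dendrogram(tree) with match(labels(tree), rownames(aai.mx)) before plotting would show whether the two orderings agree.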

talgalili commented 2 years ago

Could you please supply a more minimal example of the bug, e.g. using mtcars?

On Thu, 11 Nov 2021, 2:32 Joe Wirth, @.***> wrote:

dr-joe-wirth commented 2 years ago

Yes, I will attempt to recreate the bug with mtcars and then post it here.

talgalili commented 2 years ago

Thanks.

On Thu, 11 Nov 2021, 18:52 Joe Wirth, @.***> wrote:

dr-joe-wirth commented 2 years ago

Hello,

My problem involved a square matrix (all-pairwise-comparisons), and none of the built-in data sets seemed to have this feature. Instead, I made a very small matrix (4x4) to illustrate the problem I am getting.

library(ape)
library(DECIPHER)
library(dendextend)
library(gplots)

A.col = c(100,95,70,80)
B.col = c(95,100,80,70)
C.col = c(70,80,100,60)
D.col = c(80,70,60,100)
test.df <- data.frame(A=A.col, B=B.col, C=C.col, D=D.col, row.names=c("A","B", "C", "D"))

# create newick file
newickStr <- "(((C:2.0,D:2.0):3.0,B:1.0):5.0,A:1.0);"
tree <- read.tree(text=newickStr)
newickFH  <- tempfile()
write.tree(tree, file=newickFH)

# open newick file as a dendrogram
dendro <- ReadDendrogram(newickFH)

# convert to matrix, make sure only necessary values present
test.mat <- as.matrix(test.df[which(rownames(test.df) %in% labels(dendro)), which(colnames(test.df) %in% labels(dendro))])

# reorder matrix to match labels of the dendrogram
test.mat <- test.mat[order(match(rownames(test.mat), labels(dendro))), order(match(colnames(test.mat), labels(dendro)))]

# this produces the re-ordered matrix as expected
heatmap.2(test.mat, Rowv=FALSE, Colv=FALSE, dendrogram='none')

# this produces a heatmap whose rows and columns no longer correspond to the dendrogram
heatmap.2(test.mat, Rowv=dendro, Colv=dendro)

I hope this helps. The first call to heatmap.2 produces the heatmap with the expected ordering. The second call illustrates the bug that I have encountered when trying to use a custom dendrogram.
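
One workaround that may be worth trying, sketched under the assumption that heatmap.2 indexes the matrix by order.dendrogram() of the supplied dendrogram: instead of sorting the matrix into tip order, arrange it so that row and column i hold the taxon that leaf index i refers to. The helper align_to_dendrogram is hypothetical (it is not part of gplots or DECIPHER), and the dendrogram here is built with base R hclust as a stand-in for the one read from the newick file, so this has not been verified against trees imported with ReadDendrogram.

```r
# Values copied from test.df in the example above.
taxa <- c("A", "B", "C", "D")
test.mat <- matrix(c(100,  95,  70,  80,
                      95, 100,  80,  70,
                      70,  80, 100,  60,
                      80,  70,  60, 100),
                   nrow = 4, dimnames = list(taxa, taxa))

# Stand-in dendrogram (assumption): clustered in base R rather than read
# from the newick file with ReadDendrogram.
dend <- as.dendrogram(hclust(as.dist(100 - test.mat)))

# Hypothetical helper: reorder a square matrix so that row/column i is the
# taxon that the dendrogram's leaf index i refers to.
align_to_dendrogram <- function(mat, dend) {
  leaf_labels <- labels(dend)
  leaf_index  <- order.dendrogram(dend)
  stopifnot(all(leaf_labels %in% rownames(mat)),
            all(leaf_labels %in% colnames(mat)),
            setequal(leaf_index, seq_along(leaf_labels)))
  # order(leaf_index) inverts the permutation stored in the dendrogram
  target <- leaf_labels[order(leaf_index)]
  mat[target, target, drop = FALSE]
}

# Start from a matrix sorted into the tip order of the newick string.
tip_order <- c("C", "D", "B", "A")
presorted <- test.mat[tip_order, tip_order]

aligned <- align_to_dendrogram(presorted, dend)

# After indexing by the leaf indices, rows follow the leaf labels again.
rows_as_plotted <- rownames(aligned)[order.dendrogram(dend)]
print(rbind(leaf_labels = labels(dend), rows_as_plotted = rows_as_plotted))
```

If the assumption holds, calling heatmap.2(aligned, Rowv=dend, Colv=dend) should draw rows and columns in the same order as the dendrogram tips. Printing order.dendrogram(dendro) for the imported tree is a quick way to see which indices it actually stores.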

dr-joe-wirth commented 2 years ago

Hello,

I was wondering if there is any news regarding this bug.

Best, Joe W