Closed crazyhottommy closed 8 years ago
This line:
clustering_distance_rows = function(m) mahalanobis(m, center = FALSE, cov = cov(m)),
only needs m to be defined, so one argument is enough for this function. You don't need to pass center and cov as in function(m, center, cov); you can just set these two inside mahalanobis(), as you wrote. Or you can think of it this way: the value of clustering_distance_rows is not mahalanobis itself but a modified version of mahalanobis with fixed values for center and cov:
mahalanobis2 = function(m) {
    # center and cov are fixed here, so the function needs only the matrix m
    m_cov = cov(m)
    mahalanobis(m, center = FALSE, cov = m_cov)
}
Heatmap(...,
    clustering_distance_rows = mahalanobis2
)
On the other hand, you can perform clustering on rows/columns first, then pass the clustering object (e.g. a hclust or dendrogram object) to cluster_rows or cluster_columns:
row_dist = mahalanobis(Y[genes.3PC, ], center = FALSE, cov = cov(Y[genes.3PC, ]))
row_hclust = hclust(row_dist)
Heatmap(..., cluster_rows = row_hclust)
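A caveat on the two snippets above, offered as a sketch rather than a definitive fix: stats::mahalanobis() returns a numeric vector of squared distances from each row to center, whereas hclust() expects a dist object, and a single-argument function given to clustering_distance_rows is also expected to return a dist object. Assuming the goal is pairwise Mahalanobis distances between rows, one option is to whiten the matrix with the Cholesky factor of its covariance and then call dist(). The helper name mahalanobis_dist and the toy matrix are hypothetical, and the center argument drops out because only differences between rows are used.

```r
# Hypothetical helper (not part of ComplexHeatmap): pairwise Mahalanobis
# distances between the rows of m, returned as a "dist" object.
mahalanobis_dist <- function(m) {
  S <- cov(m)          # covariance of the columns; must be non-singular
  R <- chol(S)         # upper-triangular factor with S = t(R) %*% R
  z <- m %*% solve(R)  # whiten: Euclidean distance on z equals Mahalanobis on m
  dist(z)
}

set.seed(1)
m <- matrix(rnorm(60), nrow = 20, ncol = 3)  # toy data: 20 rows, 3 columns
d <- mahalanobis_dist(m)
row_hclust <- hclust(d)                      # hclust() accepts a dist object

# With ComplexHeatmap loaded, either form could then be used:
# Heatmap(m, clustering_distance_rows = mahalanobis_dist)
# Heatmap(m, cluster_rows = row_hclust)
```

This still requires cov(m) to be invertible, so it does not avoid the singular-matrix error by itself.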
The error you showed is not caused by the ComplexHeatmap package; it occurs because the matrix involved in the Mahalanobis distance calculation is singular.
Thanks very much for your detailed answer. I understand it now. Maybe off topic for ComplexHeatmap, but do you have any idea about the singular error? Given a gene expression matrix, how should I choose genes for the heatmap and do bi-clustering so as to avoid this singular error?
I googled around and found http://stackoverflow.com/questions/21451664/system-is-computationally-singular-error
Thanks!
Sorry, I cannot help you with this; I have very limited knowledge of matrix algebra. But in the case of gene expression, I think the mahalanobis method can only be used on columns, because the number of rows in a gene expression matrix is normally larger than the number of samples. mahalanobis() uses solve() to calculate the inverse of the covariance matrix. You can check the help page of the solve() function.
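To make the singularity point concrete, here is a minimal base R sketch on random toy matrices (the names tall and wide are made up for illustration). cov(m) is an ncol(m) x ncol(m) matrix estimated from nrow(m) observations, so its rank is at most nrow(m) - 1; when there are fewer rows than columns it cannot be inverted, and solve() would be expected to stop with a singularity error.

```r
set.seed(42)

# More observations (rows) than variables (columns): covariance is invertible.
tall <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)
inv_tall <- solve(cov(tall))

# Fewer observations than variables: the 10 x 10 covariance is rank deficient.
wide <- matrix(rnorm(4 * 10), nrow = 4, ncol = 10)
rank_wide <- qr(cov(wide))$rank   # at most nrow(wide) - 1

# Capture the outcome instead of stopping; a singularity error is expected here.
result <- tryCatch(solve(cov(wide)), error = function(e) conditionMessage(e))
```

Checking qr(cov(m))$rank against ncol(m) before calling mahalanobis() is one cheap way to see whether the inversion can succeed.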
Thanks for the suggestion. I should only calculate it for columns (samples).
Hi,
According to this post https://liorpachter.wordpress.com/2014/01/19/why-do-you-look-at-the-speck-in-your-sisters-quilt-plot-and-pay-no-attention-to-the-plank-in-your-own-heat-map/
Mahalanobis distance is good for high-variance, highly expressed genes. How can I use it in ComplexHeatmap? I googled around:
From the documentation
There are three ways to specify distance metric for clustering:
mahalanobis() has to specify at least 2 arguments: the matrix m itself and a covariance matrix cov. It would be better to let ComplexHeatmap receive any number of arguments, so I can put center = FALSE as well.
Thanks! Tommy