ya-guo closed 7 years ago
Matrix balancing is the decomposition of a square matrix into a doubly stochastic matrix (flat, i.e. all row and column sums are equal) and a set of balancing weights. The IC part of ICE is an algorithm for balancing a symmetric matrix such as a Hi-C contact matrix. Matrix balancing algorithms have been rediscovered several times in different fields for different purposes (e.g. in statistical modeling and numerical linear algebra). Lior Pachter even wrote an interesting blog post about it.
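To make the idea concrete, here is a minimal sketch of iterative correction on a small dense symmetric matrix in plain Python. It is only an illustration of the technique, not cooler's code: the function name `iterative_correction`, its parameters and the toy matrix are made up for this example, and it assumes every row has a non-zero sum.

```python
import math


def iterative_correction(matrix, max_iters=500, tol=1e-12):
    """Balance a symmetric, non-negative square matrix (dense toy version).

    Returns (balanced, bias) with balanced[i][j] == matrix[i][j] / (bias[i] * bias[j])
    and every row (and column) of `balanced` summing to about 1.
    Assumes no row sums to zero.
    """
    n = len(matrix)
    bias = [1.0] * n
    balanced = [[float(v) for v in row] for row in matrix]

    for _ in range(max_iters):
        row_sums = [sum(row) for row in balanced]
        mean_sum = sum(row_sums) / n
        # Relative deviation of each row sum from the mean row sum.
        delta = [s / mean_sum for s in row_sums]
        if max(abs(d - 1.0) for d in delta) < tol:
            break
        # Divide rows and columns by the same factors to keep the matrix symmetric.
        for i in range(n):
            for j in range(n):
                balanced[i][j] /= delta[i] * delta[j]
        bias = [b * d for b, d in zip(bias, delta)]

    # Row sums are now (nearly) equal; rescale so that they equal 1.
    total = sum(balanced[0])
    scale = math.sqrt(total)
    bias = [b * scale for b in bias]
    balanced = [[v / total for v in row] for row in balanced]
    return balanced, bias


# Hypothetical 3x3 symmetric "contact matrix".
raw = [[4, 2, 1],
       [2, 3, 2],
       [1, 2, 5]]
balanced, bias = iterative_correction(raw)
row_sums = [sum(row) for row in balanced]
```

After the call, `row_sums` should all be close to 1, and multiplying `balanced[i][j]` by `bias[i] * bias[j]` should recover the raw value.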
The balancing algorithm implemented in cooler is a sparse, parallel and out-of-core (i.e. it works in chunks that fit in memory) version of the iterative correction method in Imakaev et al. One trivial difference is that the balancing weights cooler outputs are multiplicative (that is, 1/bias as defined in the original paper). A sparse, out-of-core implementation was necessary to scale up to very large, high-resolution Hi-C data.
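As a rough illustration of what "sparse" and "out-of-core" mean here, the sketch below balances a matrix stored as upper-triangle `(bin1, bin2, count)` triples, visiting them one chunk at a time and keeping only the per-bin weights in memory. This is not cooler's implementation: the helper names, the chunk size and the toy pixels are hypothetical, parallelism is left out, and any bin filtering a real tool would apply is omitted. The weights are multiplicative, so a balanced value is `count * weights[i] * weights[j]`, and each weight corresponds to 1/bias in the paper's notation.

```python
import math


def chunks(pixels, size):
    """Yield successive slices of the pixel list, standing in for reads from disk."""
    for start in range(0, len(pixels), size):
        yield pixels[start:start + size]


def weighted_marginals(pixels, weights, chunk_size):
    """Row sums of the weighted matrix, accumulated one chunk at a time."""
    marg = [0.0] * len(weights)
    for chunk in chunks(pixels, chunk_size):
        for i, j, count in chunk:
            value = count * weights[i] * weights[j]
            marg[i] += value
            if i != j:  # off-diagonal pixels are stored once but count for both bins
                marg[j] += value
    return marg


def balance_sparse(pixels, n_bins, chunk_size=2, max_iters=500, tol=1e-12):
    """Return multiplicative weights that make all marginals about 1.

    Assumes every bin has at least one non-zero pixel.
    """
    weights = [1.0] * n_bins
    for _ in range(max_iters):
        marg = weighted_marginals(pixels, weights, chunk_size)
        mean_marg = sum(marg) / n_bins
        delta = [m / mean_marg for m in marg]
        if max(abs(d - 1.0) for d in delta) < tol:
            break
        # Multiplicative weights shrink where the marginal is too large.
        weights = [w / d for w, d in zip(weights, delta)]
    # Rescale so the common marginal is 1 rather than an arbitrary constant.
    marg = weighted_marginals(pixels, weights, chunk_size)
    scale = math.sqrt(sum(marg) / n_bins)
    return [w / scale for w in weights]


# Hypothetical upper-triangle pixels of a 3-bin symmetric matrix.
pixels = [(0, 0, 4), (0, 1, 2), (0, 2, 1), (1, 1, 3), (1, 2, 2), (2, 2, 5)]
weights = balance_sparse(pixels, n_bins=3)
final_marginals = weighted_marginals(pixels, weights, chunk_size=2)
```

Only one chunk of pixels and the weight vector need to be in memory at any time, which is the property that lets this kind of approach scale to very large matrices.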
Do we need to correct for sequencing depth, by downsampling or scaling the data, before balancing? Thanks.
Could you explain the difference between balancing and ICE (iterative correction and eigenvector decomposition), how to understand balancing, and whether the balancing used in cooler is an update for Hi-C normalisation?