OpenMendel / SnpArrays.jl

Compressed storage for SNP data
https://openmendel.github.io/SnpArrays.jl/latest

Out of memory with grm #122

Closed: espenmei closed this issue 1 year ago

espenmei commented 1 year ago

Hi,

Thanks for a great package!

I'm trying to compute a GRM for a genotype matrix of 100,000 individuals and 1,000,000 SNPs. I have enough memory available to hold the final GRM, but I run out of memory during the computation. I am now trying to compute the GRM in batches, using the cinds argument of the grm function and accumulating the results over the batches. This seems to work for some test cases, but it still allocates too much when I apply it to the whole genotype matrix. I have no idea how to choose the number of batches, so this has been pure guessing.

I'm wondering if there are any built-in routines in SnpArrays for this type of situation that I have missed, or if you have any suggestions for how to approach this kind of problem?

Sorry if this is not the right forum for these types of practical problems. All the best, Espen

Hua-Zhou commented 1 year ago

The resulting GRM will be 100,000-by-100,000, occupying about 80GB of memory. How much memory do you have left? Assuming your computer has 128GB of RAM, there are around 48GB remaining. In this case I'd use a block size of, say, 10,000. Instead of using the grm function, let's write a loop that accumulates the GRM matrix manually in a more memory-efficient way. For each block, we first convert a 100,000-by-10,000 block of the SnpArray to a Float64 matrix G, taking about 8GB. Then we compute G * G' and add it to the GRM matrix. The whole procedure should use around 90GB of memory.
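The memory budget above can be double-checked with quick arithmetic; a small sketch (Python used here purely for illustration, Float64 = 8 bytes per entry):

```python
# Rough memory estimates for the blocked GRM computation.
m, n, block_size = 100_000, 1_000_000, 10_000

grm_bytes = m * m * 8             # the m-by-m GRM matrix
block_bytes = m * block_size * 8  # one m-by-block_size Float64 block G

print(f"GRM:   {grm_bytes / 1e9:.0f} GB")                  # 80 GB
print(f"block: {block_bytes / 1e9:.0f} GB")                # 8 GB
print(f"total: {(grm_bytes + block_bytes) / 1e9:.0f} GB")  # 88 GB
```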

The following pseudo-code assumes block_size divides the number of SNPs.

using SnpArrays, LinearAlgebra

# assume S is a 100,000-by-1,000,000 SnpArray
m, n = size(S)
block_size = 10_000
# number of blocks (integer division; block_size divides n by assumption)
B = n ÷ block_size
# normalizing constant in GRM
α = inv(2n)
# pre-allocate GRM matrix and block genotype matrix G
Φ = zeros(m, m)
G = zeros(m, block_size)
for b in 1:B
    colinds = ((b - 1) * block_size + 1):(b * block_size)
    # convert SnpArray block to a Float64 matrix (impute, center, scale)
    @views Base.copyto!(G, S[:, colinds], model=ADDITIVE_MODEL, impute=true, center=true, scale=true)
    # accumulate GRM (only upper triangular part)
    BLAS.syrk!('U', 'N', α, G, 1.0, Φ)
end
# copy upper triangular part to the lower triangular part
LinearAlgebra.copyto!(Φ, 'U')
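For readers who want to sanity-check the blocking logic: accumulating per-block cross-products is mathematically identical to the one-shot product, since the columns are standardized independently. A toy NumPy analogue (Python, not the SnpArrays API, with made-up small sizes) demonstrating the equivalence:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, block_size = 50, 200, 40  # toy sizes; block_size divides n
G_full = rng.standard_normal((m, n))
alpha = 1.0 / (2 * n)

# direct computation of the full cross-product
Phi_direct = alpha * (G_full @ G_full.T)

# blocked accumulation, mirroring the Julia loop above
Phi = np.zeros((m, m))
for b in range(n // block_size):
    cols = slice(b * block_size, (b + 1) * block_size)
    Gb = G_full[:, cols]
    Phi += alpha * (Gb @ Gb.T)

print(np.allclose(Phi, Phi_direct))  # True
```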
espenmei commented 1 year ago

Well, that worked excellently! Thanks a lot for both the code and the reasoning about memory usage. I could not get LinearAlgebra.copyto!(Φ, 'U') to do what I intended, but I ended up returning a symmetric view, Symmetric(Φ, :U), as that is how I use it later anyway. Thanks for the great work on this package.

Hua-Zhou commented 1 year ago

Sorry, LinearAlgebra.copyto!(Φ, 'U') is a typo. It should be LinearAlgebra.copytri!(Φ, 'U').
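For anyone reading later: copytri!(Φ, 'U') simply copies the upper triangle onto the lower triangle, turning the syrk! result into a full symmetric matrix. The same effect can be mimicked in NumPy like so (illustrative Python, not the Julia API):

```python
import numpy as np

# matrix with data only in the upper triangle, like Phi after syrk!
Phi = np.triu(np.arange(16.0).reshape(4, 4))

# copy the upper triangle to the lower triangle, like copytri!(Phi, 'U')
upper = np.triu(np.ones_like(Phi, dtype=bool))
Phi_sym = np.where(upper, Phi, Phi.T)

print(np.allclose(Phi_sym, Phi_sym.T))  # True
```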

I'm glad your problem is solved.