OpenMendel / SnpArrays.jl

Compressed storage for SNP data
https://openmendel.github.io/SnpArrays.jl/latest
Other
44 stars 9 forks source link

A*x not correct when impute = true #91

Closed biona001 closed 3 years ago

biona001 commented 3 years ago

MWE

using Revise
using SnpArrays
using LinearAlgebra
using Random
using LoopVectorization
using MendelIHT
using Test

n = 4097
p = 10000
x = simulate_random_snparray(undef, n, p, min_ma=0) # no missing data
x[1, 1025] = 0x01 # missing

A = SnpLinAlg{Float64}(x, model=ADDITIVE_MODEL, impute=true, center=false, scale=false)
Atrue = convert(Matrix{Float64}, x, model=ADDITIVE_MODEL, impute=true, center=false, scale=false)
b = ones(p)
c = A * b
ctrue = Atrue * b
@test all(c .≈ ctrue) # this errors!

We can check visually that c != ctrue

julia> [c ctrue]
4097×2 Matrix{Float64}:
 5097.15  5097.92
 5021.0   5021.0
 5099.0   5099.0
 5004.0   5004.0
 5081.0   5081.0
 5008.0   5008.0
 5052.0   5052.0
 5024.0   5024.0
 5080.0   5080.0
 5138.0   5138.0
 5061.0   5061.0
 4956.0   4956.0
 5015.0   5015.0
 5036.0   5036.0
 5077.0   5077.0
 5053.0   5053.0
 5055.0   5055.0
    ⋮

This line (and many similar lines) is the culprit: when impute=true,μ[j] is looping from j in 1:cols. This will be fixed when my matrix-matrix mul routine is done (which will be in the next few days).

For now, do not use impute=true for SnpLinAlg.