behinger opened this issue 3 years ago
Re your data origins: I'm not sure it makes sense to include the baseline intervals in the window for permutation tests.
Yes, but I have the same (or similar) problems at least until 100 ms after stimulus onset.
The problem occurs before we even get to the permutations:
julia> using MixedModelsPermutations: inflation_factor
julia> inflation_factor(m2)
ERROR: LAPACKException(4)
Running the steps from that function:
julia> using Statistics, LinearAlgebra
julia> σ = sdest(m2);
julia> σres = std(residuals(m2); corrected=false);
julia> trm, re = first(m2.reterms), first(ranef(m2));
julia> λmle = trm.λ * σ;
julia> λemp = cholesky(cov(re'; corrected=false)).L
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.
The covariance matrix is indeed rank deficient:
julia> rank(cov(re'; corrected=false))
3
julia> m2.rePCA
(subject = [0.6488404385966617, 0.9132263937810167, 1.0, 1.0],)
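Equivalently (just as a quick sanity check, not part of inflation_factor), one can look at the eigenvalues of the empirical covariance directly; the smallest one is numerically indistinguishable from zero, which is exactly what makes the plain Cholesky fail:

eigvals(Symmetric(cov(re'; corrected=false)))   # smallest eigenvalue is numerically zero (possibly slightly negative)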
Models with the correlation parameters zeroed out work because zerocorr forces the correlations to be exactly zero, which constrains the resulting covariance matrix:
julia> fzc = @formula y ~ 1 + category * condition * baseline + zerocorr(1 + category * condition | subject);
julia> m2zc = fit(MixedModel, fzc, evt);
julia> m2zc.rePCA
(subject = [0.25000000000000006, 0.5000000000000001, 0.7500000000000001, 1.0],)
julia> inflation_factor(m2zc)
2-element Vector{Any}:
[0.9772834340644585 0.0 0.0 0.0; 0.07273963059996436 0.8963867476928497 0.0 0.0; -0.24532517740979073 0.45437097494260126 0.7722095624800227 0.0; -0.14632129083619883 -0.35293923152747453 0.15872507826330157 0.42987670596391353]
1.0054889079461018
Alternatively, you can drop the interaction term from the random effects, which reduces the dimension of the covariance matrix to match its rank:
julia> f_not_maximal = @formula y ~ 1 + category * condition * baseline + (1 + category + condition | subject);
julia> m2_not_maximal = fit(MixedModel, f_not_maximal, evt);
julia> m2_not_maximal.rePCA # notice that this vector is length 3
(subject = [0.5525802747460526, 0.8857211980484779, 1.0],)
julia> inflation_factor(m2_not_maximal)
2-element Vector{Any}:
[0.9784108356832492 0.0 0.0; 0.04833800447889176 0.9078979595219956 0.0; 0.0028553489490354077 0.13497146496473686 0.8652989947068537]
1.004688491366803
The underlying problem seems to be the old "maximal models are a bad idea" thing.
It works without the interaction term in the random effects because the conditions aren't exactly identical in the baseline time window (which is why we baseline-correct to begin with!). The random interaction term, however, is the between-subject variation in (the difference between conditions (interaction) of the difference between conditions (main effect)), and here the innermost difference is essentially just noise, so it's not surprising that the estimated top-level variation collapses to zero.
Another way to look at this problem is that one of the covariance matrices is not positive definite. In the inflation_factor function, for this dataset, I think it is the covariance matrix estimated from the model: λmle = trm.λ * σ.
Maybe the problem could be solved using a "nearest positive definite" algorithm. I don't really know the details of how it works, but it could do the job:
using LinearAlgebra, ProximalOperators

x = λmle * λmle'            # covariance matrix implied by the MLE factor
x, _ = prox(IndPSD(), x)    # project onto the cone of positive semidefinite matrices
isposdef(x)
x, _ = prox(IndPSD(), x)    # project again and re-check
isposdef(x)
cholesky(x).L - λmle        # compare the factor of the projected matrix with the original λmle
However, I am not sure whether it is a safe solution, as the resulting positive definite matrix may differ from the initial λmle * λmle'.
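One quick way to gauge that (just a suggestion, reusing x from the snippet above) would be to look at how far the projection moved the matrix:

norm(x - λmle * λmle')   # Frobenius norm of the change introduced by the projection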
Thanks @jaromilfrossard! Two further thoughts:
It's possible to compute a pivoted Cholesky factor of a positive semidefinite matrix. We could then apply that same pivot to the MLE estimate, compute the scaling factor, and then undo the pivot. I think I might have tried (part of) this and it didn't work, but maybe it's something to revisit.
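For reference, a rough, untested sketch of that idea (reusing re and λmle from above; this is not what the package does, and the later steps can still fail on singular factors, which may be where it broke down):

using LinearAlgebra
S = cov(re'; corrected=false)                          # empirical covariance: PSD, but rank deficient
C = cholesky(Symmetric(S), RowMaximum(); check=false)  # pivoted Cholesky succeeds on semidefinite input
p = C.p                                                # pivot permutation (C.rank gives the numerical rank)
λemp_piv = C.L                                         # empirical factor, in pivoted order
# factor the MLE-implied covariance under the same pivot (check=false because it can be singular too)
λmle_piv = cholesky(Symmetric((λmle * λmle')[p, p]); check=false).L
# the right-division blows up on the (near-)zero diagonal entries of λemp_piv,
# which may be exactly where this approach runs into trouble
scaling = λmle_piv / λemp_piv
scaling[invperm(p), invperm(p)]                        # undo the pivot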
One thing I discussed with @behinger is a somewhat more nonparametric approach that uses the mixed model to determine the permutation structure, but essentially works as a permutation of the within-participants/items OLS estimates:
The mixed model would still be used for computing the test statistics and for determining the permutation blocking structure, but the permutation itself would be much closer to a ter Braak permutation scheme on OLS models fitted within groups.
I haven't tested this idea yet.
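Purely as an illustration of that idea (nothing here is implemented; the function name and signature are made up), the within-group part could look roughly like this: fit OLS per group, keep the fitted values, and permute the residuals within each group before refitting the mixed model on the permuted response.

using Random

# hypothetical sketch: ter Braak-style permutation of within-group OLS residuals.
# X is the fixed-effects design matrix, y the response, g the grouping variable
# (e.g. subject), all row-aligned.
function terbraak_permute(rng::AbstractRNG, X::AbstractMatrix, y::AbstractVector, g::AbstractVector)
    ynew = similar(y, Float64)
    for group in unique(g)
        idx = findall(==(group), g)
        Xs, ys = X[idx, :], y[idx]
        β = Xs \ ys                              # within-group OLS estimate
        yhat = Xs * β                            # fitted values stay fixed
        resid = ys .- yhat
        ynew[idx] = yhat .+ shuffle(rng, resid)  # permute residuals within the group
    end
    return ynew
end

The permuted response could then be passed back to the mixed model (e.g. via refit!) to compute the test statistic under that permutation, with the grouping taken from the model's random-effects structure.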
Attached dataset: bug.txt (1 MB; this is actually a CSV, but due to GitHub restrictions I had to change the extension to .txt).
This sometimes* results in

ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.

on the last line, or in ERROR: LinearAlgebra.LAPACKException(4). It works with zerocorr, though.**

* Sometimes, not sure exactly when. E.g. when I dropped baseline I got the LAPACKException, but I think with the model above I also got the LAPACKException before. Not sure if that matters; it hints at a similar problem.

** The simplest formula that fails for me is f = @formula(y ~ 1 + (1 + category & condition | subject)) - but that one also fails with zerocorr.
Edit: I found another dataset where this occurs with just (1 | subject). The variance estimate of the model is 0 in that case, so I guess it makes a lot of sense that the permutation has trouble. Not sure how I should proceed, because more-or-less random data will always exist in the baseline period (or shortly before activity onset).
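One possible guard (just a suggestion, with m standing in for the fitted model on that dataset) would be to check for a singular/boundary fit before attempting the permutation, since a zero variance estimate is exactly this degenerate case:

using MixedModels
issingular(m)   # true when a variance component (or correlation) is estimated at the boundary
VarCorr(m)      # inspect the estimated variance components directly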