behinger opened this issue 3 years ago
Re your data origins: I'm not sure it makes sense to include the baseline intervals in the window for permutation tests.
Yes, but I have the same (or similar) problems at least until 100 ms after stimulus onset.
The problem occurs before we even get to the permutations:
julia> using MixedModelsPermutations: inflation_factor
julia> inflation_factor(m2)
ERROR: LAPACKException(4)
Running the steps from that function:
julia> using Statistics, LinearAlgebra
julia> σ = sdest(m2);
julia> σres = std(residuals(m2); corrected=false);
julia> trm, re = first(m2.reterms), first(ranef(m2));
julia> λmle = trm.λ * σ;
julia> λemp = cholesky(cov(re'; corrected=false)).L
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.
The covariance matrix is indeed rank deficient:
julia> rank(cov(re'; corrected=false))
3
julia> m2.rePCA
(subject = [0.6488404385966617, 0.9132263937810167, 1.0, 1.0],)
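Equivalently (just as a quick sanity check, not part of inflation_factor), one can look at the eigenvalues of the empirical covariance directly; the smallest one is numerically indistinguishable from zero, which is exactly what makes the plain Cholesky fail:

eigvals(Symmetric(cov(re'; corrected=false)))   # smallest eigenvalue is numerically zero (possibly slightly negative)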
Models with the correlation parameters zeroed out work because zerocorr forces the correlations to be exactly zero, which constrains the resulting covariance matrix:
julia> fzc = @formula y ~ 1 + category * condition * baseline + zerocorr(1 + category * condition | subject);
julia> m2zc = fit(MixedModel, fzc, evt);
julia> m2zc.rePCA
(subject = [0.25000000000000006, 0.5000000000000001, 0.7500000000000001, 1.0],)
julia> inflation_factor(m2zc)
2-element Vector{Any}:
[0.9772834340644585 0.0 0.0 0.0; 0.07273963059996436 0.8963867476928497 0.0 0.0; -0.24532517740979073 0.45437097494260126 0.7722095624800227 0.0; -0.14632129083619883 -0.35293923152747453 0.15872507826330157 0.42987670596391353]
1.0054889079461018
Alternatively, you can drop the interaction term from the random effects, which reduces the dimension of the covariance matrix to match its rank:
julia> f_not_maximal = @formula y ~ 1 + category * condition * baseline + (1 + category + condition | subject);
julia> m2_not_maximal = fit(MixedModel, f_not_maximal, evt);
julia> m2_not_maximal.rePCA # notice that this vector is length 3
(subject = [0.5525802747460526, 0.8857211980484779, 1.0],)
julia> inflation_factor(m2_not_maximal)
2-element Vector{Any}:
[0.9784108356832492 0.0 0.0; 0.04833800447889176 0.9078979595219956 0.0; 0.0028553489490354077 0.13497146496473686 0.8652989947068537]
1.004688491366803
The underlying problem seems to be the old "maximal models are a bad idea" thing.
It works without the interaction term in the random effects because the conditions aren't exactly identical in the baseline time window (which is why we baseline-correct to begin with!). The random interaction term, however, is the between-subject variation in (the difference between conditions (interaction) of the difference between conditions (main effect)), and here the innermost difference is essentially just noise, so it's not surprising that the estimated top-level variation collapses to zero.
Another way to look at this problem is that one of the covariance matrices is not positive definite. In the inflation_factor function, for this dataset, I think it is the covariance matrix estimated from the model: λmle = trm.λ * σ.
Maybe the problem could be solved using a "nearest positive definite" algorithm. I don't really know the details of how it works, but it could do the job:
using LinearAlgebra, ProximalOperators

x = λmle * λmle'            # covariance matrix implied by the MLE factor
x, _ = prox(IndPSD(), x)    # project onto the cone of positive semidefinite matrices
isposdef(x)
x, _ = prox(IndPSD(), x)    # project again and re-check
isposdef(x)
cholesky(x).L - λmle        # compare the factor of the projected matrix with the original λmle
However, I am not sure whether it is a safe solution, as the resulting positive definite matrix may differ from the initial λmle * λmle'.
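One quick way to gauge that (just a suggestion, reusing x from the snippet above) would be to look at how far the projection moved the matrix:

norm(x - λmle * λmle')   # Frobenius norm of the change introduced by the projection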
Thanks @jaromilfrossard! Two further thoughts:
It's possible to compute a pivoted Cholesky factor of a positive semidefinite matrix. We could then apply that same pivot to the MLE estimate, compute the scaling factor, and then undo the pivot. I think I might have tried (part of) this and it didn't work, but maybe it's something to revisit.
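For reference, a rough, untested sketch of that idea (reusing re and λmle from above; this is not what the package does, and the later steps can still fail on singular factors, which may be where it broke down):

using LinearAlgebra
S = cov(re'; corrected=false)                          # empirical covariance: PSD, but rank deficient
C = cholesky(Symmetric(S), RowMaximum(); check=false)  # pivoted Cholesky succeeds on semidefinite input
p = C.p                                                # pivot permutation (C.rank gives the numerical rank)
λemp_piv = C.L                                         # empirical factor, in pivoted order
# factor the MLE-implied covariance under the same pivot (check=false because it can be singular too)
λmle_piv = cholesky(Symmetric((λmle * λmle')[p, p]); check=false).L
# the right-division blows up on the (near-)zero diagonal entries of λemp_piv,
# which may be exactly where this approach runs into trouble
scaling = λmle_piv / λemp_piv
scaling[invperm(p), invperm(p)]                        # undo the pivot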
One thing I discussed with @behinger is a somewhat more nonparametric approach that uses the mixed model to determine the permutation structure, but essentially works as a permutation of the within-participants/items OLS estimates:
The mixed model would still be used for computing the test statistics and for determining the permutation blocking structure, but the permutation itself would be much closer to a ter Braak permutation scheme on OLS models fitted within groups.
I haven't tested this idea yet.
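Purely as an illustration of that idea (nothing here is implemented; the function name and signature are made up), the within-group part could look roughly like this: fit OLS per group, keep the fitted values, and permute the residuals within each group before refitting the mixed model on the permuted response.

using Random

# hypothetical sketch: ter Braak-style permutation of within-group OLS residuals.
# X is the fixed-effects design matrix, y the response, g the grouping variable
# (e.g. subject), all row-aligned.
function terbraak_permute(rng::AbstractRNG, X::AbstractMatrix, y::AbstractVector, g::AbstractVector)
    ynew = similar(y, Float64)
    for group in unique(g)
        idx = findall(==(group), g)
        Xs, ys = X[idx, :], y[idx]
        β = Xs \ ys                              # within-group OLS estimate
        yhat = Xs * β                            # fitted values stay fixed
        resid = ys .- yhat
        ynew[idx] = yhat .+ shuffle(rng, resid)  # permute residuals within the group
    end
    return ynew
end

The permuted response could then be passed back to the mixed model (e.g. via refit!) to compute the test statistic under that permutation, with the grouping taken from the model's random-effects structure.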
Attached dataset: bug.txt (1 MB; this is actually a CSV, but due to GitHub restrictions I had to change the extension to .txt).
This sometimes* results in

ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.

on the last line, or in ERROR: LinearAlgebra.LAPACKException(4). It works with zerocorr, though.**

* Sometimes, not sure exactly when. E.g. when I dropped baseline I got the LAPACKException, but I think with the model above I also got the LAPACKException before. Not sure if that matters; it hints at a similar problem.

** The simplest formula that fails for me is f = @formula(y ~ 1 + (1 + category & condition | subject)) - but that one also fails with zerocorr.
Edit: I found another dataset where this occurs with just (1 | subject). The variance estimate of the model is 0 in that case, so I guess it makes a lot of sense that the permutation has trouble. Not sure how I should proceed, because more-or-less random data will always exist in the baseline period (or shortly before activity onset).
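One possible guard (just a suggestion, with m standing in for the fitted model on that dataset) would be to check for a singular/boundary fit before attempting the permutation, since a zero variance estimate is exactly this degenerate case:

using MixedModels
issingular(m)   # true when a variance component (or correlation) is estimated at the boundary
VarCorr(m)      # inspect the estimated variance components directly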