Closed pjotrp closed 3 years ago
I thought lmm 1 is the wald test, 2 is the likelihood ratio test, 3 is the score test, and perhaps 4 is all three tests?
Yes, you are right
-lmm [num] specify analysis options (default 1).
options: 1: Wald test
2: Likelihood ratio test
3: Score test
4: 1-3
5: Parameter estimation in the null model only
That column of zeroes should not be there :)
If I run the 4 options with
for x in 1 2 3 4 ; do ./bin/gemma-0.98.3-linux-static -lmm $x -bfile example/mouse_hs1940 -k output/mouse_hs1940_data.cXX.txt -o lmm$x
the result is
head -3 output/lmm?.assoc.txt
==> output/lmm1.assoc.txt <==
chr rs ps n_miss allele1 allele0 af beta se logl_H1 l_remle p_wald
1 rs3683945 3197400 0 A G 0.443 -7.965330e-02 6.185099e-02 -1.582126e+03 4.318993e+00 1.980182e-01
1 rs3707673 3407393 0 G A 0.443 -6.654282e-02 6.210234e-02 -1.582377e+03 4.316144e+00 2.841271e-01
==> output/lmm2.assoc.txt <==
chr rs ps n_miss allele1 allele0 af logl_H1 l_mle p_lrt
1 rs3683945 3197400 0 A G 0.443 -1.583892e+03 4.313353e+00 1.977836e-01
1 rs3707673 3407393 0 G A 0.443 -1.584148e+03 4.310280e+00 2.840289e-01
==> output/lmm3.assoc.txt <==
chr rs ps n_miss allele1 allele0 af beta se logl_H1 p_score
1 rs3683945 3197400 0 A G 0.443 -7.965744e-02 6.191532e-02 0.000000e+00 1.984062e-01
1 rs3707673 3407393 0 G A 0.443 -6.650240e-02 6.217649e-02 0.000000e+00 2.848479e-01
==> output/lmm4.assoc.txt <==
chr rs ps n_miss allele1 allele0 af beta se logl_H1 l_remle l_mle p_wald p_lrt p_score
1 rs3683945 3197400 0 A G 0.443 -7.965330e-02 6.185099e-02 -1.583892e+03 4.318993e+00 4.313353e+00 1.980182e-01 1.977836e-01 1.984062e-01
1 rs3707673 3407393 0 G A 0.443 -6.654282e-02 6.210234e-02 -1.584148e+03 4.316144e+00 4.310280e+00 2.841271e-01 2.840289e-01 2.848479e-01
you can see logl_H1 differs for lmm 2 and is not computed with lmm 3. It was code introduced in gemma 0.95alpha. See https://github.com/genetics-statistics/GEMMA/blame/84360c191f418bf8682b35e0c8235fcc3bd19a06/src/lmm.cpp. I added the logl_H1 output about 3 years ago with #92
I also note the time to compute varies quite a bit for the lmm 1-4 versions. lmm 4 being the worst for doing it all:
for x in 1 2 3 4 ; do time ./bin/gemma-0.98.3-linux-static -lmm $x -bfile example/mouse_hs1940 -k output/mouse_hs1940_data.cXX.txt -o lmm$x ; done
real 0m9.939s
user 0m14.895s
sys 0m2.632s
real 0m9.939s
user 0m15.030s
sys 0m2.255s
real 0m2.837s
user 0m8.103s
sys 0m3.003s
real 0m17.482s
user 0m22.239s
sys 0m2.422s
Anyway, from the code -lmm 3
does not compute logl_H1. I'll remove that from the output.
Describe the bug
Running -lmm [1-3] gives the following output:
At least one expects lmm3 to show all 3 calculations! Probably a regression on my end.