Open Hepit opened 1 year ago
Hi,
The p-values should be the same when the step 1 LOCO predictions are all 0. Are you running the step 2 test on the same chromosome that was used in the single-chromosome step 1 run? (While the LOCO predictions for that chromosome will be zero, that will not be the case for the remaining chromosomes.) And how different are the p-values? (There can be a small epsilon difference due to numerical error.)
Cheers, Joelle
Thanks for your response! To your first question: yes, I am indeed running step 2 on the same chr used to create the single-chr step 1 file. (As a confirmation, I reran this step with an artificially fully-zeroed-out step 1 file, and obtained the exact same p-values.)
As for the difference between the p-values from this single-chr run and those from the --ignore-pred run, it can sometimes reach up to 5%; I'm not sure how high the epsilon difference can go. I'm attaching the actual p-value numbers for a few hundred variants for these two runs (as well as for a third run, correctly done with whole-genome step 1) so you can take a look for yourself if you like.
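One way to put a number on "how different" is sketched below. This is only an illustration: the function name, the dict-based input and the example values are all hypothetical, and it assumes the p-values have already been extracted from the two runs (if they are stored as -log10(p), they would need converting with `10 ** -x` first).

```python
def max_relative_difference(pvals_a, pvals_b):
    """Return (variant_id, relative difference) for the most discordant shared variant.

    Both arguments are hypothetical dicts mapping variant ID -> p-value.
    The relative difference is |a - b| / max(a, b).
    """
    shared = sorted(pvals_a.keys() & pvals_b.keys())
    if not shared:
        raise ValueError("the two runs share no variants")
    worst_id, worst = shared[0], -1.0
    for vid in shared:
        a, b = pvals_a[vid], pvals_b[vid]
        denom = max(a, b)
        rel = abs(a - b) / denom if denom > 0 else 0.0
        if rel > worst:
            worst_id, worst = vid, rel
    return worst_id, worst


# Made-up numbers, purely for illustration.
single_chr = {"rs1": 0.050, "rs2": 0.500, "rs3": 0.010}
ignore_pred = {"rs1": 0.050, "rs2": 0.475, "rs3": 0.010}
print(max_relative_difference(single_chr, ignore_pred))
```

A difference on the order of floating-point rounding would be many orders of magnitude smaller than the few-percent gap in this toy example.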
That's strange: the .loco file should have all 0s for the one chromosome that was included in the step 1 run.
Can you send a link to the .loco file you are using as well as the REGENIE logs for Step 1 & step 2 for the single chr run (with --pred and --ignore-pred)?
Just to be clear, the .loco does have all 0s for that chr. The numbers linked in my previous comment were p-values.
Logs : step1_chr9.log step2_chr9.log step2_nopred.log
(Although the step2 logfiles indicate step 2 was run over the whole genome, only chr9 snps were used for the p-value comparison.)
The loco files are too large to attach, but here are screenshots to get an idea of their form : step1_chr9_1 loco step1_chr9_2 loco
And the artificial loco file I used and that also yielded the same results (for chr9 snps) as the chr9-only loco file above : step1_chr9_1 loco_zeroed
Hi,
I have checked this using the toy data in the example/ directory and am getting unchanged p-values between single-chromosome step 1 runs and --ignore-pred runs. Can you double check that the LOCO predictions for chr9 are all 0s in step1_chr9_[12].loco? e.g. sed '10q;d' step1_chr9_[12].loco | tr ' ' '\n' | sort -u
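The same check can be sketched in Python, one file at a time. This is a minimal sketch rather than anything shipped with REGENIE: the function name is made up, and it assumes the .loco layout is whitespace-delimited with a header row of sample IDs followed by one row per chromosome whose first field is the chromosome label.

```python
import io


def loco_row_is_zero(lines, chrom, tol=0.0):
    """Return True if every prediction on the row for `chrom` is within `tol` of 0.

    `lines` is any iterable of text lines (an open file works).
    Assumed layout: header row, then one row per chromosome, where the
    first field is the chromosome label and the rest are predictions.
    """
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != str(chrom):
            continue
        return all(abs(float(v)) <= tol for v in fields[1:])
    raise ValueError(f"no row found for chromosome {chrom}")


# Tiny in-memory stand-in for a .loco file (made-up values).
demo = io.StringIO("FID_IID 1_1 2_2 3_3\n9 0 0 0\n21 0.12 -0.05 0\n")
print(loco_row_is_zero(demo, 9))

# With a real file, something like:
# with open("step1_chr9_1.loco") as fh:
#     print(loco_row_is_zero(fh, 9))
```

Matching on the chromosome label rather than a fixed line number avoids depending on which chromosomes are present in the file.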
Also, I will gladly take a look if you can send the input files you are using to get these results.
Cheers, Joelle
Hi,
I wish I could provide the input files, but unfortunately I can't share UK Biobank files. I can however attach my code and logs. What I'm attaching here is the whole operation redone with chr 21 instead of chr 9, but the observation is the same. I'm also attaching a condensed version of the output instead of the whole .regenie files, for space reasons, and the script used to make it. (Also included is a "correct" run with all chrs used in step 1, just for comparison.)
I did double-check: the LOCO predictions for chr9 (or, here, chr21) are indeed 0 across the whole row.
Scripts : step1_singlechr.sh.txt step1_allchrs.sh.txt
step2_singlechr.sh.txt step2_nopred.sh.txt step2_allchrs.sh.txt
P-value comparison : compare.r.txt pval_comparison.txt
Logs : step1_singlechr_chr21.log step1_allchrs.log step2_singlechr_chr21.log step2_allchrs.log step2_nopred.log
Hello,
I recently ran regenie on a genome-wide dataset for a binary trait, mistakenly running step 1 and then step 2 on each chromosome individually (and only later understanding that I should be running step 1 on a whole-genome file). My understanding is that regenie uses the step 1 model as a sort of offset applied to the regression in step 2. From the docs,
So if an empty step1 .loco file (ie, one where the relevant row for the chr of interest is zeroed out) is passed to step2, as it was in my analysis, I would understand that offset to be 0, and thus expect step2 to correspond to a standard, vanilla logistic regression. However, this is not the case : I went back and re-ran step2 with the --ignore-pred flag (which "corresponds to simple linear/logistic regression", according to here), and got different P-values in the output.
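The expectation described above, that an all-zero offset reduces the model to plain logistic regression, can be illustrated with a minimal sketch. This is not REGENIE's implementation (which for binary traits can also involve covariates and optional Firth or SPA corrections); `fit_logistic` and the toy genotypes, phenotypes and offsets are all made up for illustration.

```python
import math


def fit_logistic(x, y, offset=None, iters=25):
    """Newton-Raphson fit of logit P(y=1) = offset + b0 + b1 * x; returns (b0, b1)."""
    off = [0.0] * len(x) if offset is None else list(offset)
    b0 = b1 = 0.0
    for _ in range(iters):
        # Accumulate the score vector (g) and the information matrix (h).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, oi in zip(x, y, off):
            p = 1.0 / (1.0 + math.exp(-(oi + b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 system h * step = g and take the Newton step.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Toy genotype dosages and case/control labels (made up).
x = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
y = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0]

plain = fit_logistic(x, y)
zero_offset = fit_logistic(x, y, offset=[0.0] * len(x))
nonzero_offset = fit_logistic(x, y, offset=[0.8, -0.5, 0.3, -0.9, 0.6, 0.1, -0.4, 0.7, -0.2, 0.5])

print(plain == zero_offset)  # True: a zero offset leaves the fit unchanged
print(plain, nonzero_offset)
```

In this toy setting only a nonzero offset changes the estimates, which is the behaviour the question assumes for step 2.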
So here are my questions :
Thanks a lot for your help !