saigegit / SAIGE

Development for SAIGE and SAIGE-GENE(+)
GNU General Public License v3.0

Error running SAIGE using unimputed data #130

Open zrayw opened 8 months ago

zrayw commented 8 months ago

Dear Developer,

We were trying to run an older SAIGE (0.45) on unimputed data. In SAIGE step 2, we set IsDropMissingDosages=TRUE since we don't have any imputed genotypes. We got the following error:

Error in solve.default(XVX) : system is computationally singular: reciprocal condition number = 6.31709e-20
Calls: SPAGMMATtest ... ScoreTest_NULL_Model -> solve -> solve -> solve.default

Could the reason be that so many genotypes were discarded? Do you have any suggestions for running SAIGE on unimputed data? Btw, we could successfully run SAIGE with IsDropMissingDosages=FALSE, but in that case SAIGE would impute the missing genotypes with mean values, which would cause bias.

Thanks so much!!

mattimpat commented 8 months ago

It seems the problem is caused by SAIGE converting the MAC threshold into an equivalent MAF threshold, based on the total number of samples. This does not work when many genotypes are missing, because a variant can meet the equivalent MAF threshold even when it does not meet the MAC threshold. It can be fixed by checking the MAC threshold directly for each variant, rather than the equivalent MAF.
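To make the mismatch concrete, here is a minimal Python sketch of the two checks. It is not SAIGE's code: the function names, the `min_mac` value, and the toy genotypes are all made up for illustration, and it assumes the equivalent MAF threshold is min_mac / (2 * total samples) while the variant's MAF is computed over non-missing samples only.

```python
def minor_allele_count(genotypes):
    """Return (MAC, number of non-missing samples).

    genotypes: list of 0/1/2 allele counts, with None for a missing call.
    """
    observed = [g for g in genotypes if g is not None]
    alt_alleles = sum(observed)
    total_alleles = 2 * len(observed)
    return min(alt_alleles, total_alleles - alt_alleles), len(observed)


def passes_equivalent_maf(genotypes, min_mac, n_total):
    """Hypothetical check: compare MAF to a threshold derived from n_total."""
    mac, n_observed = minor_allele_count(genotypes)
    if n_observed == 0:
        return False
    maf = mac / (2 * n_observed)                # MAF among non-missing samples
    equivalent_maf = min_mac / (2 * n_total)    # assumes nobody is missing
    return maf >= equivalent_maf


def passes_mac(genotypes, min_mac):
    """Suggested check: compare the variant's own MAC to the threshold."""
    mac, _ = minor_allele_count(genotypes)
    return mac >= min_mac


# Toy variant: 1000 samples, 900 missing, 2 heterozygotes among the rest.
n_total = 1000
min_mac = 10  # illustrative threshold, not a SAIGE default
genotypes = [1, 1] + [0] * 98 + [None] * 900

print(passes_equivalent_maf(genotypes, min_mac, n_total))  # MAF 0.01 vs 0.005
print(passes_mac(genotypes, min_mac))                      # MAC 2 vs 10
```

In this toy case the variant passes the equivalent MAF check but has a MAC of only 2, so it would reach the score test with almost no minor alleles. Checking the MAC per variant filters it out.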