saigegit / SAIGE

Development for SAIGE and SAIGE-GENE(+)
GNU General Public License v3.0

Step2 conditional analysis with one variant #24

Open diogomribeiro opened 2 years ago

diogomribeiro commented 2 years ago

Hi there,

I'm using SAIGE v1.0.4 with conditional analysis. This works well when the condition contains a set of variants, but when it contains only one variant (e.g. --condition=1:1:C:T) I get this error:

Error in mainRegionInCPP(genoType, region$genoIndex_prev, region$genoIndex, : Error in function boost::math::cdf(const chi_squared_distribution&, double): Chi Square parameter was -nan, but must be > 0 !
Calls: SPAGMMATtest -> SAIGE.Region -> mainRegionInCPP

Weirdly, if there is a single variant in the condition but it is an indel (e.g. --condition=1:1:CC:T), this works fine. I wonder if there is an issue when parsing the condition string.

If you need more information let me know. Thanks!

saigegit commented 2 years ago

Hi @diogomribeiro,

We haven't seen this error before. Is 1:1:C:T in the testing set too? If so, conditioning markers on themselves can sometimes cause problems, especially when the conditioning variance is very close to zero.

Thanks, Wei
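
To make the point about near-zero variance concrete, here is a minimal sketch in plain Python. It is not SAIGE code, and the genotype vectors and function names are hypothetical. It assumes the simplest case of one tested marker and one conditioning marker, where the variance left after conditioning is var(g) - cov(g, c)^2 / var(c). When a marker is conditioned on itself this quantity collapses to zero, and any statistic that divides by it becomes undefined.

```python
# Illustrative only: not SAIGE's implementation.
# One tested marker g, one conditioning marker c (hypothetical dosages).

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def variance_after_conditioning(g, c):
    """Variance of g that remains after regressing out c."""
    var_c = cov(c, c)
    if var_c == 0:
        raise ValueError("conditioning marker has no variation")
    return cov(g, g) - cov(g, c) ** 2 / var_c

tested = [0, 1, 2, 0, 1, 0, 0, 2]       # hypothetical tested marker
other = [1, 0, 0, 2, 0, 1, 0, 0]        # hypothetical distinct conditioning marker

v_other = variance_after_conditioning(tested, other)   # stays positive
v_self = variance_after_conditioning(tested, tested)   # collapses to (about) zero

print("conditioned on another marker:", v_other)
print("conditioned on itself:", v_self)
```

A guard such as checking the remaining variance against a small threshold before computing a test statistic is one common way to avoid NaN values in this situation; whether SAIGE does this internally is not stated in this thread.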

diogomribeiro commented 2 years ago

Hi Wei, thanks for the reply. 1:1:C:T is not in the testing set (group file); it is only in the VCF. The problem seems to occur whenever there is only one variant in the condition. For instance, --condition=1:1:C:T does not work, but --condition=1:1:C:T,1:2:A:G works fine. And as I said before, simply changing the ID of the variant (1:1:C:T to 1:1:CC:T) without changing the genotypes themselves also works. That is why I suspect a problem with parsing the condition string.

Thanks, Diogo
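
One way to rule out the input itself before digging into the parser is to check that each condition string is well formed. The sketch below is a hypothetical helper, not part of SAIGE. It assumes, based only on the examples in this thread, that the value of --condition is a comma-separated list of CHR:POS:REF:ALT marker IDs.

```python
# Hypothetical helper, not SAIGE's parser.
# Assumed format: comma-separated CHR:POS:REF:ALT marker IDs.

def parse_condition(condition):
    """Split a condition string into marker records, raising on malformed IDs."""
    markers = []
    for raw_id in condition.split(","):
        marker_id = raw_id.strip()
        fields = marker_id.split(":")
        if len(fields) != 4 or not all(fields):
            raise ValueError(f"malformed marker ID: {marker_id!r}")
        chrom, pos, ref, alt = fields
        markers.append({
            "id": marker_id,
            "chrom": chrom,
            "pos": int(pos),
            "ref": ref,
            "alt": alt,
            # Single-nucleotide variant: both alleles are one base long.
            "is_snv": len(ref) == 1 and len(alt) == 1,
        })
    return markers

# The three condition strings mentioned in this thread.
single_snv = parse_condition("1:1:C:T")
single_indel = parse_condition("1:1:CC:T")
pair = parse_condition("1:1:C:T,1:2:A:G")

for name, parsed in [("single SNV", single_snv),
                     ("single indel", single_indel),
                     ("two SNVs", pair)]:
    print(name, [(m["id"], m["is_snv"]) for m in parsed])
```

All three strings parse cleanly under this assumed format, so a malformed ID would not by itself explain why only the single-SNV case fails.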