rgcgithub / regenie

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
https://rgcgithub.github.io/regenie

SNP with low variance #568

Open lanjiangVUMC opened 1 week ago

lanjiangVUMC commented 1 week ago

I am new to Regenie and am trying to test it on my dataset. Here are my settings for step 1:

regenie \
  --step 1 \
  --bed chr22_prune \
  --keep lowtg_samples.txt \
  --covarFile pheno.txt \
  --phenoFile pheno.txt \
  --phenoCol lowtg \
  --covarCol age,PC1,PC2 \
  --bsize 100 \
  --bt --lowmem \
  --lowmem-prefix tmp_rg \
  --out fit_bin_out

And the log file is:

Log of output saved in file : fit_bin_out.log

Options in effect:
  --step 1 \
  --bed chr22_prune \
  --keep lowtg_samples.txt \
  --covarFile pheno.txt \
  --phenoFile pheno.txt \
  --phenoCol lowtg \
  --covarCol age,PC1,PC2 \
  --bsize 100 \
  --bt \
  --lowmem \
  --lowmem-prefix tmp_rg \
  --out fit_bin_out

Fitting null model

Chromosome 22
 block [1] : 100 snps (3ms)
   -residualizing and scaling genotypes...done (7ms)
   -calc working matrices...done (2ms)
   -calc level 0 ridge...done (10ms)
 block [2] : 100 snps (3ms)
   -residualizing and scaling genotypes...done (6ms)
   -calc working matrices...done (2ms)
   -calc level 0 ridge...done (10ms)
 block [3] : 100 snps (3ms)
   -residualizing and scaling genotypes...done (5ms)
   -calc working matrices...done (2ms)
   -calc level 0 ridge...done (10ms)
 block [4] : 100 snps (2ms)
   -residualizing and scaling genotypes...ERROR: !! Uh-oh, SNP 22:10732714:C:T has low variance (=0.000000).

I calculated the allele frequencies for the subset of 6258 individuals, and the minimum MAF is 0.03. I am not sure why the program is giving this error. What would you suggest? Thank you very much!

Ojami commented 1 week ago

If you're sure that min AAF in the subset of individuals analyzed by REGENIE is 0.03, please check if the problematic variant is highly correlated with your covariates (probably PCs). Test if this error happens if you remove PC1,PC2 from --covarCol.
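To illustrate why correlation with covariates matters: the log shows the error during "residualizing and scaling genotypes", so a variant can vary in the raw data and still have almost no variance once the covariates are regressed out. The snippet below is only a minimal sketch of that idea using NumPy and simulated data. The function name `residual_variance` and the simulated covariates are made up for illustration, and this is not REGENIE's actual implementation.

```python
import numpy as np

def residual_variance(genotype, covariates):
    """Variance left in a genotype vector after regressing out
    an intercept plus the given covariates (ordinary least squares)."""
    g = np.asarray(genotype, dtype=float)
    n = g.shape[0]
    # Design matrix: intercept column followed by the covariate columns.
    X = np.column_stack([np.ones(n), np.asarray(covariates, dtype=float).reshape(n, -1)])
    beta, *_ = np.linalg.lstsq(X, g, rcond=None)
    resid = g - X @ beta
    return float(resid.var())

rng = np.random.default_rng(0)
g = rng.integers(0, 3, size=200).astype(float)  # simulated 0/1/2 genotypes

# Hypothetical covariate that is a linear function of the genotype
# (an extreme case of "highly correlated with a PC").
pc_collinear = 2.0 * g - 1.0
# Hypothetical covariate that is unrelated to the genotype.
pc_independent = rng.normal(size=200)

raw_var = float(g.var())
var_after_collinear = residual_variance(g, pc_collinear)
var_after_independent = residual_variance(g, pc_independent)

print("raw variance:", raw_var)
print("after collinear covariate:", var_after_collinear)
print("after independent covariate:", var_after_independent)
```

In this sketch the collinear covariate drives the residual variance to essentially zero, while the unrelated covariate leaves most of it intact, which is why re-running without PC1,PC2 is a useful test.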

lanjiangVUMC commented 1 week ago

I re-ran it without covariates and it still gave me the same type of error. The MAF of the variant is 0.5. Could this be a problem? Thanks.

22:10732714:C:T C T 0.5 12516
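One pattern that would be consistent with both numbers is a variant where every sample is called heterozygous: the allele frequency is then exactly 0.5, yet the genotype has no variance at all. This is only a guess and is not confirmed anywhere in this thread. It assumes the last two fields of the frequency line above are the ALT allele frequency and the allele observation count (12516 = 2 × 6258). A minimal NumPy sketch of that arithmetic:

```python
import numpy as np

n_samples = 6258  # sample count reported earlier in the thread

# Assumption for illustration: every sample carries dosage 1 (heterozygous).
genotypes = np.ones(n_samples)

# ALT allele frequency = total ALT alleles / total alleles observed.
allele_obs = 2 * n_samples
alt_freq = float(genotypes.sum() / allele_obs)

# Variance of the genotype vector across samples.
variance = float(genotypes.var())

# For comparison: expected genotype variance under Hardy-Weinberg
# equilibrium at frequency p is 2 * p * (1 - p).
hwe_variance = 2 * alt_freq * (1 - alt_freq)

print(allele_obs, alt_freq, variance, hwe_variance)
```

If that is what is happening, a per-genotype count for this variant (rather than the allele frequency alone) would show it, since a MAF filter by itself cannot catch an all-heterozygous site.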