saigegit / SAIGE

Development for SAIGE and SAIGE-GENE(+)
GNU General Public License v3.0

Should I use imputed data for the plink file in step 1 of SAIGE instead of the direct genotype data from the SNP array? #108

Open Apprentice2 opened 1 year ago

Apprentice2 commented 1 year ago

I built a sparse GRM with the commands below and then ran SAIGE step 1. The plink file contains SNPs with MAF ≥ 1% that were genotyped directly on the SNP array. The data covers 1000 individuals.

    #!/bin/bash
    cpu=90
    trait=SBP

    conda activate saige
    mkdir output

    createSparseGRM.R \
        --plinkFile=./mygeno \
        --nThreads=${cpu} \
        --outputPrefix=./output/sparseGRM \
        --numRandomMarkerforSparseKin=2000 \
        --relatednessCutoff=0.125

    step1_fitNULLGLMM.R \
        --plinkFile=./mygeno \
        --sparseGRMFile=./output/sparseGRM_relatednessCutoff_0.125_2000_randomMarkersUsed.sparseGRM.mtx \
        --sparseGRMSampleIDFile=./output/sparseGRM_relatednessCutoff_0.125_2000_randomMarkersUsed.sparseGRM.mtx.sampleIDs.txt \
        --useSparseGRMtoFitNULL=TRUE \
        --phenoFile=./chrs.imputed.rehead.dose.phe.txt \
        --phenoCol=${trait} \
        --covarColList=Sex,PC1,PC2 \
        --qCovarColList=PC1,PC2 \
        --sampleIDColinphenoFile=IID \
        --invNormalize=TRUE \
        --traitType=quantitative \
        --nThreads=${cpu} \
        --IsOverwriteVarianceRatioFile=TRUE \
        --isCateVarianceRatio=TRUE \
        --outputPrefix=./output/${trait}_sparseGRM_temo

However, the step 1 command fails with the following error message:

    ERROR! number of genetic variants in 10< MAC <= 20.5 is lower than 30
    Please include more markers in this MAC category in the plink file

The plink file in step 1 is required to contain hard-call genotypes. I used the directly genotyped SNP array data because I thought step 1 should use high-quality data. On the other hand, low-frequency SNPs in SNP array data are of low quality.

I also have imputed data for the same samples. The reference panel is 1000 Genomes Phase 3, all ancestries. This data is in VCF format. Should I convert it to a plink file and use it, without the MAF filter, to build the GRM and run step 1?

I checked the following issue but did not find a clear answer to this question, so I am asking again. I would appreciate any guidance.

https://github.com/weizhouUMICH/SAIGE/issues/226
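
A quick check may help explain the error. Assuming diploid autosomal markers with no missing genotypes, 1000 individuals carry 2000 alleles, so a MAF ≥ 1% filter keeps only variants with MAC ≥ 20. Of the values in the 10 < MAC <= 20.5 category, only MAC = 20 can then occur, which would plausibly leave fewer than 30 markers there. The sketch below counts the markers in a MAC range from a PLINK 1.9 `--freq counts` report (`.frq.counts`, columns `CHR SNP A1 A2 C1 C2 G0`). The function name `count_mac_bin` and the output path in the usage comment are illustrative assumptions, not part of SAIGE.

```shell
# Count markers whose minor allele count (MAC) falls in (lo, hi].
# Input: a PLINK 1.9 --freq counts report with columns
#   CHR SNP A1 A2 C1 C2 G0
# where C1 and C2 are the observed counts of alleles A1 and A2.
count_mac_bin() {
    # $1 = .frq.counts file, $2 = exclusive lower bound, $3 = inclusive upper bound
    awk -v lo="$2" -v hi="$3" '
        NR > 1 {                              # skip the header line
            mac = ($5 < $6) ? $5 : $6         # MAC is the smaller allele count
            if (mac > lo && mac <= hi) n++
        }
        END { print n + 0 }                   # print 0 when nothing matched
    ' "$1"
}

# Hypothetical usage, assuming plink 1.9 is on PATH and ./mygeno.{bed,bim,fam} exist:
#   plink --bfile ./mygeno --freq counts --out ./output/mygeno
#   count_mac_bin ./output/mygeno.frq.counts 10 20.5
```

If the printed count is below 30, the plink file passed to step 1 does not contain enough markers in that category, whichever data source it was built from.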