Closed by raonyguimaraes 1 year ago
Hi there, just answering my own question for now... I did LD pruning (with MAF 0.01) and now I'm left with 1.2M variants, so I will try to run step 1 using that for now. Please let me know if you have any suggestions about step 1 and step 2 parameters for a WGS dataset.
regenie \
  --step 1 \
  --bed all_hg38_qc4_auto \
  --covarFile covariantes \
  --phenoFile phenotype_bin.txt \
  --bsize 1000 --force-step1 \
  --bt --strict \
  --out fit_bin_out --loocv
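For anyone wanting to reproduce the LD-pruning step mentioned above, here is a minimal sketch with plink2. The window, step and r^2 values and the `ld_pruned` output prefix are assumptions for illustration, not the settings actually used in this thread. The script is a dry run by default, so it only prints the command.

```shell
# Dry run by default: the command is printed, not executed.
# Export RUN= (empty) to actually run plink2.
RUN=${RUN-echo}

BFILE=all_hg38_qc4_auto   # fileset prefix from the command above
PRUNE_OUT=ld_pruned       # hypothetical output prefix

# Assumed pruning settings: 1000-variant window, step of 100 variants, r^2 > 0.9.
WINDOW=1000
STEP=100
R2=0.9

# Writes $PRUNE_OUT.prune.in (variants to keep) and $PRUNE_OUT.prune.out.
$RUN plink2 --bfile "$BFILE" --maf 0.01 \
  --indep-pairwise "$WINDOW" "$STEP" "$R2" \
  --out "$PRUNE_OUT"

# Variant list to hand to regenie step 1.
STEP1_VARIANTS="$PRUNE_OUT.prune.in"

# After a real run, report how many variants survived pruning.
if [ -f "$STEP1_VARIANTS" ]; then
  wc -l < "$STEP1_VARIANTS"
fi
```

The resulting list can then be passed to step 1 with `--extract ld_pruned.prune.in`, so the original fileset does not need to be rewritten.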
"For example, to filter out SNPs with minor allele frequency (MAF) below 1%, minor allele count (MAC) below 100, genotype missingness above 10% and Hardy-Weinberg equilibrium p-value exceeding 10^-15, and samples with more than 10% missingness"
From https://rgcgithub.github.io/regenie/recommendations/
The recommendation is for UKBB analysis but might help cut your number some more?
I used --maf 0.01 --mac 100 --geno 0.1 --hwe 1e-15 --mind 0.1 for the filtering. With --cv 5, step 1 was not converging, so I switched to --loocv, and it took only 2h to run using:
  --bsize 1000 \
  --force-step1 \
  --bt \
  --strict \
  --out fit_bin_out \
  --loocv
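Putting the filters and the step 1 options from this thread together, a rough sketch of the whole sequence could look like the following. The `qc_pass` prefix is a hypothetical name, and feeding the plink2 output into regenie via `--extract`/`--keep` is an assumption about how the pieces were connected, not something stated above. It is a dry run by default and only prints the commands.

```shell
# Dry run by default: commands are printed, not executed.
# Export RUN= (empty) to actually run plink2 and regenie.
RUN=${RUN-echo}

BFILE=all_hg38_qc4_auto   # fileset prefix from the thread
QC_OUT=qc_pass            # hypothetical output prefix

# Thresholds quoted in the thread.
QC_FLAGS="--maf 0.01 --mac 100 --geno 0.1 --hwe 1e-15 --mind 0.1"

# Write passing variant IDs to $QC_OUT.snplist and sample IDs to $QC_OUT.id.
$RUN plink2 --bfile "$BFILE" $QC_FLAGS \
  --write-snplist --write-samples --no-id-header \
  --out "$QC_OUT"

# Step 1 restricted to the variants and samples that passed QC,
# with the remaining options as posted in the thread.
$RUN regenie --step 1 --bed "$BFILE" \
  --extract "$QC_OUT.snplist" --keep "$QC_OUT.id" \
  --covarFile covariantes --phenoFile phenotype_bin.txt \
  --bsize 1000 --force-step1 \
  --bt --strict --loocv \
  --out fit_bin_out
```

Keeping the QC result as ID lists rather than a new fileset means the same `--bed` prefix can be reused unchanged.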
Regenie is looking really good. Thank you for the help!
Hi,
Indeed, we recommend using ~500k good-quality variants (i.e. directly genotyped SNPs) in step 1. As you pointed out, LOOCV will be more efficient than K-fold CV, as it only requires fitting a model once for each ridge parameter instead of K times. K-fold CV is especially burdensome in your case because you have quite a large number of input variants, so the number of level 0 predictions will be very large.
Cheers, Joelle
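To put rough numbers on that, here is a back-of-the-envelope sketch using the figures from this thread. It assumes 5 ridge parameters per block (which I believe is regenie's default for `--l0`), and it ignores that blocks are formed per chromosome, so treat the counts as approximate.

```shell
M=1200000    # variants left after LD pruning (from the thread)
BSIZE=1000   # --bsize used in the thread
R=5          # ridge parameters per block; assumed default
K=5          # folds, as in --cv 5

# Number of blocks, rounded up.
BLOCKS=$(( (M + BSIZE - 1) / BSIZE ))

# One level 0 prediction per block and ridge parameter.
L0_PREDS=$(( BLOCKS * R ))

# Model fits per phenotype: once per ridge parameter with LOOCV,
# K times per ridge parameter with K-fold CV.
LOOCV_FITS=$R
KFOLD_FITS=$(( K * R ))

echo "blocks=$BLOCKS level0_predictions=$L0_PREDS"
echo "fits: loocv=$LOOCV_FITS kfold=$KFOLD_FITS"
```

Under the same assumptions, the unpruned 8M variants would give about 8000 blocks and 40000 level 0 predictions, which is why trimming the step 1 variant set matters.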
Hi there,
I have a dataset with 37k WGS samples. After doing my QC and applying a MAF filter of 0.01, I am left with around 8M variants that I would like to use for running regenie. I'm a bit confused about how to run step 1. I read in the docs that I should have no more than 1M variants for step 1, so how should I go about this? Should I do LD pruning, increase my MAF threshold to something like 0.2 until I get fewer than 1M variants, or even run it in blocks and only test the genes present in each block during step 2?
Thanks for the help!