Closed by Shicheng-Guo 2 years ago
Hi Shicheng,
Table 1 in the SAIGE paper shows the computation cost of single-variant association tests for GWAS: https://www.nature.com/articles/s41588-018-0184-y/tables/1 On average, for the 400K-sample UKBB imputed data (based on the array), one GWAS cost ~600 CPU hours on Google Cloud.
For exome-wide gene-based tests, Supplementary Table 1 in the SAIGE-GENE paper gives the computation cost for ~400K samples. The computation time is O(N^1.5) for Step 1 and O(N) for Step 2, where N is the sample size. https://static-content.springer.com/esm/art%3A10.1038%2Fs41588-020-0621-6/MediaObjects/41588_2020_621_MOESM1_ESM.pdf
Note that the convergence speed of the algorithm that fits the null mixed model (Step 1) varies by phenotype. Step 2 is faster with bgen or sav input than with VCF input.
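As a rough illustration of the scaling above, here is a minimal sketch that extrapolates CPU hours from a reference run to a different sample size, using O(N^1.5) for Step 1 and O(N) for Step 2. The function names, the reference CPU hours, and the price per CPU hour are hypothetical placeholders, not values from either paper. The sketch also assumes the same number of markers, the same input format, and similar convergence behavior as the reference run, so treat the result as an order-of-magnitude guide only.

```python
def scale_cpu_hours(ref_hours, ref_n, target_n, exponent):
    """Extrapolate CPU hours from a reference run, assuming time grows as N**exponent."""
    if ref_hours < 0 or ref_n <= 0 or target_n <= 0:
        raise ValueError("ref_hours must be >= 0; sample sizes must be > 0")
    return ref_hours * (target_n / ref_n) ** exponent


def estimate_gwas_cost(step1_ref_hours, step2_ref_hours, ref_n, target_n,
                       usd_per_cpu_hour):
    """Estimate CPU hours and dollar cost for one phenotype.

    Step 1 (null model fitting) is scaled as O(N^1.5) and
    Step 2 (association tests) as O(N), as stated in the reply above.
    """
    step1 = scale_cpu_hours(step1_ref_hours, ref_n, target_n, 1.5)
    step2 = scale_cpu_hours(step2_ref_hours, ref_n, target_n, 1.0)
    total = step1 + step2
    return {
        "step1_cpu_hours": step1,
        "step2_cpu_hours": step2,
        "total_cpu_hours": total,
        "estimated_usd": total * usd_per_cpu_hour,
    }


if __name__ == "__main__":
    # All inputs below are placeholders: take the reference hours from the
    # papers' tables or from your own pilot run, and the price from your
    # cloud provider's current rate for the instance type you plan to use.
    estimate = estimate_gwas_cost(
        step1_ref_hours=100.0,   # hypothetical Step 1 hours at ref_n
        step2_ref_hours=500.0,   # hypothetical Step 2 hours at ref_n
        ref_n=400_000,
        target_n=50_000,
        usd_per_cpu_hour=0.04,   # hypothetical price
    )
    for key, value in estimate.items():
        print(f"{key}: {value:.2f}")
```

Timing a pilot run on a small subset of markers with your own phenotype and input format, then plugging those hours in as the reference, should give a more reliable estimate than reusing published numbers directly.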
Thanks, Wei
Dear Dr. Zhou,
I am wondering how to estimate the cost of each GWAS on AWS with UKBB data (500K array data or 50K WES data). Do you have any previous estimates, since you have already run UKB-SAIGE-PheWAS?
Thanks.
Shicheng