weizhouUMICH / SAIGE

GNU Lesser General Public License v3.0

Hardware requirements and recommendations #259

Closed: olszewskip closed 2 years ago

olszewskip commented 3 years ago

What's the best way to spend my money on hardware if I'd like to run SAIGE-GENE on a PLINK file with 2k samples of WES data after QC, preferably using Docker? How many CPU cores can be utilised in a single analysis? How much RAM is required? What is a rough estimate of the running time, and how does it scale with more cores and more RAM? Any tips would be greatly appreciated.

weizhouUMICH commented 3 years ago

Hi, for Step 1, which fits the null model, the memory usage is around M*N/8 bytes, where M is the number of markers in the PLINK file and N is the number of samples. Step 1 can use multiple CPUs.
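As a rough worked example of the M*N/8 estimate above, here is a minimal Python sketch. The function name and the marker count of 200,000 are assumptions chosen for illustration only; the 2,000 samples come from the question. It covers only the quantity in the formula, not any other overhead of the program.

```python
def step1_memory_bytes(n_markers: int, n_samples: int) -> float:
    """Approximate Step 1 memory from the M*N/8 rule of thumb quoted above.

    n_markers (M): number of markers in the PLINK file used for Step 1.
    n_samples (N): number of samples.
    """
    return n_markers * n_samples / 8


# Hypothetical input: 200,000 markers is an assumed figure, not from the thread.
# 2,000 samples matches the question.
estimate = step1_memory_bytes(n_markers=200_000, n_samples=2_000)
print(f"{estimate / 1e6:.1f} MB")  # 50.0 MB
```

Under these assumed numbers the estimate is about 50 MB, so for a cohort of this size the M*N/8 term is unlikely to be what drives RAM requirements.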

For Step 2, the memory usage is small: this step only stores the Step 1 result and the dosages/genotypes for the marker or region being tested. Benchmark results using UKBB data can be found in Supplementary Tables 1 and 2.

https://static-content.springer.com/esm/art%3A10.1038%2Fs41588-020-0621-6/MediaObjects/41588_2020_621_MOESM1_ESM.pdf

Thanks, Wei

weizhouUMICH commented 2 years ago

We have just released a new version, 1.0.0. It has computational efficiency improvements for both Step 1 and Step 2, for single-variant and set-based tests. We have created a new GitHub page for the program, https://github.com/saigegit/SAIGE, with documentation at https://saigegit.github.io/SAIGE-doc/. The program will be maintained there by multiple SAIGE developers. The Docker image has been updated. Please feel free to try version 1.0.0 and report any issues.

Thanks! Wei