saigegit / SAIGE

Development for SAIGE and SAIGE-GENE(+)
GNU General Public License v3.0

Longer step 1 runtimes when compared to SAIGE 0.44.2 #57

dvg-p4 closed this issue 1 year ago

dvg-p4 commented 1 year ago

I did some benchmarking of SAIGE 1.1.6.1 vs SAIGE 0.44.2 and full vs sparse GRM for step 1 runtimes. I found that step 1 was significantly faster with a sparse GRM, as expected. However, the latest version of SAIGE appears to be consistently slower than 0.44.2 for step 1 computation with a sparse GRM (see graphs below). Have there been any changes to the step 1 algorithm that could account for this slowdown?

[Two benchmark graphs of step 1 runtimes; images not preserved]

saigegit commented 1 year ago

Hi @dvg-p4,

Thanks so much for sharing the benchmark results!

When a full GRM is used to fit the null model, 0.44.2 reads the genotypes in the plink file into a single vector and uses it for both null model fitting and variance ratio estimation. 1.1.6.1 reads the genotypes into two separate vectors, one for the null model and one for variance ratio estimation, so the two sub-steps do not use overlapping markers. Can you please check whether "reading in genotypes" takes longer in 1.1.6.1? The time can be extracted from the log file.

When a sparse GRM is used, 0.44.2 skips the variance ratio step, so it could be faster than 1.1.6.1.
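One way to pull those lines out of both logs for a side-by-side look is a small script like the one below. This is only a minimal sketch and not part of SAIGE: the helper names are made up for illustration, and the exact wording of the timing lines in the step 1 log is an assumption, so adjust the search phrase to whatever your logs actually print.

```python
import re


def find_log_lines(log_path, phrase):
    """Return (line_number, text) pairs for lines containing `phrase`.

    Matching is case-insensitive and literal (the phrase is not a regex).
    The phrase to search for is an assumption about the log wording.
    """
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    matches = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                matches.append((number, line.rstrip("\n")))
    return matches


def compare_logs(log_paths, phrase):
    """Map each log path to its matching lines so two runs can be compared."""
    return {str(path): find_log_lines(path, phrase) for path in log_paths}
```

For example, `compare_logs(["step1_v0.44.2.log", "step1_v1.1.6.1.log"], "reading in genotypes")` (both file names are hypothetical) returns the matching lines from each run, which can then be compared by eye.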

Thanks, Wei