projectglow / glow

An open-source toolkit for large-scale genomic analysis
https://projectglow.io
Apache License 2.0

Comparison of performance of regenie WGR and glow WGR #545

Closed by hguturu 2 weeks ago

hguturu commented 6 months ago

Are there any comparisons of the two methods? In https://static-content.springer.com/esm/art%3A10.1038%2Fs41588-021-00870-7/MediaObjects/41588_2021_870_MOESM1_ESM.pdf, I see a discussion of the implementation, but no comparison of the two implementations similar to how regenie is compared against various other algorithms.

kermany commented 4 months ago

The WGR implementation in Glow is a distributed version of regenie. Most of the performance gain comes in the second step (regression). For phenotype imputation, I'd say you can use either method. On the regression side, performance scales nearly linearly with compute cluster size.

hguturu commented 4 months ago

@kermany good to hear from you :D.

It makes sense that most of the gains come from the regression step. I am less concerned about performance, since I agree about the performance gains, and more concerned about whether the two implementations produce equivalent results.

I compared the pQTLs found by Glow with those found by Regenie C++ and there was very little overlap. Similarly, the effect sizes and p-values were uncorrelated.
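
For anyone who wants to run the same kind of check, here is a minimal sketch of the comparison. It is an illustration only, not the code behind the numbers above. The data layout (a dict keyed by `(variant_id, phenotype)` holding `(beta, pvalue)`), the function names `pearson` and `compare_results`, the `alpha` threshold, and the toy values are all assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (nan if undefined)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)

def neg_log10(p):
    # Clamp to avoid log10(0) when a tool reports an underflowed p-value.
    return -math.log10(max(p, 1e-300))

def compare_results(a, b, alpha=5e-8):
    """Compare two {(variant_id, phenotype): (beta, pvalue)} maps.

    alpha is a hypothetical significance threshold; use whatever
    cutoff your own analysis applies.
    """
    shared = sorted(set(a) & set(b))  # tests present in both outputs
    sig_a = {k for k in shared if a[k][1] < alpha}
    sig_b = {k for k in shared if b[k][1] < alpha}
    union = sig_a | sig_b
    jaccard = len(sig_a & sig_b) / len(union) if union else float("nan")
    return {
        "n_shared": len(shared),
        "jaccard": jaccard,  # overlap of significant hits
        "beta_r": pearson([a[k][0] for k in shared], [b[k][0] for k in shared]),
        "logp_r": pearson([neg_log10(a[k][1]) for k in shared],
                          [neg_log10(b[k][1]) for k in shared]),
    }

# Toy values, made up purely to exercise the functions.
glow = {("rs1", "P1"): (0.50, 1e-10), ("rs2", "P1"): (0.10, 0.2),
        ("rs3", "P1"): (-0.30, 1e-9), ("rs4", "P1"): (0.02, 0.7)}
regenie = {("rs1", "P1"): (0.48, 3e-10), ("rs2", "P1"): (0.12, 0.15),
           ("rs3", "P1"): (-0.29, 1e-6), ("rs5", "P1"): (0.0, 0.9)}

summary = compare_results(glow, regenie)
print(summary)
```

The sketch restricts everything to tests present in both outputs, so a low overlap cannot simply be the result of one tool having dropped variants that the other kept.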