FINNGEN / autoreporting

MIT License

Calculate LD on an as-needed basis, instead of precalculating it for everything #178

Lipastomies closed 3 years ago

Lipastomies commented 3 years ago

Currently, when doing LD clumping or credible set clumping, we precalculate the LD for all possible lead variants. This is not feasible when there are lots of results: precalculating the LD for e.g. 20_000 variants is going to crash quite fast. Instead, let's calculate the LD on an as-needed basis, i.e. only for a variant at the point it is actually chosen as the lead of a group. This should greatly reduce the amount of LD calculated, as many possible lead variants are removed from the pool as soon as even a single group is formed (signals form peaks, so the candidates within a peak are in high LD with each other). In testing, even the fact that PLINK needs to subset the imputation panel by chromosome and is quite slow in calculating LD does not matter that much, since the number of times we need to calculate LD equals the number of groups, and we have only a few groups.
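A rough sketch of the idea in Python, to make the proposal concrete. This is not the actual autoreporting code: `clump_on_demand`, `ld_for_lead`, the `r2_threshold` default and the toy LD table are all made up for illustration. In the real tool the `ld_for_lead` callback would presumably wrap the PLINK call; here it is a plain dictionary lookup so the example runs on its own.

```python
from typing import Callable, Dict, List, Tuple


def clump_on_demand(
    pvalues: Dict[str, float],
    ld_for_lead: Callable[[str], Dict[str, float]],
    r2_threshold: float = 0.5,  # hypothetical default, not taken from the tool
) -> Tuple[Dict[str, List[str]], int]:
    """Greedy clumping that requests LD only for variants that become leads."""
    # Candidate pool ordered from most to least significant.
    pool = sorted(pvalues, key=lambda v: pvalues[v])
    remaining = set(pool)
    groups: Dict[str, List[str]] = {}
    ld_calls = 0
    for lead in pool:
        if lead not in remaining:
            continue  # already absorbed into an earlier group, no LD needed
        ld = ld_for_lead(lead)  # the only place LD is computed
        ld_calls += 1
        members = sorted(
            v for v in remaining
            if v != lead and ld.get(v, 0.0) >= r2_threshold
        )
        groups[lead] = members
        # Remove the lead and its group from the pool of possible leads.
        remaining.discard(lead)
        remaining.difference_update(members)
    return groups, ld_calls


# Toy data: two peaks ("a" and "b"); the r2 values are invented.
TOY_R2 = {
    ("a1", "a2"): 0.9, ("a1", "a3"): 0.8, ("a2", "a3"): 0.85,
    ("b1", "b2"): 0.7,
}


def toy_ld(lead: str) -> Dict[str, float]:
    """Stand-in for an external LD calculation: r2 between lead and others."""
    out: Dict[str, float] = {}
    for (x, y), r2 in TOY_R2.items():
        if x == lead:
            out[y] = r2
        elif y == lead:
            out[x] = r2
    return out


pvalues = {"a1": 1e-12, "a2": 1e-10, "a3": 1e-9, "b1": 1e-11, "b2": 1e-8}
groups, ld_calls = clump_on_demand(pvalues, toy_ld)
print(groups)    # {'a1': ['a2', 'a3'], 'b1': ['b2']}
print(ld_calls)  # 2
```

With five candidate variants, LD is requested only twice, once per group, whereas precalculating would have requested it for all five.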

Lipastomies commented 3 years ago

Done