stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR

Definition of purity of CS #117

Closed gaow closed 3 years ago

gaow commented 3 years ago

In the SuSiE paper we defined purity as the minimum absolute correlation between variables in a CS. I cannot remember whether that was a decision reached after discussion or my own doing; it feels like a natural choice (in fact, we use the correlation matrix as the "LD" matrix in susie_rss). However, it may cause confusion when we report results to people in genetics, since that community is used to seeing, and thinking in terms of, r^2.

Currently, the default purity cutoff for filtering CSs is 0.5, which corresponds to r^2 = 0.25 and is reasonable. We do have a switch, squared, in susie_get_cs, currently set to FALSE. If we change its default to TRUE and change the cutoff to 0.25, that won't impact any previous analysis.
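To make the equivalence of the two cutoffs concrete, here is a minimal sketch in plain Python. It is not susieR code: the helper names (`pearson`, `purity`, `passes_filter`) and the toy data are hypothetical, and the `squared` argument only mimics the idea of the switch mentioned above. It computes purity as the minimum absolute pairwise correlation in a CS and checks that filtering at 0.5 on |r| keeps or drops the same CS as filtering at 0.25 on r^2.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def purity(cs_columns, squared=False):
    """Minimum |r| over all pairs of variables in a CS; r^2 scale if squared=True."""
    if len(cs_columns) < 2:
        return 1.0  # a single-variable CS is treated as perfectly pure here
    min_abs_r = min(abs(pearson(x, y)) for x, y in combinations(cs_columns, 2))
    return min_abs_r ** 2 if squared else min_abs_r


def passes_filter(cs_columns, cutoff, squared=False):
    """Keep the CS if its purity, on the chosen scale, reaches the cutoff."""
    return purity(cs_columns, squared=squared) >= cutoff


# Hypothetical toy CS: three variables observed on eight samples.
toy_cs = [
    [0, 1, 2, 0, 1, 2, 1, 0],
    [0, 1, 2, 0, 1, 1, 1, 0],
    [2, 1, 0, 1, 1, 0, 2, 2],
]

keep_abs = passes_filter(toy_cs, cutoff=0.5)                 # |r| scale
keep_sq = passes_filter(toy_cs, cutoff=0.25, squared=True)   # r^2 scale
print(purity(toy_cs), purity(toy_cs, squared=True), keep_abs, keep_sq)
```

Because squaring is monotone on non-negative values, `keep_abs` and `keep_sq` always agree; only the number reported to the user changes scale.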

Perhaps, to stay consistent with the definition in the SuSiE paper, we keep calling our current purity as is and additionally report r^2? What would we call that? Or, @stephens999, do you have other suggestions?

stephens999 commented 3 years ago

On reflection, I think we should stick to how we defined it in the paper. Changing it may create additional confusion.