mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

More than two phenotypes (not-continuous) and multiple hypotheses correction #181

Closed VadimDu closed 2 years ago

VadimDu commented 2 years ago

Hello and thank you for the very useful tool,

I would be happy for your insight regarding two issues:

  1. I have more than two phenotypes, i.e. "categorical" groups. How should such data be handled in PySEER? It only accepts binary or continuous phenotypes. If I convert the groups to a continuous-like scale, the effect size and direction do not make sense, and no pairwise comparisons between phenotypes are made.
  2. The "lrt-pvalue" column in the results is a p-value adjusted for population structure, but is it correct that it is not adjusted for multiple hypothesis testing?

My current workaround is to run PySEER several times, once for each pair of phenotypes (coded as binary), and then to manually adjust the lrt-pvalue for FDR. What do you think?

Many thanks, Vadim
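
The pairwise workaround described in the question can be sketched roughly as below. This is only an illustration, not part of pyseer: the function names `pairwise_binary_phenotypes` and `write_phenotype_file` are hypothetical, and the output layout (tab-separated, header row, sample names in the first column) is an assumption about what the phenotype file should look like, so check it against the pyseer documentation before use.

```python
from itertools import combinations


def pairwise_binary_phenotypes(labels):
    """Split categorical labels into one binary phenotype per pair of categories.

    labels: dict mapping sample name -> category label.
    Returns a dict mapping (cat_a, cat_b) -> {sample: 0 or 1}, where samples
    in cat_a are coded 0, samples in cat_b are coded 1, and samples from any
    other category are left out of that comparison.
    """
    categories = sorted(set(labels.values()))
    comparisons = {}
    for cat_a, cat_b in combinations(categories, 2):
        comparisons[(cat_a, cat_b)] = {
            sample: int(category == cat_b)
            for sample, category in labels.items()
            if category in (cat_a, cat_b)
        }
    return comparisons


def write_phenotype_file(path, phenotypes, column="phenotype"):
    """Write one binary phenotype as a tab-separated file (assumed layout)."""
    with open(path, "w") as handle:
        handle.write(f"samples\t{column}\n")
        for sample, value in phenotypes.items():
            handle.write(f"{sample}\t{value}\n")
```

Each resulting file would then be used for a separate run, so three categories give three runs. Because the samples outside a given pair are dropped, each run has a smaller sample size than the full dataset.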

johnlees commented 2 years ago

  1. We don't do explicit multi-outcome modelling. The best way to deal with this is as you have already done: run each categorical comparison separately.
  2. That is correct. We generally leave it up to the user to apply whichever multiple testing adjustment they see fit, but we do provide a helper to correct for the FWER: https://pyseer.readthedocs.io/en/master/usage.html#number-of-unique-patterns
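
The two corrections discussed in this thread can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not pyseer's own code: `benjamini_hochberg` and `bonferroni_threshold` are hypothetical helper names, the input is assumed to be the list of values from the "lrt-pvalue" column, and the number of tests for the FWER threshold is assumed to be the number of unique patterns referred to in the linked documentation.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def bonferroni_threshold(alpha, n_tests):
    """Return the per-test significance threshold controlling the FWER at alpha."""
    return alpha / n_tests
```

With the pairwise approach, each comparison is a separate set of tests, so the number of comparisons may also need to be taken into account when choosing a threshold.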