mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

Using unitigs to predict with elastic net #179

Closed LeonardosMageiros closed 2 years ago

LeonardosMageiros commented 2 years ago

Hi!

I have run a GWAS on my dataset using the unitig approach. Is it possible to use, for example, the top 100 unitigs (those with the lowest p-values) to see how well they predict my phenotype?

If so, is there a way to report the variants that matter most for the prediction, in case I want to keep only the top 10, for example?

Thank you in advance,
Leonardos

johnlees commented 2 years ago

Yes, you can do this. The simplest approach is to calculate a polygenic risk score: take the betas of the top n most significant predictors and use them as the coefficients of a linear model.
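As a rough illustration of the risk-score idea, here is a minimal sketch in plain Python. The unitig sequences, p-values, betas, and sample names are made up, and `top_n_betas` and `polygenic_score` are hypothetical helpers, not part of pyseer. It assumes you have already parsed the GWAS output into a mapping from unitig to (p-value, beta) and know which unitigs each sample carries.

```python
# Hypothetical GWAS results: unitig sequence -> (p-value, beta).
gwas = {
    "ACGT": (1e-9, 1.2),
    "TTGA": (1e-7, -0.8),
    "GGCC": (1e-3, 0.3),
    "CATG": (0.2, 0.1),
}


def top_n_betas(results, n):
    """Keep the betas of the n unitigs with the lowest p-values."""
    ranked = sorted(results.items(), key=lambda item: item[1][0])
    return {unitig: beta for unitig, (pvalue, beta) in ranked[:n]}


def polygenic_score(present_unitigs, betas):
    """Sum the betas of the selected unitigs that a sample carries."""
    return sum(beta for unitig, beta in betas.items() if unitig in present_unitigs)


betas = top_n_betas(gwas, 2)

# Hypothetical presence/absence data: sample name -> set of unitigs it carries.
samples = {
    "s1": {"ACGT", "GGCC"},
    "s2": {"TTGA"},
    "s3": {"ACGT", "TTGA"},
}
scores = {name: polygenic_score(present, betas) for name, present in samples.items()}
```

Note that scoring the same samples that were used to pick the top unitigs will tend to overstate accuracy, so it is safer to compare the scores against the phenotype in samples held out of the GWAS.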

A better approach would be to fit a model (such as an elastic net) to the selected predictors. However, if you are trying to maximise prediction accuracy, it would likely be best to fit a model to all of the unitigs; see: https://pyseer.readthedocs.io/en/master/predict.html
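To show what fitting an elastic net to a handful of selected unitigs involves, and how the fitted coefficients can be used to rank variants, here is a toy sketch in plain Python. It is a bare-bones coordinate-descent fit written for illustration, not pyseer's implementation; the data, the unitig names, and the `lam` and `alpha` values are all assumptions. For real data, the prediction mode described at the link above is the better route.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the lasso part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam=0.1, alpha=0.5, n_iter=200):
    """Coordinate descent for squared-error loss with an elastic net penalty.

    Minimises (1/2n) * sum(residual^2)
              + lam * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2),
    with an unpenalised intercept.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                others = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - intercept - others)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            denom = z + lam * (1 - alpha)
            coef[j] = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
        # Refit the intercept as the mean residual.
        intercept = sum(
            y[i] - sum(X[i][k] * coef[k] for k in range(p)) for i in range(n)
        ) / n
    return intercept, coef


# Toy presence/absence matrix: rows are samples, columns are selected unitigs.
unitigs = ["unitig_A", "unitig_B", "unitig_C"]  # hypothetical names
X = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
y = [1, 1, 1, 0, 0, 0]  # phenotype tracks unitig_A only

intercept, coef = fit_elastic_net(X, y)

# Rank unitigs by absolute coefficient and keep the non-zero ones (top 10 at most).
ranked = sorted(zip(unitigs, coef), key=lambda pair: abs(pair[1]), reverse=True)
top_variants = [(name, beta) for name, beta in ranked if beta != 0.0][:10]
```

Because the lasso part of the penalty sets uninformative coefficients exactly to zero, the non-zero coefficients sorted by absolute size give one simple answer to "which variants matter most"; here only the hypothetical `unitig_A` survives.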