Closed LeonardosMageiros closed 2 years ago
Yes, you can do this. The simplest approach is to calculate a polygenic risk score: take the betas of the top n most significant predictors and use them as the coefficients of a linear model.
A better approach would be to fit a model (such as an elastic net) to the selected predictors. However, if you are trying to maximise prediction accuracy, it would likely be best to fit a model to all of the unitigs; see: https://pyseer.readthedocs.io/en/master/predict.html
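As a rough illustration of the simplest approach (a score built from the betas of the top n hits), here is a minimal sketch in plain Python. The `hits` tuples, the sample names, and the helper functions `top_n_by_pvalue` and `risk_score` are all hypothetical and the numbers are made up; in practice the unitig, beta, and p-value would come from your own GWAS output, and each sample's unitig presence from your unitig calls.

```python
# Hypothetical GWAS hits as (unitig, beta, p-value); values are invented for illustration.
hits = [
    ("ACGTAC", 0.80, 1e-12),
    ("TTGACA", -0.50, 1e-9),
    ("GGCATT", 0.30, 1e-4),
    ("CATGCA", 0.10, 0.2),
]


def top_n_by_pvalue(hits, n):
    """Return the n hits with the lowest p-values."""
    return sorted(hits, key=lambda h: h[2])[:n]


def risk_score(selected, present_unitigs):
    """Sum the betas of the selected unitigs that are present in one sample."""
    return sum(beta for unitig, beta, _ in selected if unitig in present_unitigs)


# Keep the top 3 hits, then score each sample from the unitigs it carries.
selected = top_n_by_pvalue(hits, 3)

# Hypothetical presence data: sample name -> set of unitigs found in that sample.
samples = {
    "sample_A": {"ACGTAC", "GGCATT"},
    "sample_B": {"TTGACA", "CATGCA"},
}

scores = {name: risk_score(selected, present) for name, present in samples.items()}

for name, score in scores.items():
    print(name, score)
```

Comparing these scores with the observed phenotype, ideally on samples that were not used in the GWAS, gives a first impression of how predictive the selected unitigs are. This sketch does not refit anything; it only reuses the GWAS betas as they are.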
Hi!
I have performed a GWAS analysis on my dataset using the unitig approach. Is it possible to use, for example, the top 100 unitigs (those with the lowest p-values) to see how well they can predict my phenotype?
If so, is there a way to report the most important variants for the prediction outcome, in case I just want to keep the top 10, for example?
Thank you in advance,
Leonardos