mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

Population structure not being corrected sufficiently #202

Status: Closed (smb20200615 closed 2 years ago)

smb20200615 commented 2 years ago

Hello,

I am performing a GWAS (mixed effects) and correcting for population structure using a phylogenetic tree and your script: python tools/pyseer/scripts/phylogeny_distance.py --lmm tree > sim.txt. I still see horizontal lines (similar to those in your tutorial, https://pyseer.readthedocs.io/en/master/tutorial.html) in my Manhattan plot after correcting for population structure. Do you have an idea why this could be the case, and how I can diagnose or correct it? I am not sure whether the issue is with my tree.

Also, apologies for iterating on a previous question. I have been providing lineage clusters to pyseer manually, but I want to compare them to what I find using the default, MDS components. However, I am not able to tell which MDS component in the lineage file matches which genome. Is that information provided somewhere by pyseer?

Thank you so much!

johnlees commented 2 years ago

> I am performing a GWAS (mixed effects) and correcting for population structure using a phylogenetic tree and your script: python tools/pyseer/scripts/phylogeny_distance.py --lmm tree > sim.txt. I still see horizontal lines (similar to those in your tutorial, https://pyseer.readthedocs.io/en/master/tutorial.html) in my Manhattan plot after correcting for population structure. Do you have an idea why this could be the case, and how I can diagnose or correct it? I am not sure whether the issue is with my tree.

I'm sorry, but we can't really provide advice on specific cases like this. As a starting point, have a look at the QQ plot, and check whether your phenotype is strongly correlated with specific clades.
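As a rough illustration of the QQ-plot check, the sketch below computes the expected and observed -log10(p) pairs and a genomic inflation factor from a plain list of p-values. It is not part of pyseer: the function names `qq_points` and `genomic_inflation` are hypothetical, the p-values are assumed to be already loaded from your results file and to lie strictly between 0 and 1 (p = 1 itself is also accepted), and the inflation factor assumes a 1-degree-of-freedom test.

```python
import math
from statistics import NormalDist, median


def qq_points(pvalues):
    """Return (expected, observed) -log10(p) pairs for a QQ plot.

    Hypothetical helper; assumes every p-value is > 0.
    """
    observed = sorted(pvalues)
    n = len(observed)
    # Under the null, the i-th smallest of n p-values is expected
    # to fall near (i - 0.5) / n.
    return [
        (-math.log10((i - 0.5) / n), -math.log10(p))
        for i, p in enumerate(observed, start=1)
    ]


def genomic_inflation(pvalues):
    """Median observed chi-square statistic divided by the null median.

    Assumes a 1-df test, so each p-value maps to z**2 with
    z = inv_cdf(p / 2) of the standard normal distribution.
    """
    norm = NormalDist()
    chi2 = [norm.inv_cdf(p / 2) ** 2 for p in pvalues]
    # Median of a 1-df chi-square distribution (about 0.455).
    null_median = norm.inv_cdf(0.75) ** 2
    return median(chi2) / null_median


if __name__ == "__main__":
    # Made-up p-values, for illustration only.
    example = [0.8, 0.6, 0.45, 0.3, 0.2, 0.05, 0.01, 0.001]
    for expected, observed in qq_points(example):
        print(f"{expected:.3f}\t{observed:.3f}")
    print("lambda:", round(genomic_inflation(example), 3))
```

Points that rise above the diagonal across the whole range, or an inflation factor well above 1, would suggest residual confounding from population structure rather than a few true associations.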

> Also, apologies for iterating on a previous question. I have been providing lineage clusters to pyseer manually, but I want to compare them to what I find using the default, MDS components. However, I am not able to tell which MDS component in the lineage file matches which genome. Is that information provided somewhere by pyseer?

MDS components do not map one-to-one to genomes; they are continuous axes, and every genome has a coordinate on each component.
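To make the distinction from discrete lineage clusters concrete, here is a minimal sketch of classical MDS on a toy distance matrix, using only the standard library. It extracts just the first component by power iteration and may differ in detail from what pyseer runs internally; the function name `first_mds_component` and the sample names are hypothetical, and the distance matrix is assumed to be symmetric and not all zeros.

```python
import math


def first_mds_component(names, dist, iterations=200):
    """Return {sample name: coordinate on the first classical MDS axis}.

    Hypothetical helper; assumes a symmetric, non-zero distance matrix.
    """
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    b = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
         for j in range(n)]
        for i in range(n)
    ]

    def matvec(v):
        return [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]

    # Power iteration converges to the leading eigenvector of B.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = matvec(v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, matvec(v)))
    # Scale the unit eigenvector by sqrt(eigenvalue) to get coordinates.
    scale = math.sqrt(eigenvalue)
    return {name: x * scale for name, x in zip(names, v)}


if __name__ == "__main__":
    # Three made-up genomes lying on a line at positions 0, 1 and 3.
    names = ["genome_A", "genome_B", "genome_C"]
    dist = [
        [0.0, 1.0, 3.0],
        [1.0, 0.0, 2.0],
        [3.0, 2.0, 0.0],
    ]
    for name, coord in first_mds_component(names, dist).items():
        print(name, round(coord, 3))
```

Each genome receives its own real-valued position on the component, so a component is an axis shared by all genomes rather than a label that belongs to a particular genome or group.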