mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

qq plot #212

Closed: ShiminYang97 closed this issue 1 year ago

ShiminYang97 commented 2 years ago

[two attached screenshots of the QQ plot] The QQ plot looks very strange. I ran several different datasets, but the output QQ plot always looks similar to this. In every case I used the qq_plot.py script.

johnlees commented 2 years ago

Can you try making the QQ plot from the p-values with something else (e.g. https://rdrr.io/bioc/GWASTools/man/qqPlot.html)? Are the p-values logged or unlogged?
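For a quick cross-check without any plotting library, the points of a p-value QQ plot can be computed by hand. This is a minimal sketch, not the method used by qq_plot.py: the function name `qq_points` is hypothetical, the input is assumed to be unlogged p-values, and the expected quantiles use the common (rank - 0.5) / n convention.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 p-values for a QQ plot.

    Assumes the input holds raw (unlogged) p-values. If the values are
    already -log10 transformed, they must not be transformed again.
    """
    # Keep only values in (0, 1]; -log10 is undefined for p <= 0
    cleaned = sorted(p for p in p_values if 0.0 < p <= 1.0)
    n = len(cleaned)
    points = []
    for rank, p in enumerate(cleaned, start=1):
        # Expected quantile under the null (uniform p-values),
        # using the (rank - 0.5) / n plotting position
        expected = -math.log10((rank - 0.5) / n)
        observed = -math.log10(p)
        points.append((expected, observed))
    return points
```

Plotting the expected values on the x-axis against the observed values on the y-axis should give points close to the diagonal when most p-values follow the null, which makes it easy to see whether the odd shape comes from the data or from the plotting script.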

mgalardini commented 2 years ago

Another user had a similar issue with the script (issue #197). I think the problem is possibly the precision on the x-axis. I think we could just comment out this line: https://github.com/mgalardini/pyseer/blob/master/scripts/qq_plot.py#L52

julibeg commented 1 year ago

I just came across this as well (and this question). It looks like sm.qqplot_2samples takes the data to be plotted on the x-axis as its first argument and the data for the y-axis as its second. I don't know whether this was different in the past, but the script passes y before x. Could this be the issue?

julibeg commented 1 year ago

OK, it looks like this is definitely an issue with the statsmodels function. With an older statsmodels version (0.12.2) I get this plot: [image: qq_plot]

With the current version (0.13.2), however, I get this: [image: qq_plot]

julibeg commented 1 year ago

Can confirm that swapping x and y indeed restored the original behaviour for me.

mgalardini commented 1 year ago

Ah thanks, that makes sense. I guess we'll have to check the statsmodels version and change the behavior accordingly.

mgalardini commented 1 year ago

This seems to be the relevant commit, so checking for version 0.13 and below could catch this.
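A version check along these lines could be sketched as follows. This is only an illustration, not the fix that went into pyseer: the helper name `qqplot_argument_order` is hypothetical, and the exact cutoff is an assumption, since the thread only confirms that 0.12.2 shows the old behaviour and 0.13.2 the new one.

```python
import re


def qqplot_argument_order(x, y, statsmodels_version):
    """Return the two samples in the order to pass to sm.qqplot_2samples.

    x is the data meant for the x-axis, y the data meant for the y-axis.
    """
    match = re.match(r"(\d+)\.(\d+)", statsmodels_version)
    if match is None:
        # Version string could not be parsed: assume a recent release
        return x, y
    major, minor = int(match.group(1)), int(match.group(2))
    # ASSUMPTION: the argument handling changed in 0.13. The thread only
    # reports 0.12.2 (old behaviour) and 0.13.2 (new behaviour).
    if (major, minor) < (0, 13):
        # Older releases: keep the script's original order, y before x
        return y, x
    # Newer releases: first argument goes on the x-axis
    return x, y
```

In the script this might be used as `first, second = qqplot_argument_order(x, y, statsmodels.__version__)`, with `first` and `second` then passed to sm.qqplot_2samples in that order.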