arzwa / wgd

Python package and CLI for whole-genome duplication related analyses. This package is deprecated in favor of https://github.com/heche-psb/wgd.
http://wgd.readthedocs.io/en/latest/
GNU General Public License v3.0

How to draw multiple pairs Ks plot after KDE Fit? #26

Closed liufuyan2016 closed 4 years ago

liufuyan2016 commented 4 years ago

Dear professor, how can I draw a Ks plot for multiple pairs after the KDE fit? The wgd kde command only accepts one file. Thank you!

arzwa commented 4 years ago

I'm flattered, but not a professor ;) It seems I didn't implement that feature in the wgd kde command, but you can do that with the interactive plot utilities in wgd viz. From a terminal, start a Bokeh server instance and then run wgd viz as follows (replacing the file paths with your own, of course):

bokeh serve &  
wgd viz -i -ks ath-ptr.ks.tsv,ptr.ks.tsv -l "Arabidopsis - Populus,Populus"

This should open a browser window where you can manipulate the plots and, among other things, show multiple KDEs. It should look somewhat like this:

[Image: Screenshot from 2020-02-16 16-32-07]

liufuyan2016 commented 4 years ago

I am running this on a CentOS system. Is there a command to save the picture non-interactively? I used the command above and an error occurred:

 warnings.warn(_LEGEND_EMPTY_WARNING % attr)
BokehDeprecationWarning: 'legend' keyword is deprecated, use explicit 'legend_label', 'legend_field', or 'legend_group' keywords instead
BokehDeprecationWarning: 'legend' keyword is deprecated, use explicit 'legend_label', 'legend_field', or 'legend_group' keywords instead
2020-02-22 11:45:23,762 WebSocket connection opened
2020-02-22 11:45:23,763 ServerConnection created
BokehDeprecationWarning: ClientSession.loop_until_closed is deprecated, and will be removed in an eventual 2.0 release. Run Bokeh applications directly on a Bokeh server instead. See:

    https://docs.bokeh.org/en/latest/docs/user_guide/server.html

arzwa commented 4 years ago

This is the output of what command? I don't see an error, just warnings, so I don't think there are problems here. Again, to run the bokeh visualization module you should do the following:

  1. Run bokeh serve & to start a Bokeh server in the background. This might print some things to the terminal (like the warnings above, I think). In that case just hit <ENTER> to get back to the prompt.
  2. With the Bokeh server running, execute the wgd viz -i command with the appropriate arguments (see my example above).

Note that if you have some Python or R skills, you can always try loading the Ks distributions in R or Python and plot them yourself.
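
For the Python route, a minimal standard-library sketch of the loading step might look like the following. It assumes the Ks file is tab-separated with Ks, Family and Node columns (the column names used in the R snippet further down this thread). The function name node_averaged_ks and the small demo table are made up for illustration and are not part of wgd.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def node_averaged_ks(handle, lower=0.001, upper=5.0):
    """Return one mean Ks value per (Family, Node) pair, keeping lower < Ks < upper."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            ks = float(row["Ks"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable Ks value
        if lower < ks < upper:
            groups[(row["Family"], row["Node"])].append(ks)
    return [mean(values) for values in groups.values()]


# Made-up table standing in for a real Ks distribution file (assumption).
demo = io.StringIO(
    "Family\tNode\tKs\n"
    "GF1\t10\t0.5\n"
    "GF1\t10\t0.7\n"
    "GF1\t11\t2.0\n"
    "GF2\t7\t6.0\n"  # dropped by the upper bound
)
demo_ks = node_averaged_ks(demo)
print(sorted(demo_ks))
```

For a real file, pass an open file handle instead of the demo table, for example `with open("ptr.ks.tsv") as fh: ks = node_averaged_ks(fh)`. The resulting list can then be handed to whatever plotting library you prefer.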

liufuyan2016 commented 4 years ago

Thank you for the quick reply! I plan to use R. How can I output the fitted data that wgd kde uses for its plot?

arzwa commented 4 years ago

I'm not an R user, but you can fit a KDE to a Ks distribution as follows (see also #24):

# read in data
df = read.csv("your_distribution.tsv", sep="\t")

# filter Ks distribution (0.001 < Ks < 5) 
lower_bound = 0.001
upper_bound = 5
df = df[df$Ks < upper_bound,]
df = df[df$Ks > lower_bound,]

# perform node-averaging (redo when applying other filters)
dff = aggregate(df$Ks, list(df$Family, df$Node), mean)

# reflect the data around the lower Ks bound to account for boundary effects
ks = c(dff$x, -dff$x + lower_bound)

# plot a histogram and KDE on top
hist(ks, prob=TRUE, xlim=c(0, upper_bound), n=50)
lines(density(ks))  # lines() draws on the existing axes, so xlim is not needed here

This should be easy to adapt for multiple Ks distributions.
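
As a rough illustration of that adaptation, here is a Python sketch that fits a mirrored Gaussian KDE to several Ks distributions on a shared grid and prints the fitted curves as a tab-separated table, which could then be plotted in R or elsewhere. Everything in it is an assumption made for the example: the Ks values are randomly generated stand-ins, the function names are hypothetical, and the bandwidth uses a simple normal-reference rule, which is neither R's density() default nor necessarily what wgd kde does internally.

```python
import csv
import math
import random
import sys
from statistics import stdev


def reflected_kde(ks, grid, lower=0.001, bandwidth=None):
    """Gaussian KDE of ks plus mirrored copies (lower - x), evaluated on grid."""
    data = list(ks) + [lower - x for x in ks]  # same mirroring as the R snippet
    n = len(data)
    # Normal-reference bandwidth unless one is given (an assumption, see lead-in).
    h = bandwidth if bandwidth is not None else 1.06 * stdev(data) * n ** -0.2
    norm = n * h * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in data) / norm
        for g in grid
    ]


def fake_ks(mu, sigma, n=300):
    """Made-up log-normal Ks values, filtered to 0.001 < Ks < 5."""
    values = (random.lognormvariate(mu, sigma) for _ in range(n))
    return [v for v in values if 0.001 < v < 5]


random.seed(1)
# In practice each list would hold the node-averaged Ks values of one file.
distributions = {
    "Arabidopsis - Populus": fake_ks(0.6, 0.3),
    "Populus": fake_ks(-1.2, 0.4),
}

step = 0.05
grid = [i * step for i in range(101)]  # Ks from 0 to 5
curves = {label: reflected_kde(ks, grid) for label, ks in distributions.items()}

# One row per grid point, one density column per distribution.
writer = csv.writer(sys.stdout, delimiter="\t")
writer.writerow(["Ks"] + list(curves))
for i, g in enumerate(grid):
    writer.writerow([f"{g:.2f}"] + [f"{curves[label][i]:.5f}" for label in curves])
```

Because the mirrored copies carry half of the probability mass, each curve integrates to roughly 0.5 over positive Ks. Multiply the values by 2 if you want a density that integrates to about 1 over the plotted range.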