arzwa / wgd

Python package and CLI for whole-genome duplication related analyses. This package is deprecated in favor of https://github.com/heche-psb/wgd.
http://wgd.readthedocs.io/en/latest/
GNU General Public License v3.0

Why can't wgd run syn in one_v_one mode? #36

Closed ShenChen-bioUtopia closed 4 years ago

ShenChen-bioUtopia commented 4 years ago

I want to know why wgd syn can't be used between two different genomes. Other pipelines (such as MCScan) always run the collinearity analysis first, and then calculate the Ks of the collinear paralogs.

arzwa commented 4 years ago

Hi, between-genome co-linearity inference is currently not implemented; this should come in a future version. However, note that if you do a between-genome co-linearity search using the orthologs obtained with wgd mcl or wgd dmd, you can obtain the Ks distribution for the 'syntelogs' simply by taking, from the Ks distribution computed with wgd ksd in one-vs.-one mode, only the gene pairs that correspond to anchors.

I do not understand the second part of your query. MCScan has a very different approach, I think, as it does not infer gene families before doing co-linearity searches. In wgd, the same gene families (e.g. inferred using wgd mcl or wgd dmd) are used both for computing the whole-paranome Ks distribution and for inferring co-linear blocks. The anchor pair Ks distribution is not constructed in the wgd syn command; it is obtained by taking the subset of the full whole-paranome Ks distribution that corresponds to co-linear paralogous pairs.
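The subsetting step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names (`gene1`, `gene2`, `Ks`), the two-column anchor list, and the gene IDs are hypothetical toy values, not the actual output layout of wgd ksd or of any co-linearity tool, so adapt them to the files you really have.

```python
import csv
import io


def pair_key(gene_a, gene_b):
    """Order-independent key, so (a, b) and (b, a) match the same pair."""
    return frozenset((gene_a, gene_b))


def read_anchor_pairs(handle):
    """Read a two-column, tab-separated anchor list (assumed layout)."""
    reader = csv.reader(handle, delimiter="\t")
    return {pair_key(row[0], row[1]) for row in reader if len(row) >= 2}


def anchor_ks(ks_handle, anchors):
    """Keep only the rows of the Ks table whose gene pair is an anchor."""
    kept = []
    for row in csv.DictReader(ks_handle, delimiter="\t"):
        if pair_key(row["gene1"], row["gene2"]) in anchors:
            kept.append((row["gene1"], row["gene2"], float(row["Ks"])))
    return kept


# Toy stand-ins for the real files (made-up IDs and values).
KS_TSV = (
    "gene1\tgene2\tKs\n"
    "spA_g1\tspB_g1\t0.42\n"
    "spA_g2\tspB_g7\t1.10\n"
    "spA_g3\tspB_g3\t0.39\n"
)
ANCHORS_TSV = "spB_g1\tspA_g1\nspA_g3\tspB_g3\n"

anchors = read_anchor_pairs(io.StringIO(ANCHORS_TSV))
result = anchor_ks(io.StringIO(KS_TSV), anchors)
print(result)
```

The pairs are keyed with a `frozenset` because the anchor list and the Ks table may well list the two genes of a pair in opposite order. With real data, replace the `io.StringIO` objects with opened files.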