yongzhiyang2012 / WGDdetector


CDS corresponding with protein! #4

Open sunnycqcn opened 4 years ago

sunnycqcn commented 4 years ago

Hello, in the example files, the CDS names do not correspond to the protein names. I am new to this and do not understand the relationship between the CDS and the protein files. Were the proteins obtained by translating the CDS? If I want to compare the WGD results of several species, can I combine all the proteins, or should I analyze the species one by one? Thanks, Fuyou

yongzhiyang2012 commented 4 years ago

Hi,

The scripts skip any IDs that differ between the CDS and protein files; only sequences whose IDs are present in both files are used in the downstream analysis. To run the whole pipeline, you should supply both the CDS and the protein files. If you only have the CDS file, you can translate it with a script or with an online tool (for example: https://web.expasy.org/translate/ ; https://www.ebi.ac.uk/Tools/st/emboss_transeq/). WGDdetector is designed to detect WGDs within a single species. Because speciation and other events will influence the interpretation of the results, the best approach for multiple species is to analyze them one by one.
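For readers who only have a CDS file, the translation and ID-matching steps described above can be sketched in plain Python. This is a minimal illustration, not WGDdetector's own code: the function names (`read_fasta`, `translate`, `shared_ids`) are hypothetical, the FASTA ID is assumed to be the first word of each header line, and the standard genetic code (NCBI translation table 1) is assumed, which may not suit every organism.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
# Assumption: the organism uses the standard nuclear code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta(text):
    """Parse FASTA text into {id: sequence}; the ID is the first word of the header."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


def translate(cds):
    """Translate a CDS in frame 1, stopping at the first stop codon.

    Codons containing ambiguous bases are translated as 'X';
    trailing bases that do not fill a codon are ignored.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def shared_ids(cds_records, protein_records):
    """Return the sorted IDs present in both the CDS and the protein records."""
    return sorted(set(cds_records) & set(protein_records))


# Toy input (made up for illustration).
cds = read_fasta(">g1 example\nATGGCT\nTAA\n>g2\nATGAAATGA\n")
proteins = {seq_id: translate(seq) for seq_id, seq in cds.items()}
other_proteins = read_fasta(">g1\nMA\n>g3\nMK\n")
common = shared_ids(cds, other_proteins)
```

Translating the CDS file yourself guarantees that every protein ID has a matching CDS ID, so nothing is skipped for a name mismatch.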

sunnycqcn commented 4 years ago

Thanks, Fuyou

sunnycqcn commented 4 years ago

Hello, I got the result shown in the attached figure. I am not sure whether it looks good. I would greatly appreciate your help. Fuyou (attachment: final.ks.distribution.list.pdf)

yongzhiyang2012 commented 4 years ago

The result looks correct, and the ancient duplication peak is very clear (Ks: ~2). The peak around zero may reflect tandem duplications or assembly errors, such as redundant haplotigs.
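To compare peak positions across many genomes without reading each plot by eye, the main peak can be located with a simple histogram. The sketch below is only an illustration: the function names, the bin width, and the `ks_min` cutoff used to ignore the near-zero peak are all assumptions, and it takes a plain list of Ks values because the format of the WGDdetector output file is not described in this thread.

```python
from collections import Counter


def ks_histogram(ks_values, bin_width=0.1, ks_max=5.0):
    """Count Ks values per bin; the bin index is floor(ks / bin_width)."""
    counts = Counter()
    for ks in ks_values:
        if 0 <= ks < ks_max:
            counts[int(ks / bin_width)] += 1
    return counts


def main_peak(ks_values, bin_width=0.1, ks_min=0.2, ks_max=5.0):
    """Return the centre of the fullest bin, ignoring values below ks_min.

    ks_min is a hypothetical cutoff meant to exclude the near-zero peak
    (tandem duplicates, redundant haplotigs). Returns None if no value
    falls in the range [ks_min, ks_max).
    """
    kept = [ks for ks in ks_values if ks >= ks_min]
    counts = ks_histogram(kept, bin_width, ks_max)
    if not counts:
        return None
    best_bin = max(counts, key=counts.get)
    return (best_bin + 0.5) * bin_width


# Toy values (made up): a near-zero cluster plus a cluster around Ks = 2.
toy_ks = [0.01, 0.02, 0.03, 0.01, 0.02, 1.93, 2.02, 2.04, 2.07, 2.05, 2.31, 3.1]
peak = main_peak(toy_ks)
```

The fullest bin is a crude estimator. Kernel density estimates or mixture models are common alternatives when peaks overlap.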

sunnycqcn commented 4 years ago

Hello, thank you very much for your suggestions. I now understand some of it. I compared the WGD results for the genomes of 160 fungal isolates. They show almost the same pattern, but the peak values differ: some are about 2, some are greater than 2, and some are less than 2. Can I say that whole-genome duplication contributed to the adaptive evolution of the fungal genomes? Or what other information can I get from this? Thanks, Fuyou

yongzhiyang2012 commented 4 years ago

You need more information to determine whether the WGD events are a single shared event or occurred independently within each species. You should try methods based on tree topology (phylogenetic methods) or Ks correction (because different species have different mutation rates).
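One possible way to approach Ks correction is a three-taxon relative-rate calculation, sketched below. It is not a method confirmed in this thread and not part of WGDdetector. It assumes you already have ortholog Ks peak values for the pairs A-B, A-O, and B-O, where O is an outgroup, and that each lineage has evolved at a constant rate since A and B split. All names and numbers are hypothetical.

```python
def branch_lengths(ks_ab, ks_ao, ks_bo):
    """Split the A-B ortholog Ks into the parts accumulated on each lineage.

    With an outgroup O, the pairwise distances satisfy
        ks_ab = a + b,  ks_ao = a + o,  ks_bo = b + o,
    which gives a = (ks_ab + ks_ao - ks_bo) / 2, and likewise for b.
    """
    a = (ks_ab + ks_ao - ks_bo) / 2
    b = (ks_ab + ks_bo - ks_ao) / 2
    return a, b


def rescale_to_reference(paralog_ks, ref_branch, own_branch):
    """Express a paralog Ks peak on the reference species' rate scale.

    Crude assumption: the whole paralog Ks accumulated at the species'
    own post-split rate, so it can be scaled by ref_branch / own_branch.
    """
    if own_branch <= 0:
        raise ValueError("own_branch must be positive")
    return paralog_ks * ref_branch / own_branch


# Toy numbers (made up): species B evolves more slowly than species A.
a, b = branch_lengths(ks_ab=1.0, ks_ao=1.4, ks_bo=1.2)
peak_b_on_a_scale = rescale_to_reference(2.0, ref_branch=a, own_branch=b)
```

If peaks that differ before rescaling fall close together afterwards, that is consistent with one shared event seen through different substitution rates. Tree-based evidence is still needed to confirm it.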

sunnycqcn commented 4 years ago

I have finished the tree analysis with OrthoFinder. I will try Ks correction next. Thanks, Fuyou