How to make this pipeline identify the duplicate gene modes of autotetraploid genome?

qiao-xin / DupGen_finder

A pipeline used to identify different modes of duplicated gene pairs

#26 · Open · wanan8832760 opened 2 weeks ago

wanan8832760 commented 2 weeks ago

Hi, how can I use this pipeline to identify the duplicate gene modes of an autotetraploid genome? Should I use the haplotype-resolved genome or the whole genome? Could you advise? Thank you so much!

qiao-xin commented 2 weeks ago

Please use the haplotype-resolved genome.

wanan8832760 commented 2 weeks ago

Dear Dr. Qiao, thanks for your quick reply. I have only one autotetraploid species and want to identify its duplicate gene modes using the species itself. Can I use its haplotype-resolved genome as both the target and the outgroup species? Thanks again, and have a nice day.

qiao-xin commented 2 weeks ago

The choice of outgroup species influences the identification of transposed gene pairs. You can use the target species as the outgroup, but choosing a closely related species as the outgroup is better. Please refer to https://github.com/qiao-xin/DupGen_finder/issues/5

wanan8832760 commented 2 weeks ago

OK, I will try it. Thank you very much for your quick reply!

wanan8832760 commented 2 weeks ago

Dear Dr. Qiao, I followed your suggestion to use the haplotype-resolved genome for the autotetraploid, and it worked well overall. However, I found many duplicated entries in the overall results file (.pairs-unique). My subsequent analyses include using Venn diagrams to assess shared and specific gene duplications among significantly expanded genes (SEGs) and the five duplication categories, and calculating Ka/Ks ratios for these categories. Should I manually delete the duplicated entries for these analyses? If so, should the data in the .pairs.stats-unique file also be modified? Your guidance would be greatly appreciated. Thank you so much!

qiao-xin commented 2 weeks ago

I am not clear about your downstream analyses, but I don't think the duplicated gene IDs should be removed, because they are involved in different gene pairs.
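
To illustrate the distinction, here is a minimal Python sketch. It is not part of DupGen_finder and does not read its output format. It assumes the gene pairs have already been parsed into (gene1, gene2) tuples; the gene IDs and the helper names `unique_pairs` and `genes_in_multiple_pairs` are hypothetical. It separates two cases: the same pair listed more than once (which could be collapsed) and the same gene ID appearing in several different pairs (which is expected and should be kept).

```python
from collections import Counter


def unique_pairs(pairs):
    """Drop repeated rows of the same unordered pair, keeping first-seen order."""
    seen = set()
    kept = []
    for a, b in pairs:
        key = frozenset((a, b))  # (a, b) and (b, a) count as the same pair
        if key not in seen:
            seen.add(key)
            kept.append((a, b))
    return kept


def genes_in_multiple_pairs(pairs):
    """Return {gene_id: n_pairs} for genes that occur in more than one pair."""
    counts = Counter(gene for pair in pairs for gene in set(pair))
    return {gene: n for gene, n in counts.items() if n > 1}


# Hypothetical gene IDs, for illustration only.
pairs = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),  # geneA again, but in a different pair: keep
    ("geneB", "geneA"),  # same pair as the first row, reversed: collapse
]

kept = unique_pairs(pairs)
shared = genes_in_multiple_pairs(kept)
print(kept)
print(shared)
```

In this toy input, geneA stays in two pairs after collapsing, which is the situation described above: the gene ID repeats, but each row is a distinct pair. For a Venn diagram of genes per category, building a set of gene IDs per category would remove the repetition without editing the pair files.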