zx0223winner / HSDFinder

a tool to predict highly similar duplicates (HSDs) in eukaryotes
MIT License

Tools for finding the gene duplications in genomes #1

Open zx0223winner opened 1 year ago

zx0223winner commented 1 year ago

Here is an enquiry email sent by a current user; it may help others who have similar questions.

Hi, I'm very interested in identifying the duplications in fish genomes.

Can I use HSDFinder to predict the duplications from CDS sequences?

How accurate would the results be compared with OrthoFinder?

Suggestions appreciated.

zx0223winner commented 1 year ago

Hi, many thanks for your interest. Absolutely, I can help with your question. Please see the model fish genomes we have pre-processed (http://hsdfinder.com/database/hsd/10/1/; http://hsdfinder.com/database/hsd/14/1/), which should give you an idea of how the duplicates were collected and how many were found. You are welcome to tailor the tool to your own fish genome.

We don't recommend using CDS sequences to identify duplicates with HSDFinder. As with OrthoFinder, protein sequences should be used as the input file, because amino acid sequences allow more sequences to be matched.
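If only CDS sequences are at hand, they can be translated to protein before running the search. The following is a minimal sketch, not part of HSDFinder itself: it assumes in-frame CDS sequences and the standard genetic code, and the function name `translate_cds` is hypothetical. It also illustrates why protein input matches more readily: two CDS that differ only by synonymous codons give the same protein.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate an in-frame CDS to protein, stopping at the first stop codon.

    Hypothetical helper for illustration; codons containing ambiguous
    bases are translated as 'X'.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Two CDS that differ at three nucleotide positions but encode the same protein.
print(translate_cds("ATGGCCAAATAA"))  # MAK
print(translate_cds("ATGGCTAAGTGA"))  # MAK
```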

OrthoFinder is not specifically designed for detecting duplicates; it relies on tree topology. HSDFinder, in contrast, lets users flexibly explore gene duplicates at different cut-off levels (based on sequence similarity).
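To make the idea of similarity cut-offs concrete, here is a rough sketch of filtering an all-against-all protein BLAST result in the standard 12-column tabular format (`-outfmt 6`). It is an illustration under stated assumptions, not HSDFinder's actual implementation: the function name `candidate_duplicate_pairs`, the example gene names, and the threshold values are all hypothetical and should be adjusted to your own data.

```python
import csv


def candidate_duplicate_pairs(blast_lines, seq_lengths,
                              min_identity=90.0, max_length_diff=10):
    """Return sorted gene pairs that pass the identity and length cut-offs.

    blast_lines: iterable of BLAST tabular lines (qseqid, sseqid, pident, ...).
    seq_lengths: dict mapping sequence ID to protein length in amino acids.
    The default thresholds are illustrative assumptions only.
    """
    pairs = set()
    for row in csv.reader(blast_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        query, subject, pident = row[0], row[1], float(row[2])
        if query == subject:
            continue  # skip self-hits
        if pident < min_identity:
            continue
        if abs(seq_lengths[query] - seq_lengths[subject]) > max_length_diff:
            continue
        # Store each pair once, regardless of hit direction.
        pairs.add(tuple(sorted((query, subject))))
    return sorted(pairs)


# Made-up example hits, for illustration only.
EXAMPLE = """\
geneA\tgeneA\t100.0\t300\t0\t0\t1\t300\t1\t300\t0.0\t600
geneA\tgeneB\t95.0\t298\t15\t0\t1\t298\t1\t298\t1e-150\t550
geneA\tgeneC\t72.0\t290\t81\t0\t1\t290\t1\t290\t1e-90\t350
geneB\tgeneA\t95.0\t298\t15\t0\t1\t298\t1\t298\t1e-150\t550
"""
LENGTHS = {"geneA": 300, "geneB": 298, "geneC": 340}

strict = candidate_duplicate_pairs(EXAMPLE.splitlines(), LENGTHS)
relaxed = candidate_duplicate_pairs(EXAMPLE.splitlines(), LENGTHS,
                                    min_identity=70.0, max_length_diff=50)
print(strict)   # [('geneA', 'geneB')]
print(relaxed)  # [('geneA', 'geneB'), ('geneA', 'geneC')]
```

Relaxing the cut-offs brings in more divergent pairs, which is the kind of exploration across thresholds described above.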

Please see our related publications below for more details; we would also be glad to collaborate. Hope this message helps.

Xi Zhang, Yining Hu, David Roy Smith (2022). HSDatabase – a database of highly similar duplicate genes from plants, animals, and algae. bioRxiv 2022.08.01.502183. doi: https://doi.org/10.1101/2022.08.01.502183

Xi Zhang, Yining Hu, David Roy Smith (2021). HSDFinder: a BLAST-based strategy to search for highly similar duplicated genes in eukaryotic genomes. Frontiers in Bioinformatics. doi: 10.3389/fbinf.2021.803176

Xi Zhang, Yining Hu, David Roy Smith (2021). Protocol for HSDFinder: Identifying, annotating, categorizing, and visualizing duplicated genes in eukaryotic genomes. doi: https://doi.org/10.1016/j.xpro.2021.100619

Xi Zhang, et al., David Roy Smith (2021). Draft genome sequence of the Antarctic green alga Chlamydomonas sp. UWO241. doi: https://doi.org/10.1016/j.isci.2021.102084

~Xi