Closed anandksrao closed 2 years ago
Hi @anandksrao
apart from the manual (http://eead-csic-compbio.github.io/get_homologues/manual-est) and the tutorial (http://eead-csic-compbio.github.io/get_homologues/tutorial/pangenome_tutorial.html), we worked last year on an updated step-by-step protocol for plants. It is about to be published, but you can already use it at
http://eead-csic-compbio.github.io/get_homologues/plant_pangenome/protocol.html
Please use that as a guide; your question relates to point 3.3.5. Let me know if anything is not clear.
We have used it at the species level (barley and Arabidopsis thaliana here) and at the genus level with outgroups from other genera (Brachypodium here and here). However, because BLASTN computes homologous sequences from nucleotide alignments, it should not be used across long taxonomic distances: nucleotide distances saturate, and BLASTN megablast hardly goes below 70% sequence identity. In that case protein sequences are more appropriate (as done by standard GET_HOMOLOGUES), but even then you probably want phylogeny-based orthology calls to carry out analyses at the Kingdom level.
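To make the idea of lineage-specific genes concrete, here is a minimal Python sketch of the underlying logic. It is not GET_HOMOLOGUES-EST code and does not read its output files. It assumes a presence/absence matrix has already been loaded into a dictionary mapping each cluster to per-taxon sequence counts. The function name `lineage_specific_clusters`, the `min_fraction` parameter and the toy taxon and cluster names are all hypothetical, chosen only for illustration.

```python
def lineage_specific_clusters(matrix, lineage, min_fraction=1.0):
    """Return clusters found only inside `lineage`.

    matrix       : dict mapping cluster name -> {taxon: sequence count}
                   (assumed layout, not a GET_HOMOLOGUES-EST file format)
    lineage      : taxa that make up the lineage of interest
    min_fraction : fraction of lineage taxa that must contain the cluster
                   (1.0 = present in every member of the lineage)
    """
    lineage = set(lineage)
    if not lineage:
        raise ValueError("lineage must contain at least one taxon")

    hits = []
    for cluster, counts in matrix.items():
        # Taxa in which this cluster has at least one sequence
        present = {taxon for taxon, n in counts.items() if n > 0}
        inside = present & lineage
        outside = present - lineage
        # Keep clusters absent from all other taxa and present in
        # enough members of the lineage
        if not outside and len(inside) >= min_fraction * len(lineage):
            hits.append(cluster)
    return sorted(hits)


# Toy matrix with made-up taxa and clusters
toy_matrix = {
    "cluster_1": {"taxonA": 1, "taxonB": 1, "outgroup": 1},
    "cluster_2": {"taxonA": 1, "taxonB": 2, "outgroup": 0},
    "cluster_3": {"taxonA": 1, "taxonB": 0, "outgroup": 0},
    "cluster_4": {"taxonA": 0, "taxonB": 0, "outgroup": 1},
}

strict = lineage_specific_clusters(toy_matrix, ["taxonA", "taxonB"])
relaxed = lineage_specific_clusters(toy_matrix, ["taxonA", "taxonB"], min_fraction=0.5)
print(strict)
print(relaxed)
```

Relaxing `min_fraction` allows for clusters missing from some members of the lineage, which can happen with incomplete transcriptomes or annotations. Whatever the threshold, absence from the other taxa only means that no hit passed the alignment cutoffs, so the identity limits of BLASTN described above apply directly to how such a list should be interpreted.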
Hope this helps, Bruno
Greetings!
Can your GET_HOMOLOGUES-EST (plants) be used directly, or simply adapted, for the additional purpose of reporting lineage-specific genes, not just pangenome analyses?
And would the ability to perform such analyses extend regardless of taxonomic level, i.e
If yes, would there however be any non-obvious caveats to performing such analyses and/or interpreting their results?
Thank you in advance.