cjneely10 / EukMetaSanity

Structural and functional annotation of eukaryotic metagenome-assembled genomes
GNU General Public License v3.0

EukMetaSanity on unbinned contigs #16

Closed by mgabriell1 2 years ago

mgabriell1 commented 2 years ago

Hi, thanks for developing this very comprehensive pipeline! Is it possible to use it to annotate assembled contigs directly rather than bins? Evaluating genome quality with BUSCO would certainly need to be skipped, as no genome bin would be involved, but I'm wondering whether the other steps require bins as well. I'm asking because, although I managed to recover eukaryotic contigs from a mixed-community metagenome, I suspect the sequencing or sampling procedure did not allow much eukaryotic recovery: when I try to perform binning, the resulting bins are very short (< 2 Mbp). Thanks!

cjneely10 commented 2 years ago

Hi @mgabriell1,

Thank you for your interest in the project!

While I have not directly attempted this kind of analysis with EukMetaSanity, my inclination is that certain programs in the pipeline may perform poorly. These programs expect to work with a single "organism," and they attempt to infer information (about intron/exon boundaries, etc.) based on this assumption. However, parts of the pipeline involve protein database searches, and, likely, these would not be negatively impacted.

If at all possible, I would attempt to bin the contigs prior to running EukMetaSanity. If they are not binned, I would suggest using the "MetaEukEV" output that is generated by the Run pipeline instead of the other outputs. You may also choose to set the "AbinitioAugustus" and "AbinitioGeneMark" config sections' skip values from false to true.
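As a rough illustration of the skip change described above, here is a minimal Python sketch that flips the skip value for the two ab initio sections in an already-parsed config. The section names come from this reply; the nested layout, the `skip` key name, the `skip_sections` helper, and the example config are assumptions for illustration and may not match the actual EukMetaSanity config file, so check your own config before relying on it.

```python
from copy import deepcopy

# Section names are taken from the reply above. The "skip" key and the
# nested dict layout are assumptions about how the config is structured.
SECTIONS_TO_SKIP = ("AbinitioAugustus", "AbinitioGeneMark")


def skip_sections(config, names=SECTIONS_TO_SKIP):
    """Return a copy of config with skip set to True for each named section."""
    updated = deepcopy(config)
    for name in names:
        if name not in updated:
            raise KeyError(f"config has no section named {name!r}")
        updated[name]["skip"] = True
    return updated


# Hypothetical, already-parsed config; a real one would be loaded from
# the pipeline's config file and contain many more keys.
example_config = {
    "AbinitioAugustus": {"skip": False},
    "AbinitioGeneMark": {"skip": False},
    "MetaEukEV": {"skip": False},
}

new_config = skip_sections(example_config)
print(new_config["AbinitioAugustus"]["skip"], new_config["MetaEukEV"]["skip"])
```

The helper works on a copy so the original config stays untouched, and it raises an error if a section name is missing rather than silently adding a new one.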

Have a great day!

Chris

mgabriell1 commented 2 years ago

Yeah, that's similar to what I thought. Thank you so much for this reply and your suggestions!

Marco