soedinglab / metaeuk

MetaEuk - sensitive, high-throughput gene discovery and annotation for large-scale eukaryotic metagenomics
GNU General Public License v3.0

Can this be used with transcript contigs? #28

Closed: jolespin closed this issue 3 years ago

jolespin commented 3 years ago

I ran rnaSPAdes and was wondering if I could use MetaEuk for gene calls.

elileka commented 3 years ago

Hello, I am not familiar with the exact output of rnaSPAdes, but in general, if your data contain protein-coding information, MetaEuk should be able to detect it, provided there are homologs in the target database. Please note that if your data are processed (spliced) RNA, a gene that originally had multiple exons may appear as a "single exon". For the same reason, be careful when interpreting frame and strand. In the MetaEuk header, all positions are given with respect to the contig, which in your case is the RNA contig.
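
To make the point about header coordinates concrete, here is a minimal Python sketch that splits a MetaEuk-style header into its leading fields so that the strand, exon count, and contig coordinates can be inspected. The field layout (`T_acc|C_acc|S|bitscore|E-value|number_exons|low_coord|high_coord|exon_coords...`) is an assumption that should be checked against the MetaEuk README for your version. The names `MetaEukHeader` and `parse_header`, and the example header, are hypothetical and for illustration only.

```python
from typing import NamedTuple


class MetaEukHeader(NamedTuple):
    target: str    # accession of the homologous target protein
    contig: str    # input sequence the prediction sits on (here: an RNA contig)
    strand: str    # "+" or "-", relative to the contig
    bitscore: float
    evalue: float
    n_exons: int   # processed RNA may report 1 even for a multi-exon gene
    low: int       # lowest contig coordinate of the prediction
    high: int      # highest contig coordinate of the prediction


def parse_header(header: str) -> MetaEukHeader:
    """Parse the first eight '|'-separated fields of a MetaEuk-style header.

    ASSUMPTION: fields are ordered as
    T_acc|C_acc|S|bitscore|E-value|number_exons|low_coord|high_coord|exon_coords...
    Identifiers that themselves contain '|' would break this simple split.
    """
    fields = header.strip().lstrip(">").split("|")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    target, contig, strand, bitscore, evalue, n_exons, low, high = fields[:8]
    return MetaEukHeader(
        target, contig, strand,
        float(bitscore), float(evalue),
        int(n_exons), int(low), int(high),
    )


# Hypothetical header, made up for illustration (not real MetaEuk output).
example = parse_header(
    ">protA|NODE_1_length_900|+|250|1e-60|1|12|611|12[12]:611[611]:600[600]"
)
print(example.contig, example.strand, example.n_exons, example.low, example.high)
```

With transcript contigs, `low` and `high` are positions on the assembled transcript, not on the genome, and a single reported exon does not imply that the underlying gene lacks introns.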

jolespin commented 3 years ago

If you are running this on multiple MAGs, does it make any difference to the predictions whether all MAGs are concatenated into a single FASTA file or kept separate?

elileka commented 3 years ago

You can provide your sequences in a single file. MetaEuk treats each sequence in the FASTA input as a single contig (so, in an ideal world, as a whole genome). Because each sequence is handled on its own, it does not reduce redundancy between contigs or anything like that.
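
Since each FASTA record is treated as its own contig, one practical concern when pooling MAGs into a single file is keeping track of which MAG each contig came from. The following is a minimal Python sketch, not part of MetaEuk, that concatenates uncompressed FASTA files and prefixes every header with the file's base name. The function name `concatenate_mags`, the `__` separator, and the file names in the usage note are all hypothetical.

```python
from pathlib import Path
from typing import Iterable


def concatenate_mags(fasta_paths: Iterable[Path], out_path: Path, sep: str = "__") -> int:
    """Write all records from several FASTA files into one file.

    Each header is prefixed with the source file's stem (used here as the
    MAG identifier), so contigs stay traceable and header clashes between
    MAGs are avoided. Returns the number of sequences written.
    Assumes plain-text (uncompressed) FASTA input.
    """
    n_records = 0
    with open(out_path, "w") as out:
        for path in fasta_paths:
            mag_id = Path(path).stem  # e.g. "mag1" for "mag1.fa"
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        out.write(f">{mag_id}{sep}{line[1:]}\n")
                        n_records += 1
                    else:
                        out.write(line + "\n")
    return n_records
```

For example, `concatenate_mags([Path("mag1.fa"), Path("mag2.fa")], Path("all_mags.fa"))` would turn a header such as `>contig_1` from `mag1.fa` into `>mag1__contig_1`. The contig field of each prediction header can then be split on the separator to recover the MAG.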