songmj86 closed this issue 5 months ago
Good question. You could take an average of the per-contig scores, weighted by contig length.
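As a rough illustration of the length-weighted average, here is a minimal sketch. It assumes you have already collected one score and one length per contig of the MAG; the function name and the example values are hypothetical and are not part of geNomad.

```python
def length_weighted_mean(scores, lengths):
    """Average per-contig scores, weighting each score by its contig length."""
    if len(scores) != len(lengths):
        raise ValueError("scores and lengths must have the same number of items")
    total_length = sum(lengths)
    if total_length == 0:
        raise ValueError("total contig length must be greater than zero")
    return sum(s * n for s, n in zip(scores, lengths)) / total_length


# Hypothetical per-contig plasmid scores and contig lengths (bp) for one MAG.
plasmid_scores = [0.95, 0.20, 0.10]
contig_lengths = [5_000, 120_000, 75_000]

mag_plasmid_score = length_weighted_mean(plasmid_scores, contig_lengths)
print(f"length-weighted plasmid score: {mag_plasmid_score:.3f}")
```

With this weighting, a short contig with a high score contributes little to the overall value, while long contigs dominate it.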
Alternatively, you can concatenate them before classification, adding a couple of Ns between each contig (to prevent gene calls extending across different contigs). This would allow geNomad to leverage the information of all contigs when performing classification.
If one of those contigs/scaffolds is annotated as a plasmid by geNomad (for example, with a plasmid score > 0.9), can I regard that contig/scaffold as a plasmid rather than chromosome?
To answer this quickly: no. Plasmid contigs may bin with chromosomal sequences, so it is not advisable to assume that the entire bin represents plasmid content.
Thanks for your quick response.
If I choose the alternative option, I would need to follow these steps, for example:
Step 1. Prepare the contigs to be concatenated
Contig1 ATCCGCATC ... ATCCGCATC
Contig2 CTGACGTAC ... CTGACGTAC
Step 2. Attach five Ns to both ends of each contig
Contig1 NNNNNATCCGCATC ... ATCCGCATCNNNNN
Contig2 NNNNNCTGACGTAC ... CTGACGTACNNNNN
Step 3. Run geNomad
Did I understand correctly?
I sincerely appreciate your help!
You need to concatenate the contigs into a single sequence, so that geNomad will process the whole thing as one entity. Like this:
>seq
<contig 1 sequence>NNNNNNNNNN<contig 2 sequence>NNNNNNNNNN<contig 3 sequence>...
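If it helps, the concatenation above can be sketched in plain Python. This is only an illustration under a few assumptions: the helper names (`read_fasta`, `concatenate_contigs`, `write_single_record`) are made up for this example, the spacer of 10 Ns mirrors the snippet above, and the toy sequences are invented. It does not use or depend on any geNomad functionality.

```python
import io


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def concatenate_contigs(records, spacer_len=10):
    """Join contig sequences into one string, with a run of Ns between contigs."""
    spacer = "N" * spacer_len
    return spacer.join(seq for _, seq in records)


def write_single_record(name, sequence, handle, width=60):
    """Write one FASTA record, wrapping the sequence at `width` characters."""
    handle.write(f">{name}\n")
    for start in range(0, len(sequence), width):
        handle.write(sequence[start:start + width] + "\n")


# Toy input standing in for the contigs of one MAG (invented sequences).
toy_fasta = ">contig1\nATCCGCATC\n>contig2\nCTGACGTAC\n>contig3\nGGGTTTAAA\n"
records = list(read_fasta(toy_fasta.splitlines()))
merged = concatenate_contigs(records, spacer_len=10)

out = io.StringIO()  # replace with open("mag_concatenated.fna", "w") to write a file
write_single_record("seq", merged, out)
print(out.getvalue())
```

Note that the Ns go only between contigs, so the result is a single record rather than one padded record per contig.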
Hi, I have another question!
Bacterial MAGs are mostly composed of multiple fragmented contigs/scaffolds.
If one of those contigs/scaffolds is annotated as a plasmid by geNomad (for example, with a plasmid score > 0.9), can I regard that contig/scaffold as a plasmid rather than chromosome?
Thank you very much!