linxingchen / cobra

A tool to raise the quality of viral genomes assembled from short-read metagenomes via resolving and joining of contigs fragmented during de novo assembly.
MIT License

Question on Input for cobra #16

Closed Hocnonsense closed 9 months ago

Hocnonsense commented 9 months ago

Thanks for this great tool! I have a few questions about the input for cobra:

  1. I'm currently using megahit for assembly (I just learned from #3 that it may generate many chimeric contigs, but spades may run out of memory when assembling environmental samples). Among its intermediate results, megahit provides a file named ./intermediate_contigs/k141.contigs.fa, which contains many small contigs < 200 bp (these are filtered out of ./final.contigs.fa). My question is: which contig file is recommended as the --fasta FASTA input?
  2. Is it possible to use final.contigs.fa directly as the --query QUERY file, or a filtered version containing only contigs longer than a given length (e.g. 1000 or 2500 bp)? In other words, can cobra be used before viral contig annotation and MAG binning?
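
For reference, the length-filtered query file mentioned in question 2 could be produced with something like the sketch below. It is only an illustration in plain Python: the helper names (`read_fasta`, `filter_by_length`, `write_fasta`) are hypothetical and not part of cobra or megahit, and whether cobra gives meaningful results with such a query is exactly what is being asked here.

```python
from typing import Iterable, Iterator, TextIO, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(handle: TextIO) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an open FASTA file."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records: Iterable[Record], min_length: int) -> Iterator[Record]:
    """Keep only records whose sequence is at least min_length bp long."""
    for header, seq in records:
        if len(seq) >= min_length:
            yield header, seq


def write_fasta(records: Iterable[Record], handle: TextIO, width: int = 60) -> None:
    """Write records as FASTA, wrapping sequence lines at `width` characters."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")
```

To try it, one would open `final.contigs.fa` for reading and an output file (any name, chosen by the user) for writing, then call `write_fasta(filter_by_length(read_fasta(src), 1000), dst)`. The 1000 bp threshold is just one of the example cutoffs from the question, not a recommended value.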

Regards, hwrn