rrwick / Trycycler

A tool for generating consensus long-read assemblies for bacterial genomes
GNU General Public License v3.0

Zero coverage at clustering #7

Closed martinmchugh closed 3 years ago

martinmchugh commented 3 years ago

Thanks @rrwick for developing this tool, and for making the wiki so clear and easy to follow!

I'm not sure this is really a trycycler issue but wanted your thoughts. I've done long read subsampling and assembly as suggested in the wiki - 12 read sets, assembled with flye, miniasm+minipolish, raven, redbean.

When I come to the clustering step it runs fine but I notice there are 2-3 contigs in the redbean assemblies with coverage of zero. I've given trycycler the complete read set, which I previously subsampled to generate the redbean assemblies. Apart from using the wrong read sets I can't work out how I could get zero coverage.

Do you think this might be an error during the redbean assembly, or during the clustering? Or something else?

rrwick commented 3 years ago

While this is indeed a bit weird, I doubt it's a problem.

Trycycler cluster gets read depth values by aligning the reads back to the assembly, taking the single best alignment per read, and using those alignments to see how deeply covered each contig is.

So if your Redbean assemblies have contigs with a depth of zero, that means they didn't receive any alignments. I suspect they are short/redundant/junk contigs, so any reads that could have aligned to them have preferentially aligned to other contigs instead. I assume that Trycycler cluster has filtered these contigs out (based on the --min_contig_depth setting), which is probably a good thing.
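To make the reasoning above concrete, here is a minimal Python sketch of how "one best alignment per read" can leave a redundant contig with zero depth. This is not Trycycler's actual code: the function name `contig_depths`, the tuple layout, the scores and the `min_depth` threshold are all hypothetical, and the real `--min_contig_depth` setting may be interpreted differently (for example, relative to other contigs rather than as an absolute value).

```python
from collections import defaultdict


def contig_depths(alignments, contig_lengths):
    """Estimate per-contig depth using only each read's best alignment.

    alignments: iterable of (read_name, contig_name, score, aligned_bases)
    contig_lengths: dict mapping contig_name -> length in bases
    """
    best = {}
    for read, contig, score, aligned_bases in alignments:
        # Keep only the highest-scoring alignment for each read.
        if read not in best or score > best[read][1]:
            best[read] = (contig, score, aligned_bases)

    total_bases = defaultdict(int)
    for contig, _score, aligned_bases in best.values():
        total_bases[contig] += aligned_bases

    # Depth = aligned bases / contig length; contigs with no alignments get 0.0.
    return {name: total_bases[name] / length
            for name, length in contig_lengths.items()}


# Hypothetical data: "junk" duplicates part of "chromosome", so every read
# that could align to it has a better-scoring alignment elsewhere.
contig_lengths = {"chromosome": 1000, "junk": 100}
alignments = [
    ("read1", "chromosome", 950, 1000),
    ("read1", "junk", 90, 100),
    ("read2", "chromosome", 900, 1000),
    ("read2", "junk", 80, 100),
    ("read3", "chromosome", 500, 500),
]

depths = contig_depths(alignments, contig_lengths)
print(depths)  # {'chromosome': 2.5, 'junk': 0.0}

# Hypothetical absolute threshold standing in for a minimum-depth filter.
min_depth = 0.1
kept = [name for name, depth in depths.items() if depth >= min_depth]
print(kept)  # ['chromosome']
```

In this toy case the "junk" contig is a perfectly valid alignment target, but it never wins a read, so its depth is exactly zero and any positive depth threshold removes it.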

I am curious why Redbean made such contigs, but it's not really a cause for concern. So don't worry about it :smile:

martinmchugh commented 3 years ago

Yeah, they don't make it into the .newick, but I can see them listed in the log. This happened with two read sets for different isolates, although the same species (Enterococcus faecium). Makes sense if there is a greedy approach to alignment. Carried on through Trycycler and the output looks good! Thanks