Open jflot opened 2 years ago
Unicycler chooses the 'best' assembly using contig count and dead end count, so I see why it made the choice it did. I don't, however, understand what's going on with SPAdes! I.e. why do k-mers 63+ result in such small assemblies?
If this is with the current version of Unicycler (v0.5.0), the raw SPAdes graphs should be in the output (prefixed with 001). They might shed some light on this. My hunch is that there is something weird/wrong with this read set, contamination maybe? But I don't know!
Ryan
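To compare the raw SPAdes graphs mentioned above, here is a minimal sketch that summarizes GFA files. It assumes the graphs are plain GFA text with tab-separated `S` (segment) and `L` (link) lines. The function name `summarize_gfa` is hypothetical and not part of Unicycler or SPAdes. The summed length is only a rough size estimate, since overlaps between linked segments are counted more than once.

```python
import sys


def summarize_gfa(lines):
    """Return (segment_count, total_bases, link_count) for GFA text lines.

    Hypothetical helper for illustration; not part of Unicycler or SPAdes.
    """
    segments = 0
    total_bases = 0
    links = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            segments += 1
            sequence = fields[2]
            if sequence != "*":
                total_bases += len(sequence)
            else:
                # Sequence omitted: fall back to an LN:i:<length> tag if present.
                for tag in fields[3:]:
                    if tag.startswith("LN:i:"):
                        total_bases += int(tag[5:])
                        break
        elif fields[0] == "L":
            links += 1
    return segments, total_bases, links


if __name__ == "__main__":
    # Pass one or more graph files as arguments, e.g. one per k-mer size.
    for path in sys.argv[1:]:
        with open(path) as handle:
            seg_count, bases, link_count = summarize_gfa(handle)
        print(f"{path}\tsegments={seg_count}\tbases={bases}\tlinks={link_count}")
```

Running it over the graph for each k-mer size should show at which k the total graph size collapses.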
I am facing the same issue. The final assembly selected by Unicycler is quite small (~300K).
Is there a way to use other metrics to select the best assembly?
> I don't, however, understand what's going on with SPAdes! I.e. why do k-mers 63+ result in such small assemblies?
The answer is quite simple. The input reads are likely only 75 bp long. So, once k goes past half of the read length, the graph is just a pile of barely connected reads, which the graph simplification algorithms then remove.
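The arithmetic behind this can be sketched as follows. A read of length L contains L - k + 1 k-mers, and two reads only connect in the graph if they overlap by at least k bases. The 75 bp read length is the guess from above, the k values are illustrative only, and `kmers_per_read` is a hypothetical helper.

```python
def kmers_per_read(read_length, k):
    """Number of k-mers in one read: L - k + 1, or 0 if the read is shorter than k."""
    return max(0, read_length - k + 1)


READ_LENGTH = 75  # assumed read length, as guessed in this thread

for k in (27, 47, 63, 77):  # illustrative k values, not necessarily those used
    count = kmers_per_read(READ_LENGTH, k)
    # Two reads must overlap by at least k bases to share a k-mer.
    print(f"k={k}: {count} k-mers per read, minimum overlap {k} of {READ_LENGTH} bp")
```

At k=63 each 75 bp read contributes only 13 k-mers and needs a 63 bp overlap with its neighbor, and at k=77 it contributes none at all.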
For this assembly, although the score does increase with each increase of k, the choice made seems particularly poor... Maybe the way the score is calculated is not suited to this type of situation?