Open christianbrinch opened 1 year ago
This is strange. Does the problem reproduce after a restart? And without the --trusted-contigs option?
It persists with restart but goes away without the --trusted-contigs option.
Ok. Would it be possible for you to share the data so we can reproduce and fix the issue?
Unfortunately, I am not able to share the data, and I understand that this makes it difficult to resolve the issue. I can, however, explain what I have done. I have a large set (100+) of metagenomic samples, consisting of MiSeq, NextSeq, and NovaSeq reads, from which a MAG has been assembled in a large metagenomic co-assembly. The MAG is about 60% complete and I am trying to extract the full genome. My strategy is to align all my samples against the MAG, extract the reads that align, de novo assemble those reads with the MAG as trusted contigs, curate the resulting scaffolds, and use them as my new MAG. Then I repeat the cycle. After five such iterations, I have reached about 75% completeness, and it works quite well: the contigs grow by a few hundred bases per iteration, as expected. However, at the 6th iteration, SPAdes all of a sudden creates contigs with non-ASCII characters in them.
Because the error goes away when I drop the --trusted-contigs option, it must be caused by the set of contigs I pass to it. I curate my contigs using Geneious, so maybe it outputs something SPAdes doesn't like? I can't find any non-standard characters in those FASTA files, though, but I will investigate the issue a bit further myself.
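For the check on the curated contigs, one option is to scan the FASTA file byte by byte, since stray bytes (a byte-order mark, Windows line endings, control characters) are easy to miss in an editor. The following is only a minimal sketch, assuming a plain, uncompressed FASTA file. The function name `scan_fasta` and the allowed alphabet (IUPAC nucleotide codes only) are my own illustrative choices, not anything taken from SPAdes or Geneious, and a hit does not by itself prove that the character is what triggers the problem.

```python
import sys

# Assumption: sequence lines should contain only IUPAC nucleotide codes.
# Extend this set if the file legitimately uses other symbols (e.g. gaps).
ALLOWED = set(b"ACGTUNRYKMSWBDHVacgtunrykmswbdhv")
BOM = b"\xef\xbb\xbf"  # UTF-8 byte-order mark that some tools prepend


def scan_fasta(path):
    """Yield (line_number, column, byte_value, reason) for each suspicious byte."""
    with open(path, "rb") as handle:  # binary mode: no decoding, no newline translation
        for line_no, raw in enumerate(handle, start=1):
            line = raw.rstrip(b"\n")  # keep "\r" so CRLF line endings are reported
            is_header = line.lstrip(BOM).startswith(b">")
            for col, byte in enumerate(line, start=1):
                if byte == 0x0D:
                    yield line_no, col, byte, "carriage return"
                elif byte > 0x7F or (byte < 0x20 and byte != 0x09):
                    yield line_no, col, byte, "non-ASCII or control byte"
                elif not is_header and byte not in ALLOWED:
                    yield line_no, col, byte, "not an IUPAC nucleotide code"


if __name__ == "__main__":
    # Usage (hypothetical file name): python scan_fasta.py trusted_contigs.fasta
    for fasta_path in sys.argv[1:]:
        for line_no, col, byte, reason in scan_fasta(fasta_path):
            print(f"{fasta_path}:{line_no}:{col}: 0x{byte:02x} ({reason})")
```

Running the same scan on both the trusted-contigs file and the SPAdes output contigs would show whether the odd bytes are already present in the input or only appear after assembly.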
Description of bug
After de novo assembly of paired-end Illumina reads, SPAdes outputs contigs/scaffolds that contain non-ASCII characters. The input FASTA files do not contain these characters.
Example
spades.log
spades.log.zip
params.txt
params.txt.zip
SPAdes version
SPAdes/3.15.5
Operating System
Ubuntu/CentOS
Python Version
python3.9
Method of SPAdes installation
manual
No errors reported in spades.log