nextgenusfs closed this issue 6 years ago
Unfortunately, you updated to tip in the middle of some large-scale changes. The bug you hit is probably fixed, but the fix isn't backwards compatible, so you'll have to restart from scratch using either the latest tip or the 1.6 release.
As for the assembly itself, I don't think you'll get a very good one. The main determinants of assembly quality are read length and coverage; you don't have much coverage, and the reads are all very short (avg 1200). Canu/MHAP weren't really optimized for finding overlaps between such short reads, so you've got all the downsides of nanopore data (e.g. error rate) but none of the advantages (e.g. long reads). You could try using minimap as the overlapper, as it may be better at finding these short overlaps (corOverlapper=minimap after installing minimap2 and symlinking/copying the executable to the canu bin folder), but as I said, I don't think you're going to get a very good assembly.
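To put numbers on the read length and coverage point, here is a minimal Python sketch that summarizes a read set. It is not part of Canu; the function names `read_stats` and `fastq_lengths` are made up for illustration, and the FASTQ reader assumes plain, uncompressed, four-lines-per-record files.

```python
def fastq_lengths(path):
    """Yield read lengths from an uncompressed FASTQ file.

    Assumes exactly four lines per record (no wrapped sequences).
    """
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # the sequence is the second line of each record
                yield len(line.strip())


def read_stats(lengths, genome_size):
    """Return coverage, mean length, N50, and the fraction of reads under 1 kb."""
    lengths = list(lengths)
    total = sum(lengths)

    # N50: the length of the read at which the running sum of the
    # longest-first reads reaches half of all sequenced bases.
    running = 0
    n50 = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "coverage": total / genome_size,
        "mean_length": total / len(lengths),
        "n50": n50,
        "fraction_under_1kb": sum(1 for x in lengths if x < 1000) / len(lengths),
    }


if __name__ == "__main__":
    # Toy read lengths for illustration only, not real data.
    print(read_stats([500, 1500, 1000, 2000], genome_size=1000))
```

With the figures quoted in this thread (roughly 30X coverage of a roughly 30 Mb genome, reads averaging about 1200 bases), the arithmetic works out to 30 × 30,000,000 / 1200 = 750,000 reads. That is a lot of short, noisy reads to overlap against each other for comparatively little coverage.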
Thanks for the input. Yes, I'm aware the data isn't the greatest; unfortunately, getting a high proportion of high-MW DNA from filamentous fungi without also getting short fragments is difficult. I'll give minimap2 a whirl.
Inactive, and not much for us to address here. Canu isn't going to be optimized for <1kb reads.
I've been trying to run Canu on ~30X nanopore 1D reads for a ~30 Mb genome for the past month and have encountered several problems. The reads are on the shorter end, i.e. lots of the data is < 1 kb, so I've tried running this with readLength decreased to 500. On a previous run I ran out of disk space; the overlaps from cormhap are taking up nearly 2 TB. At any rate, I was first using canu v1.6, but since I was having some problems, I upgraded to the tip release at the time:
After 14 days, the cormhap step finished, but the next step errored out immediately. According to the correction log file, not enough memory was allocated to the ovs step.
However, when I try to restart with more memory, I get the following error:
I feel like perhaps there are some intermediate files that I can remove for this to restart? I don't want to compute the overlaps again if I can avoid it, considering the run time....