Closed HSA191109 closed 4 years ago
Is all this data for an 8kb product? In that case you have extremely high coverage. Take a look at the readSamplingCoverage and readSamplingBias options to downsample your data. You probably only want 100x at most.
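To make the "extremely high coverage" point concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of Canu. The function names and the 2 Gbp run yield are hypothetical, chosen only for illustration; the 8 kb target and the 100x ceiling come from the comment above.

```python
def estimated_coverage(total_bases: int, target_size: int) -> float:
    """Approximate fold coverage: total sequenced bases divided by target length."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return total_bases / target_size


def fraction_to_keep(total_bases: int, target_size: int,
                     wanted_coverage: float = 100.0) -> float:
    """Fraction of the data that would give roughly `wanted_coverage`.

    Returns 1.0 when the data set is already at or below the wanted coverage.
    """
    coverage = estimated_coverage(total_bases, target_size)
    if coverage == 0:
        return 1.0
    return min(1.0, wanted_coverage / coverage)


if __name__ == "__main__":
    total_bases = 2_000_000_000  # hypothetical run yield, NOT from this thread
    amplicon_size = 8_000        # the 8 kb PCR product discussed here

    print("estimated coverage:", estimated_coverage(total_bases, amplicon_size))
    print("fraction to keep for 100x:", fraction_to_keep(total_bases, amplicon_size))
```

With numbers like these, only a tiny fraction of the reads is needed, which is why limiting the input with readSamplingCoverage makes the run so much lighter.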
Yes, all the data is for an 8kb product. It is a first test run with this type of sample before trying to multiplex such samples. Yesterday, I repeated the -correct step with readSamplingCoverage=100 readSamplingBias=1.5 purgeOverlaps=aggressive. It finished very fast without running out of memory, and -trim-assemble is currently in progress with the same parameters. I was not sure whether to keep genomeSize=4m or switch to genomeSize=8k, so I stayed with genomeSize=4m for the first trial. I hope it works. Thank you very much for the advice.
@HSA191109 I would think your genomeSize should be 8k since that is the size of your amplicon.
Genome size really only affects the amount of corrected reads generated. Canu will generate 40 * genomeSize bases of corrected reads.
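As a quick illustration of that arithmetic, the following Python sketch compares the two genomeSize values discussed in this thread. The helper names are hypothetical and this is not Canu code. The factor of 40 is taken from the comment above, and the k/m/g suffix handling is an assumption about how such size strings are written.

```python
# Multipliers for size suffixes as used in values like "8k" or "4m" (assumed).
SUFFIXES = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}


def parse_genome_size(text: str) -> int:
    """Convert a size string such as '8k', '4m' or '4800000' to a base count."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(round(float(text[:-1]) * SUFFIXES[text[-1]]))
    return int(text)


def corrected_read_bases(genome_size: str, out_coverage: int = 40) -> int:
    """Bases of corrected reads targeted: out_coverage * genomeSize."""
    return out_coverage * parse_genome_size(genome_size)


if __name__ == "__main__":
    amplicon = parse_genome_size("8k")
    for setting in ("4m", "8k"):
        bases = corrected_read_bases(setting)
        # Express the target as fold coverage of the actual 8 kb product.
        print(f"genomeSize={setting}: {bases} bases, "
              f"about {bases / amplicon:.0f}x of the 8 kb product")
```

Under these assumptions, genomeSize=4m asks for 160 Mbp of corrected reads, which is about 20,000x of an 8 kb product, while genomeSize=8k asks for 320 kbp, i.e. 40x.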
Hi,

these are our first experiments with Nanopore sequencing and we are just practicing with different types of samples from our department. I tried to assemble a PCR-generated 8 kb fragment (prepared with the LSK-109-Kit). According to FastQC the data contain over-represented sequences.

Canu-v1.9 ran out of disk space in the correction step with the default canu --correct command, so I added purgeOverlaps:=aggressive and got the same error message. Same outcome with the parameter set given at https://canu.readthedocs.io/en/latest/faq.html#my-assembly-is-running-out-of-space-is-too-slow combined with purgeOverlaps:=aggressive. Same outcome with restriction to maxMemory 200 GB and maxThreads 50 combined with purgeOverlaps:=aggressive.

Is there any special parameter set for assembling a short sequence with Canu or should I use another program in this case? Similar to the ONT Lambda-experiment I set the "genome size" to 4m.

Thank you very much!

Best regards
Katta