tingyanchang closed 6 years ago
My guess is that you have no corrected data. The parameter rawErrorRate=0.035 sets the allowed error rate for raw read overlaps to 3.5%. Since individual raw reads have an error rate over 10%, this eliminated most of the overlaps and led to no corrected reads. You also probably don't need overlapper=mhap utgReAlign=true; the fast option won't save that much time on PacBio data, it's primarily useful for nanopore.
Since you only have about 30x coverage of raw data, you should probably use the sensitive parameters 'correctedErrorRate=0.105' 'corMinCoverage=0', especially if your data is from a Sequel instrument, which has lower quality than the RSII.
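Putting those suggestions together, a revised command might look like the sketch below. It is only an illustration assembled from the options quoted in this thread: it drops rawErrorRate, overlapper and utgReAlign from the original command and adds the two sensitive parameters. It prints the command rather than running it, and whether an existing Zfinch_genome directory can be reused is not addressed here.

```shell
# Sketch only: original command minus rawErrorRate/overlapper/utgReAlign,
# plus the sensitive parameters suggested above.
# Quoting the file name means the '=' characters need no backslashes.
reads='Zfinch_merged.x=030.000.n=003384909.u.fastq'

set -- canu -p asm -d Zfinch_genome genomeSize=1223m \
       -pacbio-raw "$reads" \
       correctedErrorRate=0.105 corMinCoverage=0

cmd_line="$*"
echo "$cmd_line"

# To actually launch the assembly, uncomment the next line:
# "$@"
```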
The *.report file will have histograms of the raw input and corrected read lengths (and gobs more stuff). This will confirm that few reads were corrected. Also, from the output you pasted:
-- In 'asm.gkpStore', found PacBio reads:
-- Raw: 2352917
-- Corrected: 93
-- Trimmed: 0
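To make the scale of the problem concrete, the counts above can be turned into a fraction. The snippet below is a small sketch that feeds the four pasted lines to awk; it assumes the log keeps this exact "-- Label: count" layout, which may differ between canu versions.

```shell
# Sketch: compute what fraction of raw reads survived correction,
# using the log lines pasted above as input.
summary=$(awk '
  $2 == "Raw:"       { raw = $3 }   # number of raw reads
  $2 == "Corrected:" { cor = $3 }   # number of corrected reads
  END {
    if (raw > 0)
      printf "corrected %d of %d raw reads (%.4f%%)\n", cor, raw, 100 * cor / raw
  }
' <<'EOF'
-- In 'asm.gkpStore', found PacBio reads:
-- Raw: 2352917
-- Corrected: 93
-- Trimmed: 0
EOF
)
echo "$summary"
```

With only 93 corrected reads out of more than two million, the fraction is a tiny part of one percent, which is consistent with the overlap error rate being set too low.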
I encourage posting results of your 1Gbp assembly. We rarely hear of success stories here.
Hello,
I ran the E. coli data from the tutorials and it succeeded. But when I tested assembling a bird genome at 30x depth with canu v1.7, it took a long time and failed. I have a server with 256G of memory and about 15T of hard drive space.
Another question: I have to assemble a genome of about 1G. The genomic DNA was sequenced on PacBio, and the depth is 40x. Are there any suggestions for how to set the canu parameters?
Here is my canu command
canu -p asm -d Zfinch_genome genomeSize=1223m -pacbio-raw Zfinch_merged.x\=030.000.n\=003384909.u.fastq rawErrorRate=0.035 overlapper=mhap utgReAlign=true
my output
this is precompute.000001.out content
Thanks for your help