Nextomics / NextDenovo

Fast and accurate de novo assembler for long reads
GNU General Public License v3.0

Question about error rate for correction #196

Open mylena-s opened 10 months ago

mylena-s commented 10 months ago

Hello! Thank you for developing NextDenovo; it is working perfectly for me. I am writing to ask a question about the run parameters.

I am working with ONT reads, mostly obtained using the V14 chemistry on R10.4.1 flow cells. The median read quality is 20, and there is a high proportion of ultra-long reads longer than 100 kbp, up to 800 kbp.

I also have other datasets obtained with earlier ONT flow cells, which have a higher error rate. When I compare the assemblies obtained with the two technologies (at similar coverage), I find higher N50s for the older flow cells (12 Mb vs. 6 Mb).

I know this could be coincidental, but I was wondering whether there is any parameter I could modify to account for the lower error rate or the longer reads, and thereby improve the assembly.

Thank you in advance!

Mylena

moold commented 10 months ago

You could try increasing the values of -k and -w in minimap2_options_raw and minimap2_options_cns. I'm not sure whether that will improve the assembly, but it will definitely affect the running time.
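
A minimal sketch of how those two option strings could be updated, assuming they are plain minimap2 flag strings in the run config. The helper name `set_minimap2_flags`, the starting values, and the k/w values below are illustrative assumptions, not recommended settings.

```python
import shlex


def set_minimap2_flags(options: str, k: int, w: int) -> str:
    """Return `options` with -k and -w set, replacing any existing values.

    Hypothetical helper: only handles the space-separated form ("-k 15"),
    not the attached form ("-k15").
    """
    kept = []
    skip_next = False
    for tok in shlex.split(options):
        if skip_next:
            # This token is the value of a -k/-w flag we just dropped.
            skip_next = False
            continue
        if tok in ("-k", "-w"):
            skip_next = True
            continue
        kept.append(tok)
    kept += ["-k", str(k), "-w", str(w)]
    return " ".join(kept)


# Assumed starting option strings and assumed k/w values, for illustration only.
raw_opts = set_minimap2_flags("-t 8", k=19, w=10)
cns_opts = set_minimap2_flags("-t 8 -k 15 -w 5", k=19, w=10)

print("minimap2_options_raw =", raw_opts)
print("minimap2_options_cns =", cns_opts)
```

The printed lines can be pasted into the config file in place of the existing entries. In minimap2, -k is the k-mer size and -w is the minimizer window size, so it is worth changing one at a time and comparing assemblies.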