Nextomics / NextDenovo

Fast and accurate de novo assembler for long reads
GNU General Public License v3.0

nextdenovo correct pacbio subreads #75

Closed licheng0921 closed 3 years ago

licheng0921 commented 3 years ago

Hi, I tried to correct PacBio raw data with NextDenovo. I find that only ~20x of reads are left after correction. Is that normal? Should I set a shorter seed cutoff?

PS: the genome is ~7.4 Gb and the raw subreads total 590 Gb (~80x). The seed cutoff was calculated by seq_stat.
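For readers checking the numbers, the depth figures quoted in this post follow from dividing total bases by genome size. The sketch below is only that arithmetic; the function name is made up for illustration, and the inputs are the approximate values from the post, interpreted as gigabases.

```python
def coverage(total_bases_gb, genome_size_gb):
    """Sequencing depth = total sequenced bases / genome size (same units)."""
    return total_bases_gb / genome_size_gb


# Values quoted in the post (approximate).
GENOME_GB = 7.4
RAW_SUBREADS_GB = 590

raw_depth = coverage(RAW_SUBREADS_GB, GENOME_GB)
# Amount of corrected data implied by the reported ~20x depth.
corrected_gb = 20 * GENOME_GB

print(f"raw depth: ~{raw_depth:.0f}x")
print(f"~20x corrected reads: ~{corrected_gb:.0f} Gb")
```

So roughly a quarter of the raw depth survived correction in this run.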

moold commented 3 years ago

Yes, if you need more corrected data. Try setting correction_options = -b -p 10, remove the files workdir/02.cns_align/01.seed_cns.sh.done and work/02.cns_align/01.seed_cns.sh.work/seed_cns*/nextDenovo.sh.done, and rerun your main script. It will skip the seeds that are already corrected and correct only the filtered seeds.
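The file-removal step can be scripted. The following is a minimal sketch, not part of NextDenovo: the helper name is hypothetical, and it assumes both paths in the comment are relative to the same working directory, which you pass in yourself. It deletes only the two kinds of .done markers named above and leaves everything else in place.

```python
from pathlib import Path


def clear_done_markers(workdir):
    """Remove the .done markers named in the comment; return what was removed.

    Assumption: both marker paths live under <workdir>/02.cns_align.
    """
    cns_dir = Path(workdir) / "02.cns_align"
    # Top-level marker for the seed correction step.
    markers = [cns_dir / "01.seed_cns.sh.done"]
    # One marker per seed_cns* subtask directory.
    markers += sorted(
        (cns_dir / "01.seed_cns.sh.work").glob("seed_cns*/nextDenovo.sh.done")
    )
    removed = []
    for marker in markers:
        if marker.is_file():
            marker.unlink()
            removed.append(marker)
    return removed
```

Calling clear_done_markers with your own working directory returns the list of deleted files, so you can confirm what changed. After that, set correction_options = -b -p 10 in your config file and rerun the main script as described in the comment.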

licheng0921 commented 3 years ago

Thank you very much. I will do it as you advised.

licheng0921 commented 3 years ago

Sorry to bother you again. I checked other comments about correction_options = -b -p 10, and I am not sure whether my understanding is right. If I continue the assembly with NextDenovo, I do not need to add -b and rerun? And if I assemble the genome with another assembler, such as wtdbg2 or Flye, I should rerun the main script with correction_options = -b -p 10?

Many thanks!

moold commented 3 years ago

Yes, but you cannot use the currently released NextDenovo to do the assembly, because your genome size is over the limit.

licheng0921 commented 3 years ago

Yes, I will try correcting the subreads with NextDenovo. Thanks for reminding me.