isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: this is the original repository and is no longer officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

[bioparser::FastqParser] error: too small chunk size! #157

Closed: cbirbes closed this issue 3 years ago

cbirbes commented 4 years ago

Hi, I'm using Racon to polish a bovine genome, and I'm currently trying to find the "optimal" set of long reads to get the best time/quality ratio for polishing. To do this, I reduced my initial dataset by removing reads <10 kb, and I encountered the following error: [bioparser::FastqParser] error: too small chunk size!

I tried to find where the problem could come from, but nothing I found was really useful.

Polishing with the initial dataset (including reads <10 kb) worked well. Does Racon need short reads to work?

rvaser commented 4 years ago

Hello, Racon does not care about the read length. The error you encountered is in the parser. Can you tell me which version of Racon you are using?

Best regards, Robert

cbirbes commented 4 years ago

Thanks for your answer, I'm using version 1.3.1.

rvaser commented 4 years ago

Can you please update to the latest version? This should fix your problem.

cbirbes commented 4 years ago

I tried with version 1.4.3 and still got the same problem. My commands are:

module load bioinfo/racon-v1.4.3
racon -t 32 longreads.rename.fq.gz map.polisher.sam offspring_raw.cgt.fa > assembly.racon1.fa

And the error output:

[racon::Polisher::initialize] loaded target sequences 15.813169 s
terminate called after throwing an instance of 'std::invalid_argument'
  what():  [bioparser::FastqParser] error: too small chunk size!
/work2/project/seqoccin/assemblies/polishing_1/bos_taurus/nfPolishing_trio1_mother_37165_Filtered_reads/work/75/2164e959c26c27ebfc3f0d01f1b58f/.command.sh: line 3: 80759 Aborted

rvaser commented 4 years ago

The newest version (1.4.10) is at https://github.com/lbcb-sci/racon. Version 1.4.3 has the old parser with the bug :/

cbirbes commented 4 years ago

I tried it with version 1.4.10, but a new problem appeared :/

[racon::Polisher::initialize] loaded target sequences 13.671844 s
/work2/project/seqoccin/assemblies/polishing_1/bos_taurus/nfPolishing_trio1_mother_37165_Filtered_reads/work/c2/71cfaf5aebce87804d3c7850af1e02/.command.sh: line 3: 93641 Segmentation fault  racon -t 32 longreads.rename.fq.gz map.polisher.sam offspring_raw.cgt.fa > assembly.racon1.fa

This command worked well on other datasets with the old version; do you know where the problem may come from?

rvaser commented 4 years ago

Could be anything. How did you obtain map.polisher.sam? How large are your files? Did you modify your read file? The error you got before might indicate that one of your reads is too large to process.
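(Aside: one quick way to test the "read too large" hypothesis is to scan the gzipped FASTQ for its longest record. The sketch below is only an illustration, not part of Racon or the poster's pipeline; the file name is the one used in this thread, and it assumes well-formed 4-line FASTQ records.)

```python
import gzip

# Illustrative check (not part of Racon): report the longest read in a
# gzipped FASTQ file, assuming well-formed 4-line records.
max_len, max_name = 0, None
with gzip.open("longreads.rename.fq.gz", "rt") as handle:
    while True:
        header = handle.readline()
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        if len(seq) > max_len:
            max_len, max_name = len(seq), header[1:].split()[0]

print(f"longest read: {max_name} ({max_len} bp)")
```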

cbirbes commented 4 years ago

Reads: ONT reads (36 GB). map.polisher.sam (187 GB) was produced with minimap2:

minimap2 -a -t 16 -x map-ont offspring_raw.cgt.fa longreads.rename.fq.gz > map.polisher.sam

No modifications were made to the reads.

rvaser commented 4 years ago

Please run zcat longreads.rename.fq.gz | head -n 12 and check whether each read has 4 lines or more.
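(Aside: a small Python sketch like the one below makes the same check programmatically; it assumes the standard 4-line FASTQ layout of an @ header, the sequence, a + separator, and a quality string of the same length. The file name and record count mirror the command above; the script is illustrative only.)

```python
import gzip
import itertools

# Inspect the first few records, mirroring `zcat ... | head -n 12`:
# each FASTQ record must be 4 lines: @name, sequence, '+', quality.
with gzip.open("longreads.rename.fq.gz", "rt") as handle:
    lines = list(itertools.islice(handle, 12))

for i in range(0, len(lines), 4):
    record = [line.rstrip("\n") for line in lines[i:i + 4]]
    ok = (
        len(record) == 4
        and record[0].startswith("@")
        and record[2].startswith("+")
        and len(record[1]) == len(record[3])
    )
    print(record[0][:40], "OK" if ok else "MALFORMED")
```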

cbirbes commented 4 years ago

Ohhhh, you're right, it's my fault. I wrote a Python script to filter my FASTQ reads and forgot to print the output as a FASTQ file, so longreads.rename.fq.gz ended up in a broken FASTA-like format. Thanks for your help! I'll come back to you if I have a new problem after checking all of my files!
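(Aside: for anyone hitting the same problem, a minimal sketch of a length filter that keeps the 4-line FASTQ layout intact could look like the following. The 10 kb threshold and the output file name come from this thread, the input file name is hypothetical, and the script is only an illustration rather than the poster's original code.)

```python
import gzip

MIN_LEN = 10_000  # keep reads of at least 10 kb, as described above

# Illustrative sketch: read 4-line FASTQ records and write the kept ones
# back unchanged, so the output stays valid FASTQ rather than turning
# into a broken FASTA-like file. The input file name is hypothetical.
with gzip.open("longreads.fq.gz", "rt") as fin, \
     gzip.open("longreads.rename.fq.gz", "wt") as fout:
    while True:
        record = [fin.readline() for _ in range(4)]
        if not record[0]:
            break
        if len(record[1].rstrip("\n")) >= MIN_LEN:
            fout.writelines(record)  # header, sequence, '+', quality
```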

rvaser commented 4 years ago

No problem :)