isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

ERROR: Reads are not specified in a format which contains quality information. - only for SAM input. #43

Closed bsipos closed 6 years ago

bsipos commented 6 years ago

When I run racon as racon --sam -t 40 --use-contig-qv --bq 5 on this dataset I get the error:

[15:41:00 main] Using SAM for input alignments. (...)
[15:41:00 main] Parsing the SAM file.
ERROR: Reads are not specified in a format which contains quality information. Exiting.

If I use PAF input then racon runs successfully.

rvaser commented 6 years ago

Hello, the error indicates that some of the sequences in the SAM file do not have quality values. Is something missing from the SAM file you posted on gist (it ends abruptly)?

Best regards, Robert

bsipos commented 6 years ago

Hi,

The SAM file is not truncated; it converts to BAM cleanly. It does, however, contain secondary and supplementary alignments. I have noticed that minimap2 sometimes omits the sequence field for those. Maybe that is the issue?

Best, Botond
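
The hypothesis above can be checked directly on the SAM file. What follows is a minimal sketch, not part of racon or minimap2; the names `classify` and `count_missing` are hypothetical. It assumes uncompressed SAM text and relies only on the SAM specification: FLAG is column 2, SEQ and QUAL are columns 10 and 11, `*` marks a value that is not stored, and flag bits 0x100 and 0x800 mark secondary and supplementary alignments.

```python
from collections import Counter

SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def classify(flag):
    """Return the alignment type encoded in a SAM FLAG value."""
    if flag & SECONDARY:
        return "secondary"
    if flag & SUPPLEMENTARY:
        return "supplementary"
    return "primary"


def count_missing(lines):
    """Tally records by (alignment type, SEQ or QUAL missing)."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):
            continue  # header line, not an alignment record
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # malformed record: fewer than the 11 mandatory columns
        kind = classify(int(fields[1]))
        missing = fields[9] == "*" or fields[10] == "*"
        counts[(kind, missing)] += 1
    return counts
```

Calling `count_missing(open("alignments.sam"))` (the file name is a placeholder) returns a counter keyed by pairs such as `("secondary", True)`. Any nonzero count with `True` in the second position points to records that would trigger the error described in this thread.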

isovic commented 6 years ago

Hi Botond,

> I have noticed that minimap2 sometimes omits the sequence field for those.

So in those cases it generates neither the seq nor the qual field? It does make sense to generate such output to save space, since all the information should be in the primary alignments. Unfortunately, we did not anticipate that, and it is the likely cause of the issue you are having: Racon currently expects each SAM entry to have both of those fields defined (each SAM alignment is treated as a separate object). We will address this in the next release or soon, but for now I would recommend filtering out those secondary alignments which do not have the seq and qual fields.

Best regards, Ivan.
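
The filtering workaround suggested above could look roughly like the following. This is a sketch under assumptions rather than a tool shipped with racon; `has_seq_and_qual` and `filter_sam` are hypothetical names. It keeps header lines and drops every alignment record whose SEQ (column 10) or QUAL (column 11) is `*`, which is how the SAM format marks a field that is not stored.

```python
def has_seq_and_qual(line):
    """True if a SAM alignment record stores both SEQ and QUAL."""
    fields = line.rstrip("\n").split("\t")
    # Columns 10 and 11 (indices 9 and 10) hold SEQ and QUAL; '*' means absent.
    return len(fields) >= 11 and fields[9] != "*" and fields[10] != "*"


def filter_sam(lines):
    """Yield header lines and only those records that have SEQ and QUAL."""
    for line in lines:
        if line.startswith("@") or has_seq_and_qual(line):
            yield line
```

To use it as a stream filter, one option is to pass `sys.stdin` to `filter_sam` and write the result with `sys.stdout.writelines`. The check tests the fields themselves instead of the secondary or supplementary flag bits, so a non-primary record that does carry its sequence and qualities is kept.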

bsipos commented 6 years ago

Thanks Ivan!

Yes, it seems that minimap2 does omit the seq and qual fields for some non-primary alignments, which explains the error. I will either filter those out or stick to the PAF input until the issue is fixed.

Cheers, Botond

isovic commented 6 years ago

Np, I'll leave this open to keep track.

Ivan

rvaser commented 6 years ago

Racon 1.0.0 now supports minimap2 SAM output.