isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

Error: Unexpected value found! Could not find qname in the input contigs file! #18

Closed by pi3rrr3 7 years ago

pi3rrr3 commented 7 years ago

Hello,

After running minimap/miniasm on ~55x of PacBio data, I ran Racon and ended up with the following error:

bin/racon -v 9 -t 16 filtered_subreads.fastq.gz 14smrt.paf 14smrt.gfa racon1_consensus.fa
[18:43:39 main] Using PAF for input alignments. (14smrt.paf)
[18:43:39 main] Loading reads.
[18:48:20 main] Hashing qnames.
[18:48:23 main] Parsing the overlaps file.
[18:48:23 main] Unique overlaps will be filtered on the fly.
[Wed, 08 Feb 17 17:48:23 +0000 ERROR] #5: Unexpected value found! Could not find qname 'm161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318' in the input contigs file! Exiting.In function: 'ParsePAF'.

[Wed, 08 Feb 17 17:48:23 +0000 ERROR] #5: Unexpected value found! Could not find qname 'm161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318' in the input contigs file! Exiting.In function: 'ParsePAF'.
Exiting.

However, if I check the .paf file:

grep "m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318" 14smrt.paf | head
m160911_135114_42266_c101051342550000001823235612291623_s1_p0/108988/1775_11679 9904    2584    9147    +   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   3012    991909  6905    255 cm:i:107
m160911_135114_42266_c101051342550000001823235612291623_s1_p0/108988/11719_13441    1722    659 1653    -   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   89610012    194 1044    255 cm:i:34
m160911_135114_42266_c101051342550000001823235612291623_s1_p0/113245/15907_27804    11897   1188    8493    -   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   3038451 308 8148    255 cm:i:39
m160911_135114_42266_c101051342550000001823235612291623_s1_p0/113245/27850_39524    11674   3858    11629   +   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   4919206 513 8715    255 cm:i:49
m160911_135114_42266_c101051342550000001823235612291623_s1_p0/19257/0_9033  9033    131 6625    +   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   4293    10082   657 6494    255 cm:i:71
m160830_160025_42266_c101051572550000001823235612291631_s1_p0/62000/5163_20500  15337   1906    11244   -   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   506 9901458 9401    255 cm:i:172
m160830_160025_42266_c101051572550000001823235612291631_s1_p0/62000/20545_37122 16577   3720    13189   +   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   196 9881581 9687    255 cm:i:192
m160830_160025_42266_c101051572550000001823235612291631_s1_p0/71105/16971_25751 8780    437 4443    -   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   215 408418  4006    255 cm:i:41
m160830_160025_42266_c101051572550000001823235612291631_s1_p0/71105/25795_42149 16354   4161    13950   +   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   279 10069   1597    9790    255 cm:i:192
m160830_160025_42266_c101051572550000001823235612291631_s1_p0/73351/0_8663  8663    50  2451    -   m161209_103737_42269_c101137592550000001823249505311710_s1_p0/108844/10223_20318    10095   59  243422  2401    255 cm:i:58

I don't get it... Here are my minimap/miniasm commands:

minimap/minimap -x ava10k -t24 filtered_subreads.fastq.gz filtered_subreads.fastq.gz | gzip -1 > 14smrt.paf.gz
miniasm/miniasm -f filtered_subreads.fastq.gz 14smrt.paf.gz > 14smrt.gfa

Thanks for your help!

isovic commented 7 years ago

Hi, it looks like you are giving Racon the pairwise read-to-read overlaps instead of read-to-contig mappings; that is why the qnames cannot be found in the contigs file. After your assembly steps:

minimap/minimap -x ava10k -t24 filtered_subreads.fastq.gz filtered_subreads.fastq.gz | gzip -1 > 14smrt.paf.gz  
miniasm/miniasm -f filtered_subreads.fastq.gz 14smrt.paf.gz > 14smrt.gfa  

map the reads with something like:

minimap/minimap 14smrt.fasta filtered_subreads.fastq.gz > mappings.paf  
bin/racon -v 9 -t 16 filtered_subreads.fastq.gz mappings.paf 14smrt.gfa racon1_consensus.fa  

Before mapping, you will need to convert the assembly .gfa to .fasta format (the 14smrt.fasta used above); you can simply use awk:

awk '$1 ~/S/ {print ">"$2"\n"$3}' 14smrt.gfa > 14smrt.fasta  

Hope this helps! Best regards, Ivan.
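
The error in this thread comes down to target names in the .paf that do not match any sequence name in the contigs file. A minimal sketch of a pre-flight check follows. The function name `paf_targets_missing_from_fasta` is hypothetical and not part of Racon or minimap. It assumes the target name is in column 6 of a tab-separated PAF file, that the contigs are in uncompressed FASTA, and that only the part of each FASTA header before the first whitespace is used as the name.

```shell
# Hypothetical helper (not part of Racon or minimap): print each target name
# from column 6 of a PAF file that has no matching header in a FASTA file.
# Usage: paf_targets_missing_from_fasta mappings.paf contigs.fasta
paf_targets_missing_from_fasta() {
    awk -F'\t' '
        # FASTA pass: remember each header name, without the leading ">"
        # and without anything after the first whitespace (assumption).
        mode == "fasta" {
            if (substr($0, 1, 1) == ">") {
                name = substr($0, 2)
                sub(/[ \t].*/, "", name)
                known[name] = 1
            }
            next
        }
        # PAF pass: report each unknown target name once.
        mode == "paf" && NF >= 6 && !($6 in known) && !($6 in reported) {
            reported[$6] = 1
            print $6
        }
    ' mode=fasta "$2" mode=paf "$1"
}
```

If the function prints nothing, every target in the .paf has a matching contig header. If it prints read names, the .paf is probably a read-to-read overlap file, or it was produced against a different contigs file than the one being passed to Racon.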

pi3rrr3 commented 7 years ago

Hello Ivan,

Thanks for the detailed answer, I got confused with the different .paf outputs... Racon now works perfectly fine!

Best, Pierre
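
Since the two kinds of .paf file are easy to confuse, a rough way to tell them apart is to count names that appear both as a query and as a target. This is only a sketch under stated assumptions: `paf_count_shared_names` is a hypothetical name, the file is assumed to be uncompressed and tab-separated with the query name in column 1 and the target name in column 6, and contigs are assumed not to share names with reads.

```shell
# Hypothetical helper: count distinct names that occur both as a query
# (column 1) and as a target (column 6) in a PAF file.
# Usage: paf_count_shared_names overlaps.paf
paf_count_shared_names() {
    awk -F'\t' '
        NF >= 6 { query[$1] = 1; target[$6] = 1 }
        END {
            shared = 0
            for (name in query) if (name in target) shared++
            print shared
        }
    ' "$1"
}
```

A read-to-read overlap file would be expected to give a large count, because the same reads serve as queries and targets. A read-to-contig mapping file would be expected to give 0 under the assumptions above.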

mortunco commented 6 years ago

Hello,

I am getting the same error but my aim is not "pair vs pair mapping". I have contigs from canu and I would like to map my raw reads to canu's contigs.

In this pipeline (https://github.com/nanoporetech/ont-assembly-polish/blob/master/analysis.mk), only the minimap step is done, but when I do the same I get the following error.

Error

Could not find qname 'some' in the input contigs file! Exiting.In function: 'ParsePAF'.
Exiting.

Here is my minimap command line:

minimap ~/kefal_genome_canu_lowcov/kefal-low.contigs.fasta ~/kefal_genome_miniasm/kefal_pacbio_bigfile_fastq/kefal_pacbio_bigfile.fastq > minimap_overlaps.paf

Am I missing the miniasm step? I ask because I don't run miniasm.

I would be more than glad if you can help me out with this problem (again).

Best regards,

Tunc.

rvaser commented 6 years ago

Hi Tunc, can you please paste your racon command here?

Best regards, Robert

mortunco commented 6 years ago

Rvaser

I had two Canu runs for a single read set: one regular and one with low-coverage options. I accidentally mixed up their paths, and I think that's why I got the error.

I fixed the paths and Racon ran fine (well, it finished without any errors :D).

I am sorry, the mistake was on my side.

Best regards, Tunc.