isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

racon #44

Closed mortunco closed 6 years ago

mortunco commented 6 years ago

Hi, I am trying to manually implement this pipeline: https://github.com/nanoporetech/ont-assembly-polish/blob/master/analysis.mk

My steps are: 1) assemble my PacBio reads with canu, 2) map those PacBio reads to the canu contigs with minimap, 3) polish the contigs with racon.

In the third step, racon fails when I give it the PAF file generated by minimap. I don't see why, because when I compare this file with a PAF file that worked fine with racon, the format is the same.

This is the racon command line that I use:

racon -t 16 -v 9 ~/kefal_genome_miniasm/kefal_pacbio_bigfile_fastq  ~/asemmbly_polish_with_illumina/manuelRun/minimap_1/minimap_overlaps.paf ./kefal.contigs.fa racon_correct_reads.fasta

This is the error that I get, followed by the head and tail of the PAF file:

tmorova@lisa:~/asemmbly_polish_with_illumina/manuelRun/racon_1$ sh step2.sh
[17:51:47 main] Using PAF for input alignments. (/home/tmorova/asemmbly_polish_with_illumina/manuelRun/minimap_1/minimap_overlaps.paf)
[17:51:47 main] Loading reads.
[Mon, 18 Dec 17 14:51:47 +0000 FATAL] #5: Unexpected value found! Input sequence file format unknown!
 In function: 'LoadSeqs_'.
tmorova@lisa:~/asemmbly_polish_with_illumina/manuelRun/minimap_1$ head minimap_overlaps.paf
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/31178_33504        2326    1281    2295    +       tig00019057     21572   9601    10513   162     1014    255     cm:i:17
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/33550_41136        7586    22      7432    -       tig00019057     21572   3075    10519   1895    7444    255     cm:i:209
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/33550_41136        7586    3817    5093    -       tig00014190     65533   24660   25896   301     1276    255     cm:i:31
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/33550_41136        7586    3817    5093    +       tig00021301     57956   29305   30561   295     1276    255     cm:i:27
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/33550_41136        7586    3965    5101    +       tig00004520     176768  89077   90177   257     1136    255     cm:i:26
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/33550_41136        7586    3817    5101    +       tig00058750     42190   33242   34469   255     1284    255     cm:i:26
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/33550_41136        7586    3817    5101    -       tig00563576     282163  35986   37227   247     1284    255     cm:i:25
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/33550_41136        7586    3817    5101    +       tig00034468     19268   9739    10987   245     1284    255     cm:i:25
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/33550_41136        7586    3817    5101    +       tig00007629     87413   52356   53618   262     1284    255     cm:i:25
m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/33550_41136        7586    3817    5093    +       tig00019376     45335   34915   36486   234     1571    255     cm:i:24
tmorova@lisa:~/asemmbly_polish_with_illumina/manuelRun/minimap_1$ tail minimap_overlaps.paf
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   10182   10278   +       tig00000810     192408  22289   22384   51      96      255     cm:i:4
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   1782    1832    -       tig00061924     82566   39570   39619   42      50      255     cm:i:4
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   1351    2207    -       tig00020221     54784   13520   14369   41      856     255     cm:i:4
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   9615    9812    -       tig00562182     10189   9654    9855    47      201     255     cm:i:4
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   10182   10278   +       tig00000658     318007  251916  252011  51      96      255     cm:i:4
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   4601    5789    -       tig00004438     118089  9897    10939   46      1188    255     cm:i:4
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   10182   10278   -       tig00004224     72862   41597   41692   51      96      255     cm:i:4
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   10182   10278   -       tig00058734     74269   56652   56746   51      96      255     cm:i:4
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   10182   10278   +       tig00000296     80234   20586   20681   51      96      255     cm:i:4
m160720_100353_42163R_c101033082550000001823255011171637_s1_p0/108987/0_12784   12784   10182   10278   -       tig00560706     30583   7472    7567    51      96      255     cm:i:4
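One way to confirm that a PAF file like the one above is well formed is to parse its 12 mandatory columns (query name, length, start, end, strand, target name, length, start, end, matching bases, alignment block length, mapping quality). The following is a minimal Python sketch, not part of racon or minimap; the names `PafRecord` and `parse_paf_line` are hypothetical, and splitting on any whitespace is an assumption made so that the pasted lines (where tabs may have become spaces) still parse.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    """The 12 mandatory PAF columns (hypothetical helper type)."""
    query_name: str
    query_length: int
    query_start: int
    query_end: int
    strand: str
    target_name: str
    target_length: int
    target_start: int
    target_end: int
    num_matches: int
    block_length: int
    mapq: int


def parse_paf_line(line: str) -> PafRecord:
    """Parse one PAF line; optional tags such as cm:i:17 are ignored."""
    # Assumption: names contain no whitespace, so splitting on any
    # whitespace works for both tab- and space-separated copies.
    fields = line.split()
    if len(fields) < 12:
        raise ValueError(f"expected at least 12 columns, got {len(fields)}")
    if fields[4] not in ("+", "-"):
        raise ValueError(f"invalid strand: {fields[4]!r}")
    return PafRecord(
        fields[0], int(fields[1]), int(fields[2]), int(fields[3]),
        fields[4],
        fields[5], int(fields[6]), int(fields[7]), int(fields[8]),
        int(fields[9]), int(fields[10]), int(fields[11]),
    )


# First line of the `head` output above.
sample = (
    "m161115_082159_42163R_c101134722550000001823248606131762_s1_p0/54494/31178_33504"
    "\t2326\t1281\t2295\t+\ttig00019057\t21572\t9601\t10513\t162\t1014\t255\tcm:i:17"
)
record = parse_paf_line(sample)
print(record.target_name, record.query_length, record.mapq)
```

If every line parses this way, the overlaps file itself is unlikely to be the cause of a "file format unknown" error.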

I suspect that the way I ran canu could be the problem. Instead of giving canu a single big fastq file, I gave it the directory with *.fastq, so it combined the files automatically. Could this be the reason for the difference? Should I run canu with a single big fastq file as input?

I would be more than happy if you could help me with this problem.

Best regards,

tunc.

rvaser commented 6 years ago

Hello Tunc, the extension of your reads file is invalid (the parser in racon gets the format from the extension). Please change it from kefal_pacbio_bigfile_fastq to kefal_pacbio_bigfile.fastq.
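Because the format is taken from the file extension, a quick pre-flight check on the input paths can catch this before racon is launched. The sketch below is only an illustration in Python and is not racon's actual parser; `KNOWN_EXTENSIONS` and `guess_format` are hypothetical, and the list of extensions is an assumption, so the set racon really accepts should be checked against its documentation.

```python
from pathlib import Path
from typing import Optional

# Assumed mapping for illustration only; not taken from racon's source.
KNOWN_EXTENSIONS = {
    ".fasta": "fasta",
    ".fa": "fasta",
    ".fastq": "fastq",
    ".fq": "fastq",
}


def guess_format(path: str) -> Optional[str]:
    """Return the format implied by the file extension, or None if unknown."""
    return KNOWN_EXTENSIONS.get(Path(path).suffix.lower())


for name in ("kefal_pacbio_bigfile_fastq", "kefal_pacbio_bigfile.fastq"):
    fmt = guess_format(name)
    status = fmt if fmt is not None else "unknown format (no recognised extension)"
    print(f"{name}: {status}")
```

The first name has no dot, so its suffix is empty and the lookup fails, which mirrors the "Input sequence file format unknown!" error; the renamed file resolves to fastq.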

Best regards, Robert

mortunco commented 6 years ago

I am very sorry about my stupid mistake. It was a silly file naming error.

Thank you very much for your help, though. I almost lost my mind.

Best regards,

Tunc.

rvaser commented 6 years ago

Haha, happens :)

Best regards, Robert