marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Canu failing at unitigging step (segmentation fault) - "failed to read estimated mer threshold" #795

Closed kushalsuryamohan closed 6 years ago

kushalsuryamohan commented 6 years ago

Hi, I've spent quite some time trying to find the right parameters to run Canu on LSF. I thought I finally had all the parameters (such as mhapThreads, corMemory, etc.) set so that Canu would run (I had a lot of Java heap space issues that I figured out by poring over these threads). It finally got past the correction and trimming stages, but now I get a segmentation fault (see below).

This is my canu command:

/gne/research/data/dnaseq/analysis/aplle/software/canu-1.5/Linux-amd64/bin/canu \
  mhapThreads=8 mhapMemory=15 stopOnReadQuality=0 genomeSize=1.3g \
  correctedErrorRate=0.105 minReadLength=1500 minOverlapLength=500 \
  -pacbio-raw Pacbio_filtered_subreads_021918.fastq \
  -nanopore-raw R4323.fastq \
  -nanopore-raw R4324.fastq \
  -nanopore-raw /gnet/is3/research/data/dnaseq/processed_runs/R4350/R4350.fastq \
  -nanopore-raw R4360.fastq \
  -nanopore-raw R4361.fastq \
  -nanopore-raw R4362.fastq \
  -nanopore-raw R4363.fastq \
  -nanopore-raw R4433.fastq \
  -nanopore-raw R4434.fastq \
  -nanopore-raw R4435.fastq \
  -nanopore-raw R4436.fastq \
  -nanopore-raw R4453.fastq \
  -nanopore-raw R4454.fastq \
  -nanopore-raw R4455.fastq \
  -nanopore-raw R4456.fastq \
  -nanopore-raw 4472.fastq \
  -nanopore-raw R4473.fastq \
  -nanopore-raw R4474.fastq

This is on Canu release v1.5

This is the output of canu.out

-- Canu release v1.5
-- Detected Java(TM) Runtime Environment '1.8.0_101' (from '/gne/research/apps/java/jdk1.8.0_101/bin/java').
-- Detected gnuplot version '4.2 patchlevel 6 ' (from 'gnuplot') and image format 'png'.
-- Detected 32 CPUs and 126 gigabytes of memory.
-- Detected LSF with 'bsub' binary in /opt/lsf/9.1/linux2.6-glibc2.3-x86_64/bin/bsub.
--
-- Found 162 hosts with 16 cores and 128 GB memory under LSF control.
-- Found 2 hosts with 4 cores and 20 GB memory under LSF control.
-- Found 3 hosts with 80 cores and 1006 GB memory under LSF control.
-- Found 4 hosts with 20 cores and 128 GB memory under LSF control.
-- Found 88 hosts with 20 cores and 256 GB memory under LSF control.
-- Found 8 hosts with 28 cores and 128 GB memory under LSF control.
-- Found 64 hosts with 16 cores and 256 GB memory under LSF control.
--
-- On LSF detected memory is requested in GB
--

-- Run under grid control using 128 GB and 16 CPUs for stage 'meryl'.
-- Run under grid control using 15 GB and 8 CPUs for stage 'mhap (cor)'.
-- Run under grid control using 12 GB and 4 CPUs for stage 'overlapper (obt)'.
-- Run under grid control using 12 GB and 4 CPUs for stage 'overlapper (utg)'.
-- Run under grid control using 18 GB and 4 CPUs for stage 'falcon_sense'.
-- Run under grid control using 4 GB and 1 CPU for stage 'ovStore bucketizer'.
-- Run under grid control using 32 GB and 1 CPU for stage 'ovStore sorting'.
-- Run under grid control using 8 GB and 4 CPUs for stage 'read error detection'.
-- Run under grid control using 2 GB and 1 CPU for stage 'overlap error adjustment'.
-- Run under grid control using 32 GB and 4 CPUs for stage 'bogart'.
-- Run under grid control using 8 GB and 4 CPUs for stage 'GFA alignment and processing'.
-- Run under grid control using 64 GB and 8 CPUs for stage 'consensus'.

-- Generating assembly 'pacbio_ont_canu_usegrid_mhapthreads8' in '/gnet/is3/research/data/dnaseq/analysis/suryamok/pacbio_ont_canu_usegrid_mhapthreads8'

-- Parameters:

-- genomeSize 1300000000

-- Overlap Generation Limits:
--   corOvlErrorRate 0.3200 ( 32.00%)
--   obtOvlErrorRate 0.1050 ( 10.50%)
--   utgOvlErrorRate 0.1050 ( 10.50%)

-- Overlap Processing Limits:
--   corErrorRate 0.5000 ( 50.00%)
--   obtErrorRate 0.1050 ( 10.50%)
--   utgErrorRate 0.1050 ( 10.50%)
--   cnsErrorRate 0.1050 ( 10.50%)

--
-- BEGIN ASSEMBLY

-- Meryl finished successfully.
-- Finished stage 'merylCheck', reset canuIteration.
Use of uninitialized value $error[0] in join or string at /gnet/is2/p01/home/suryamok/.linuxbrew/Cellar/perl/5.26.1/lib/perl5/5.26.1/Carp.pm line 460.

Please panic. Canu failed, and it shouldn't have.

Stack trace:

at /gnet/is3/research/data/dnaseq/analysis/aplle/software/canu-1.5/Linux-amd64/bin/lib/canu/Meryl.pm line 576.
    canu::Meryl::merylProcess("pacbio_ont_canu_usegrid_mhapthreads8", "utg") called at /gnet/is3/research/data/dnaseq/analysis/aplle/software/canu-1.5/Linux-amd64/bin/canu line 601

Canu release v1.5 failed with: failed to read estimated mer threshold from 'unitigging/0-mercounts/pacbio_ont_canu_usegrid_mhapthreads8.ms22.estMerThresh.out'


Sender: LSF System lsfadmin@rescomp1057
Subject: Job 431013: in cluster Exited

Job was submitted from host by user in cluster .
Job was executed on host(s) , in queue , as user in cluster .
</gne/home/suryamok> was used as the home directory.
</gnet/is3/research/data/dnaseq/analysis/suryamok/pacbio_ont_canu_usegrid_mhapthreads8> was used as the working directory.
Started at Mon Feb 26 10:11:27 2018
Results reported on Mon Feb 26 10:11:33 2018

Your job looked like:


LSBATCH: User input

canu-scripts/canu.12.sh

Exited with exit code 1.

Resource usage summary:

CPU time :               0.91 sec.
Total Requested Memory : 5.00 GB
Delta Memory :           -
(Delta: the difference between Total Requested Memory and Max Memory.)

The output (if any) is above this job summary.

This is the output of pacbio_ont_canu_usegrid_mhapthreads8.ms22.estMerThresh.err

Failed with 'Segmentation fault'; backtrace (libbacktrace):
AS_UTL/AS_UTL_stackTrace.C::102 in _Z17AS_UTL_catchCrashiP7siginfoPv()
(null)::0 in (null)()
(null)::0 in (null)()
meryl/estimate-mer-threshold.C::113 in _Z13loadHistogramP8_IO_FILERmS1_S1_RjRPj()
meryl/estimate-mer-threshold.C::199 in main()
(null)::0 in (null)()
(null)::0 in (null)()
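For anyone hitting the same failure, the two files named above can be checked together with a few lines of Python. This is only a rough sketch under stated assumptions: the function name `inspect_mer_threshold` is hypothetical, the file names follow the `<prefix>.estMerThresh.out` / `.err` pattern seen in the error messages in this post, and the idea that a successful run leaves a single integer in the `.out` file is an assumption, not something confirmed from the Canu source.

```python
from pathlib import Path


def inspect_mer_threshold(mercounts_dir: str, prefix: str) -> dict:
    """Summarize the estMerThresh .out/.err pair for one meryl run.

    Hypothetical helper: the file naming is taken from the error messages
    in this thread; the .out format (one integer) is an assumption.
    """
    base = Path(mercounts_dir)
    out_file = base / f"{prefix}.estMerThresh.out"
    err_file = base / f"{prefix}.estMerThresh.err"

    threshold = None
    if out_file.is_file():
        text = out_file.read_text().strip()
        # Assumption: a successful run writes a single integer threshold.
        if text.isdigit():
            threshold = int(text)

    err_text = err_file.read_text() if err_file.is_file() else ""
    return {
        "threshold": threshold,                       # None if missing or unreadable
        "crashed": "Segmentation fault" in err_text,  # crash marker seen in the .err file
        "err_tail": err_text.splitlines()[-10:],      # last lines of the backtrace
    }


if __name__ == "__main__":
    report = inspect_mer_threshold(
        "unitigging/0-mercounts",
        "pacbio_ont_canu_usegrid_mhapthreads8.ms22",
    )
    print(report)
```

A `threshold` of `None` together with `crashed` set to `True` would match the situation described here: the estimator died before writing a value, so Canu had nothing to read.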

By looking at the two previous issues related to this error, I can tell that this isn't due to low coverage (I have ~65 Gbp of data for a ~1.3 Gbp genome) or to the Canu version (the previously reported issue was resolved by using 1.5 instead of 1.4).
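The coverage figures can be checked with simple arithmetic. The sketch below is illustrative only: the function name `coverage` is hypothetical, the base counts are the ones printed in the report in this post, and truncating to two decimals is an assumption chosen because it reproduces the printed values.

```python
def coverage(total_bases: int, genome_size: int) -> float:
    """Fold coverage, truncated (not rounded) to two decimals.

    Truncation is an assumption; it matches the values in the report.
    """
    return (total_bases * 100 // genome_size) / 100


GENOME_SIZE = 1_300_000_000  # genomeSize=1.3g from the canu command

# Base counts as printed in the gatekeeper store summaries of the report.
STAGES = [
    ("correction", 57_100_801_696),
    ("trimming", 48_788_219_877),
    ("unitigging", 46_455_873_331),
]

if __name__ == "__main__":
    # Raw data volume quoted in the post: ~65 Gbp for a ~1.3 Gbp genome.
    print(f"raw (approx.): {coverage(65_000_000_000, GENOME_SIZE):.2f}x")
    for stage, bases in STAGES:
        print(f"{stage}: {coverage(bases, GENOME_SIZE):.2f}x")
```

Roughly 50x of raw data and about 36x still present at the unitigging stage supports the point that low coverage is not the cause.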

Here's the report as well, showing sequencing coverage and correction outputs:

[CORRECTION/READS]

-- In gatekeeper store 'correction/pacbio_ont_canu_usegrid_mhapthreads8.gkpStore':
--   Found 8702274 reads.
--   Found 57100801696 bases (43.92 times coverage).

-- Read length histogram (one '*' equals 65881.34 reads):
--        0   4999 3018884 *****
--     5000   9999 4611694 **
--    10000  14999  893282 *
--    15000  19999  116035 *
--    20000  24999   36087
--    25000  29999   14294
--    30000  34999    6287
--    35000  39999    2973
--    40000  44999    1393
--    45000  49999     650
--    50000  54999     334
--    55000  59999     183
--    60000  64999      81
--    65000  69999      35
--    70000  74999      29
--    75000  79999      12
--    80000  84999      10
--    85000  89999       5
--    90000  94999       1
--    95000  99999       0
--   100000 104999       0
--   105000 109999       0
--   110000 114999       1
--   115000 119999       0
--   120000 124999       2
--   125000 129999       0
--   130000 134999       0
--   135000 139999       0
--   140000 144999       0
--   145000 149999       0
--   150000 154999       0
--   155000 159999       1
--   160000 164999       1

[CORRECTION/MERS]

-- 16-mers                                Fraction
--   Occurrences      NumMers           Unique Total
--      1-     1    200066331 **        0.1009 0.0035
--      2-     2    181157775 *         0.1923 0.0099
--      3-     4    278141437 **        0.2691 0.0179
--      5-     7    263413499 **        0.3850 0.0359
--      8-    11    203779081 ***       0.4967 0.0626
--     12-    16    154440635 **        0.5868 0.0951
--     17-    22    124935267 ****      0.6582 0.1319
--     23-    29    107523370           0.7178 0.1742
--     30-    37     93345021 ****      0.7699 0.2229
--     38-    46     78897354           0.8154 0.2772
--     47-    56     64455037 ****      0.8539 0.3345
--     57-    67     51187791 ****      0.8854 0.3918
--     68-    79     39816413 **        0.9105 0.4464
--     80-    92     30601944 **        0.9300 0.4968
--     93-   106     23388698           0.9450 0.5421
--    107-   121     17865606 *         0.9565 0.5822
--    122-   137     13692472           0.9653 0.6174
--    138-   154     10559817           0.9721 0.6480
--    155-   172      8206195           0.9773 0.6747
--    173-   191      6427732           0.9814 0.6979
--    192-   211      5079615           0.9846 0.7182
--    212-   232      4050496 *         0.9871 0.7359
--    233-   254      3259273           0.9891 0.7516
--    255-   277      2639962           0.9908 0.7654
--    278-   301      2161690           0.9921 0.7776
--    302-   326      1783826           0.9931 0.7885
--    327-   352      1481506           0.9940 0.7982
--    353-   379      1239143           0.9948 0.8070
--    380-   407      1044438           0.9954 0.8149
--    408-   436       887268           0.9959 0.8221
--    437-   466       757208           0.9964 0.8286
--    467-   497       651382           0.9967 0.8346
--    498-   529       563881           0.9971 0.8401
--    530-   562       487943           0.9974 0.8451
--    563-   596       426213           0.9976 0.8498
--    597-   631       374233           0.9978 0.8541
--    632-   667       330055           0.9980 0.8581
--    668-   704       292532           0.9982 0.8619
--    705-   742       259606           0.9983 0.8654
--    743-   781       232106           0.9984 0.8687
--    782-   821       206959           0.9986 0.8718

--     6569532 (max occurrences)
-- 56770201255 (total mers, non-unique)
--  1782689219 (distinct mers, non-unique)
--   200066331 (unique mers)

[CORRECTION/CORRECTIONS]

-- Reads to be corrected:
--   8687261 reads longer than 3789 bp
--   56932820081 bp
-- Expected corrected reads:
--   8687261 reads
--   50258010214 bp
--   0 bp minimum length
--   5785 bp mean length
--   25047 bp n50 length

[TRIMMING/READS]

-- In gatekeeper store 'trimming/pacbio_ont_canu_usegrid_mhapthreads8.gkpStore':
--   Found 7705421 reads.
--   Found 48788219877 bases (37.52 times coverage).

-- Read length histogram (one '*' equals 15657.11 reads):
--        0    999       0
--     1000   1999  304797 *
--     2000   2999  705013 ***
--     3000   3999  804591 *****
--     4000   4999  985735 **
--     5000   5999 1095998 **
--     6000   6999 1041249 **
--     7000   7999  859563 **
--     8000   8999  683676 *
--     9000   9999  459905 ***
--    10000  10999  286090 **
--    11000  11999  169905 **
--    12000  12999   97601 **
--    13000  13999   57229 *
--    14000  14999   35750 *
--    15000  15999   24209
--    16000  16999   18021 *
--    17000  17999   13504
--    18000  18999   10547
--    19000  19999    8594
--    20000  20999    6971
--    21000  21999    5816
--    22000  22999    4671
--    23000  23999    4020
--    24000  24999    3327
--    25000  25999    2750
--    26000  26999    2374
--    27000  27999    2047
--    28000  28999    1663
--    29000  29999    1512
--    30000  30999    1276
--    31000  31999    1036
--    32000  32999     878
--    33000  33999     705
--    34000  34999     647
--    35000  35999     596
--    36000  36999     416
--    37000  37999     400
--    38000  38999     353
--    39000  39999     277
--    40000  40999     242
--    41000  41999     205
--    42000  42999     177
--    43000  43999     168
--    44000  44999     124
--    45000  45999     100
--    46000  46999      91
--    47000  47999      84
--    48000  48999      78
--    49000  49999      68
--    50000  50999      40
--    51000  51999      48
--    52000  52999      41
--    53000  53999      26
--    54000  54999      27
--    55000  55999      31
--    56000  56999      16
--    57000  57999      20
--    58000  58999      14
--    59000  59999      19
--    60000  60999      20
--    61000  61999       8
--    62000  62999       7
--    63000  63999      14
--    64000  64999       3
--    65000  65999       3
--    66000  66999       1
--    67000  67999       2
--    68000  68999       3
--    69000  69999       4
--    70000  70999       4
--    71000  71999       3
--    72000  72999       3
--    73000  73999       2
--    74000  74999       2
--    75000  75999       1
--    76000  76999       2
--    77000  77999       2
--    78000  78999       1
--    79000  79999       0
--    80000  80999       2
--    81000  81999       0
--    82000  82999       1
--    83000  83999       1
--    84000  84999       0
--    85000  85999       0
--    86000  86999       1

[TRIMMING/MERS]

-- 22-mers                                Fraction
--   Occurrences      NumMers           Unique Total
--      1-     1   5026153850 ***-->    0.6718 0.1034
--      2-     2    605902568 **        0.7528 0.1283
--      3-     4    388973948 ****      0.7858 0.1435
--      5-     7    231171922 **        0.8178 0.1652
--      8-    11    178641044 ****      0.8426 0.1915
--     12-    16    175497123 ****      0.8645 0.2266
--     17-    22    209503543 ****      0.8875 0.2796
--     23-    29    278347530 ****      0.9161 0.3700
--     30-    37    245958835 ****      0.9534 0.5250
--     38-    46     90330100 **        0.9836 0.6827
--     47-    56     16395945 *         0.9936 0.7471
--     57-    67      7559821           0.9955 0.7621
--     68-    79      5073382           0.9964 0.7714
--     80-    92      3588792           0.9971 0.7788
--     93-   106      2738522           0.9976 0.7850
--    107-   121      2147852           0.9979 0.7905
--    122-   137      1728147           0.9982 0.7955
--    138-   154      1416980           0.9984 0.8000
--    155-   172      1178589           0.9986 0.8042
--    173-   191       984917           0.9988 0.8082
--    192-   211       841272           0.9989 0.8118
--    212-   232       729644           0.9990 0.8153
--    233-   254       638328           0.9991 0.8186
--    255-   277       551454           0.9992 0.8218
--    278-   301       472622           0.9993 0.8248
--    302-   326       413240           0.9993 0.8276
--    327-   352       364417           0.9994 0.8302
--    353-   379       321299           0.9994 0.8328
--    380-   407       285079           0.9995 0.8352
--    408-   436       256030           0.9995 0.8375
--    437-   466       229980           0.9995 0.8397
--    467-   497       208602           0.9996 0.8418
--    498-   529       187574           0.9996 0.8439
--    530-   562       171721           0.9996 0.8459
--    563-   596       155899           0.9997 0.8478
--    597-   631       143075           0.9997 0.8496
--    632-   667       130777           0.9997 0.8514
--    668-   704       120575           0.9997 0.8532
--    705-   742       110194           0.9997 0.8549
--    743-   781       101439           0.9997 0.8565
--    782-   821        95058           0.9998 0.8581

--     5231977 (max occurrences)
-- 43600252151 (total mers, non-unique)
--  2455412178 (distinct mers, non-unique)
--  5026153850 (unique mers)
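The summary counts can be cross-checked against the first row of each mer table. The sketch below is only an illustration: the function name `first_row_fractions` is hypothetical, and reading the 'Unique' and 'Total' columns as the unique mers' share of all distinct mers and of all mers is an assumption inferred from the numbers in this report, not taken from the Canu documentation.

```python
def first_row_fractions(unique: int, distinct_non_unique: int, total_non_unique: int):
    """Return (share of distinct mers, share of total mers) for the 1-1 row.

    Assumption: the 'non-unique' summary counts exclude mers seen exactly once,
    so the unique count is added back to obtain the full denominators.
    """
    distinct = unique + distinct_non_unique
    total = unique + total_non_unique
    return unique / distinct, unique / total


if __name__ == "__main__":
    # Counts copied from the [CORRECTION/MERS] and [TRIMMING/MERS] summaries.
    reports = {
        "correction 16-mers": (200_066_331, 1_782_689_219, 56_770_201_255),
        "trimming 22-mers": (5_026_153_850, 2_455_412_178, 43_600_252_151),
    }
    for name, counts in reports.items():
        by_distinct, by_total = first_row_fractions(*counts)
        print(f"{name}: Unique={by_distinct:.4f} Total={by_total:.4f}")
```

Under that reading, the computed values agree with the first table rows (0.1009 / 0.0035 for the 16-mers and 0.6718 / 0.1034 for the 22-mers).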

[TRIMMING/TRIMMING]

-- PARAMETERS:
--   1500 (reads trimmed below this many bases are deleted)
--   0.1050 (use overlaps at or below this fraction error)
--   1 (break region if overlap is less than this long, for 'largest covered' algorithm)
--   1 (break region if overlap coverage is less than this many read, for 'largest covered' algorithm)
--
-- INPUT READS:
--   7705421 reads 48788219877 bases (reads processed)
--         0 reads           0 bases (reads not processed, previously deleted)
--         0 reads           0 bases (reads not processed, in a library where trimming isn't allowed)
--
-- OUTPUT READS:
--   6239424 reads 38600572960 bases (trimmed reads output)
--   1351733 reads  7910057160 bases (reads with no change, kept as is)
--     19155 reads    81913826 bases (reads with no overlaps, deleted)
--     95109 reads   258925182 bases (reads with short trimmed length, deleted)
--
-- TRIMMING DETAILS:
--   4305712 reads   932057235 bases (bases trimmed from the 5' end of a read)
--   4666971 reads  1004693514 bases (bases trimmed from the 3' end of a read)

[TRIMMING/SPLITTING]

-- PARAMETERS:
--   1500 (reads trimmed below this many bases are deleted)
--   0.1050 (use overlaps at or below this fraction error)
-- INPUT READS:
--   7591157 reads 48447380869 bases (reads processed)
--    114264 reads   340839008 bases (reads not processed, previously deleted)
--         0 reads           0 bases (reads not processed, in a library where trimming isn't allowed)
--
-- PROCESSED:
--         0 reads           0 bases (no overlaps)
--       100 reads      470547 bases (no coverage after adjusting for trimming done already)
--         0 reads           0 bases (processed for chimera)
--         0 reads           0 bases (processed for spur)
--   7591057 reads 48446910322 bases (processed for subreads)
--
-- READS WITH SIGNALS:
--         0 reads           0 signals (number of 5' spur signal)
--         0 reads           0 signals (number of 3' spur signal)
--         0 reads           0 signals (number of chimera signal)
--     17553 reads       18090 signals (number of subread signal)
--
-- SIGNALS:
--         0 reads           0 bases (size of 5' spur signal)
--         0 reads           0 bases (size of 3' spur signal)
--         0 reads           0 bases (size of chimera signal)
--     18090 reads     8339251 bases (size of subread signal)
--
-- TRIMMING:
--      7671 reads    22068831 bases (trimmed from the 5' end of the read)
--      9952 reads    32106859 bases (trimmed from the 3' end of the read)

[UNITIGGING/READS]

-- In gatekeeper store 'unitigging/pacbio_ont_canu_usegrid_mhapthreads8.gkpStore':
--   Found 7590645 reads.
--   Found 46455873331 bases (35.73 times coverage).

-- Read length histogram (one '*' equals 15341.88 reads):
--        0    999       0
--     1000   1999  340961 **
--     2000   2999  775130 **
--     3000   3999  863307 ****
--     4000   4999 1010228 *****
--     5000   5999 1073932 **
--     6000   6999  992281 ****
--     7000   7999  805143 ****
--     8000   8999  634307 *
--     9000   9999  421796 ***
--    10000  10999  259139 ****
--    11000  11999  151786 *
--    12000  12999   85770 **
--    13000  13999   49222
--    14000  14999   30138
--    15000  15999   19943
--    16000  16999   14725
--    17000  17999   10893
--    18000  18999    8574
--    19000  19999    7090
--    20000  20999    5798
--    21000  21999    4822
--    22000  22999    4004
--    23000  23999    3351
--    24000  24999    2794
--    25000  25999    2322
--    26000  26999    2009
--    27000  27999    1755
--    28000  28999    1421
--    29000  29999    1281
--    30000  30999    1070
--    31000  31999     869
--    32000  32999     710
--    33000  33999     585
--    34000  34999     507
--    35000  35999     483
--    36000  36999     352
--    37000  37999     348
--    38000  38999     294
--    39000  39999     219
--    40000  40999     208
--    41000  41999     150
--    42000  42999     131
--    43000  43999     127
--    44000  44999      94
--    45000  45999      73
--    46000  46999      70
--    47000  47999      65
--    48000  48999      55
--    49000  49999      38
--    50000  50999      39
--    51000  51999      34
--    52000  52999      29
--    53000  53999      22
--    54000  54999      25
--    55000  55999      23
--    56000  56999       7
--    57000  57999      10
--    58000  58999      13
--    59000  59999      15
--    60000  60999       5
--    61000  61999       7
--    62000  62999       5
--    63000  63999       9
--    64000  64999       2
--    65000  65999       2
--    66000  66999       1
--    67000  67999       2
--    68000  68999       2
--    69000  69999       4
--    70000  70999       3
--    71000  71999       1
--    72000  72999       2
--    73000  73999       3
--    74000  74999       2
--    75000  75999       1
--    76000  76999       1
--    77000  77999       2
--    78000  78999       1
--    79000  79999       0
--    80000  80999       0
--    81000  81999       0
--    82000  82999       2
--    83000  83999       1

I would really appreciate your help so that I can get this assembly completed asap.

Many thanks!

kushalsuryamohan commented 6 years ago

I actually installed the latest version, Canu 1.6, and it's working (hopefully not speaking too soon).