marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

HiCanu failing for real world HiFi data #2325

Closed Oieswarya closed 2 months ago

Oieswarya commented 3 months ago

I am trying to run HiCanu on some real-world HiFi data that I downloaded from NCBI, but it does not produce any output for either dataset. I have already run HiCanu on my simulated data (all with 10x coverage).

This is the command I used for running the real-world data:

    ~/bin/canu -assemble -p Nibea -d /home/CanuOutputs/Nibea/ genomeSize=615m -pacbio-hifi /home/SRAFiles/NibeaCoibor.fa
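For reference, the coverage figure that Canu reports in the log below is the total number of bases divided by the genomeSize passed on the command line. The following is a minimal sketch of that arithmetic; `parse_genome_size` and `coverage` are hypothetical helpers written for this illustration, not Canu functions, and the k/m/g suffix handling is a simplified assumption.

```python
def parse_genome_size(text: str) -> int:
    """Parse a size such as '615m' into a base count (hypothetical helper)."""
    multipliers = {"k": 10**3, "m": 10**6, "g": 10**9}
    text = text.strip().lower()
    if text and text[-1] in multipliers:
        # round() guards against floating-point drift for values like '4.8m'
        return round(float(text[:-1]) * multipliers[text[-1]])
    return int(text)


def coverage(total_bases: int, genome_size: int) -> float:
    """Return sequencing depth: total bases divided by estimated genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Base count taken from the sequence store summary in the log.
bases = 29259679568
depth = coverage(bases, parse_genome_size("615m"))
print(f"approximate coverage: {depth:.2f}x")
```

With genomeSize=615m this gives roughly 47.58, which agrees with the 47.57 in the log up to rounding of the last digit. The same read set would be reported at a different coverage if a different genomeSize were supplied.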

This is my job.e log file:

canu 2.2

CONFIGURE CANU

Detected Java(TM) Runtime Environment '1.8.0_272' (from 'java') with -d64 support.
 Detected gnuplot version '4.2 patchlevel 6 ' (from 'gnuplot') and image format 'png'.

 Detected 1 CPUs and 252 gigabytes of memory on the local machine.

 Detected PBS/Torque '5.1.3' with 'pbsnodes' binary in /usr/bin/pbsnodes.
 Detecting PBS/Torque resources.

 PBS/Torque support detected.  Resources available:
      1 host  with  64 cores and  252 GB memory.
      1 host  with  72 cores and  252 GB memory.
      6 hosts with  64 cores and  126 GB memory.
      1 host  with  24 cores and  252 GB memory.
      6 hosts with  72 cores and  125 GB memory.
      2 hosts with  40 cores and  252 GB memory.

                         (tag)Threads
                (tag)Memory         |
        (tag)             |         |  algorithm
        -------  ----------  --------  -----------------------------
 Grid:  meryl     13.000 GB    8 CPUs  (k-mer counting)
 Grid:  hap       12.000 GB    8 CPUs  (read-to-haplotype assignment)
 Grid:  cormhap   20.000 GB   12 CPUs  (overlap detection with mhap)
 Grid:  obtovl    16.000 GB   12 CPUs  (overlap detection)
 Grid:  utgovl    16.000 GB   12 CPUs  (overlap detection)
 Grid:  cor        -.--- GB    4 CPUs  (read correction)
 Grid:  ovb        4.000 GB    1 CPU   (overlap store bucketizer)
 Grid:  ovs       16.000 GB    1 CPU   (overlap store sorting)
 Grid:  red       17.000 GB    8 CPUs  (read error detection)
 Grid:  oea        8.000 GB    1 CPU   (overlap error adjustment)
 Grid:  bat      252.000 GB   16 CPUs  (contig construction with bogart)
 Grid:  cns        -.--- GB    8 CPUs  (consensus)

 Found trimmed raw PacBio HiFi reads in the input files.

Generating assembly 'Nibea' in '/home/CanuOutputs/Nibea':
  genomeSize:
     615000000

  Overlap Generation Limits:
    corOvlErrorRate 0.0000 (  0.00%)
    obtOvlErrorRate 0.0250 (  2.50%)
    utgOvlErrorRate 0.0100 (  1.00%)

   Overlap Processing Limits:
     corErrorRate    0.0000 (  0.00%)
     obtErrorRate    0.0250 (  2.50%)
     utgErrorRate    0.0003 (  0.03%)
     cnsErrorRate    0.0500 (  5.00%)

   Stages to run:
   assemble HiFi reads.

 Correction skipped; not enabled.

 Trimming skipped; not enabled.

BEGIN ASSEMBLY

Starting command on Fri Jun 21 13:14:39 2024 with 45186.208 GB free disk space

    cd .
    ./Nibea.seqStore.sh \
    > ./Nibea.seqStore.err 2>&1

Finished on Fri Jun 21 13:21:09 2024 (390 seconds) with 45180.14 GB free disk space

In sequence store './Nibea.seqStore':
 Found 1919461 reads.
 Found 29259679568 bases (47.57 times coverage).
    Histogram of corrected reads:

  G=29259679568                      sum of  ||               length     num
   NG         length     index       lengths  ||                range    seqs

    00010        19729    137059   2925977360  ||       9956-10551       12111|-----
    00020        18158    292134   5851941183  ||      10552-11147       24390|---------
    00030        17056    458593   8777910356  ||      11148-11743       71448|-------------------------
    00040        16148    634993  11703878635  ||      11744-12339      137671|------------------------------------------------
    00050        15335    820976  14629854418  ||      12340-12935      178710|---------------------------------------------------------------
    00060        14576   1016711  17555807793  ||      12936-13531      181045|---------------------------------------------------------------
    00070        13842   1222700  20481784186  ||      13532-14127      173229|-------------------------------------------------------------
    00080        13115   1439830  23407745673  ||      14128-14723      163730|---------------------------------------------------------
    00090        12355   1669496  26333723186  ||      14724-15319      152301|-----------------------------------------------------
    00100         9956   1919460  29259679568  ||      15320-15915      139169|-------------------------------------------------
    001.000x             1919461  29259679568  ||      15916-16511      125638|--------------------------------------------
                                               ||      16512-17107      110391|---------------------------------------
                                              ||      17108-17703       95375|----------------------------------
                                               ||      17704-18299       79796|----------------------------
                                               ||      18300-18895       66488|------------------------
                                              ||      18896-19491       52927|-------------------
                                              ||      19492-20087       41922|---------------
                                              ||      20088-20683       32421|------------
                                               ||      20684-21279       24621|---------
                                               ||      21280-21875       18105|-------
                                               ||      21876-22471       12807|-----
                                               ||      22472-23067        8902|----
                                               ||      23068-23663        6108|---
                                               ||      23664-24259        3847|--
                                               ||      24260-24855        2393|-
                                              ||      24856-25451        1504|-
                                               ||      25452-26047         930|-
                                               ||      26048-26643         536|-
                                               ||      26644-27239         313|-
                                               ||      27240-27835         208|-
                                              ||      27836-28431         118|-
                                               ||      28432-29027          66|-
                                               ||      29028-29623          51|-
                                             ||      29624-30219          33|-
                                              ||      30220-30815          29|-
                                               ||      30816-31411          25|-
                                               ||      31412-32007          25|-
                                               ||      32008-32603          12|-
                                               ||      32604-33199          24|-
                                               ||      33200-33795           7|-
                                              ||      33796-34391           8|-
                                               ||      34392-34987           2|-
                                               ||      34988-35583           6|-
                                               ||      35584-36179           9|-
                                               ||      36180-36775           1|-
                                              ||      36776-37371           2|-
                                               ||      37372-37967           1|-
                                              ||      37968-38563           4|-
                                               ||      38564-39159           0|
                                               ||      39160-39755           2|-

 In sequence store './Nibea.seqStore':
   Found 1919461 reads.
   Found 29259679568 bases (47.57 times coverage).
    Histogram of corrected-trimmed reads:
        G=29259679568                      sum of  ||               length     num
    NG         length     index       lengths  ||                range    seqs

    00010        19729    137059   2925977360  ||       9956-10551       12111|-----
    00020        18158    292134   5851941183  ||      10552-11147       24390|---------
    00030        17056    458593   8777910356  ||      11148-11743       71448|-------------------------
    00040        16148    634993  11703878635  ||      11744-12339      137671|------------------------------------------------
    00050        15335    820976  14629854418  ||      12340-12935      178710|---------------------------------------------------------------
    00060        14576   1016711  17555807793  ||      12936-13531      181045|---------------------------------------------------------------
    00070        13842   1222700  20481784186  ||      13532-14127      173229|-------------------------------------------------------------
    00080        13115   1439830  23407745673  ||      14128-14723      163730|---------------------------------------------------------
    00090        12355   1669496  26333723186  ||      14724-15319      152301|-----------------------------------------------------
    00100         9956   1919460  29259679568  ||      15320-15915      139169|-------------------------------------------------
    001.000x             1919461  29259679568  ||      15916-16511      125638|--------------------------------------------
                                               ||      16512-17107      110391|---------------------------------------
                                               ||      17108-17703       95375|----------------------------------
                                               ||      17704-18299       79796|----------------------------
                                               ||      18300-18895       66488|------------------------
                                               ||      18896-19491       52927|-------------------
                                               ||      19492-20087       41922|---------------
                                               ||      20088-20683       32421|------------
                                               ||      20684-21279       24621|---------
                                               ||      21280-21875       18105|-------
                                               ||      21876-22471       12807|-----
                                               ||      22472-23067        8902|----
                                               ||      23068-23663        6108|---
                                               ||      23664-24259        3847|--
                                               ||      24260-24855        2393|-
                                              ||      24856-25451        1504|-
                                               ||      25452-26047         930|-
                                              ||      26048-26643         536|-
                                               ||      26644-27239         313|-
                                               ||      27240-27835         208|-
                                               ||      27836-28431         118|-
                                               ||      28432-29027          66|-
                                               ||      29028-29623          51|-
                                               ||      29624-30219          33|-
                                               ||      30220-30815          29|-
                                               ||      30816-31411          25|-
                                               ||      31412-32007          25|-
                                               ||      32008-32603          12|-
                                               ||      32604-33199          24|-
                                               ||      33200-33795           7|-
                                              ||      33796-34391           8|-
                                               ||      34392-34987           2|-
                                               ||      34988-35583           6|-
                                               ||      35584-36179           9|-
                                              ||      36180-36775           1|-
                                               ||      36776-37371           2|-
                                               ||      37372-37967           1|-
                                               ||      37968-38563           4|-
                                               ||      38564-39159           0|
                                               ||      39160-39755           2|-

----------------------------------------
 Starting command on Fri Jun 21 13:21:25 2024 with 45179.015 GB free disk space

    cd unitigging/0-mercounts
    ./meryl-configure.sh \
    > ./meryl-configure.err 2>&1

Finished on Fri Jun 21 13:21:27 2024 (2 seconds) with 45179.015 GB free disk space

  segments   memory batches

        01 11.97 GB       7
        02 11.97 GB       4
        04  8.91 GB       3
        06 11.97 GB       2
        08  8.91 GB       2
        12  6.37 GB       2
        16  4.81 GB       2
        20  4.04 GB       2
        24  3.36 GB       2
        32  2.47 GB       2
        40  2.14 GB       2
        48  1.78 GB       2
        56  1.53 GB       2
        64  1.34 GB       2
        96  0.90 GB       2

  For 1919461 reads with 29259679568 bases, limit to 292 batches.
  Will count kmers using 06 jobs, each using 13 GB and 8 threads.

 Finished stage 'merylConfigure', reset canuIteration.

 Running jobs.  First attempt out of 2.

 'meryl-count.jobSubmit-01.sh' -> job 1663532[].mgt2-ib.local tasks 1-6.

 Starting command on Fri Jun 21 13:21:27 2024 with 45179.015 GB free disk space

    cd /home/obhowmik/CanuOutputs/All_LR_Together/Nibea
    qsub \
      -j oe \
      -d `pwd` \
      -W depend=afteranyarray:1663532[].mgt2-ib.local \
      -l nodes=1:ppn=1,mem=4g   \
      -N 'canu_Nibea' \
      -o canu-scripts/canu.01.out  canu-scripts/canu.01.sh

 Finished on Fri Jun 21 13:21:27 2024 (fast as lightning) with 45179.015 GB free disk space
----------------------------------------

----------------------------------------

This is the Canu script log which I found in the canu-scripts folder:

Found perl:
   /usr/bin/perl

Found java:
   /usr/bin/java
   openjdk version "1.8.0_272"

Found canu:
   /home/obhowmik/canu-2.2/bin/canu
   canu 2.2

 CONFIGURE CANU

 Detected Java(TM) Runtime Environment '1.8.0_272' (from 'java') with -d64 support.
 Detected gnuplot version '4.2 patchlevel 6 ' (from 'gnuplot') and image format 'png'.

 Detected 1 CPUs and 252 gigabytes of memory on the local machine.

 Detected PBS/Torque '5.1.3' with 'pbsnodes' binary in /usr/bin/pbsnodes.
Detecting PBS/Torque resources.

 PBS/Torque support detected.  Resources available:
      1 host  with  64 cores and  252 GB memory.
      1 host  with  72 cores and  252 GB memory.
      6 hosts with  64 cores and  126 GB memory.
      1 host  with  24 cores and  252 GB memory.
      6 hosts with  72 cores and  125 GB memory.
      2 hosts with  40 cores and  252 GB memory.

                         (tag)Threads
                (tag)Memory         |
        (tag)             |         |  algorithm

 Grid:  meryl     25.000 GB    8 CPUs  (k-mer counting)
 Grid:  hap       16.000 GB   16 CPUs  (read-to-haplotype assignment)
 Grid:  cormhap   20.000 GB   12 CPUs  (overlap detection with mhap)
 Grid:  obtovl    16.000 GB   12 CPUs  (overlap detection)
 Grid:  utgovl    16.000 GB   12 CPUs  (overlap detection)
 Grid:  cor        -.--- GB    4 CPUs  (read correction)
 Grid:  ovb        4.000 GB    1 CPU   (overlap store bucketizer)
 Grid:  ovs       32.000 GB    1 CPU   (overlap store sorting)
 Grid:  red       17.000 GB    8 CPUs  (read error detection)
 Grid:  oea        8.000 GB    1 CPU   (overlap error adjustment)
 Grid:  bat      252.000 GB   16 CPUs  (contig construction with bogart)
 Grid:  cns        -.--- GB    8 CPUs  (consensus)
--
 Found PacBio HiFi reads in 'Nibea.seqStore':
   Libraries:
     PacBio HiFi:           1
   Reads:
     Corrected:             29259679568
     Corrected and Trimmed: 29259679568

 Generating assembly 'Nibea' in '/home/CanuOutputs/Caddisfly':
   genomeSize:
     1170000000

   Overlap Generation Limits:
     corOvlErrorRate 0.0000 (  0.00%)
     obtOvlErrorRate 0.0250 (  2.50%)
     utgOvlErrorRate 0.0100 (  1.00%)

   Overlap Processing Limits:
     corErrorRate    0.0000 (  0.00%)
    obtErrorRate    0.0250 (  2.50%)
     utgErrorRate    0.0003 (  0.03%)
     cnsErrorRate    0.0500 (  5.00%)

   Stages to run:
     assemble HiFi reads.

 Correction skipped; not enabled.
--
 Trimming skipped; not enabled.
--
 BEGIN ASSEMBLY
--
 Overlap jobs failed, retry.
   job unitigging/1-overlapper/001/000005.ovb FAILED.
   job unitigging/1-overlapper/001/000006.ovb FAILED.
   job unitigging/1-overlapper/001/000007.ovb FAILED.
   job unitigging/1-overlapper/001/000008.ovb FAILED.
   job unitigging/1-overlapper/001/000009.ovb FAILED.
   job unitigging/1-overlapper/001/000010.ovb FAILED.
   job unitigging/1-overlapper/001/000011.ovb FAILED.
   job unitigging/1-overlapper/001/000012.ovb FAILED.
   job unitigging/1-overlapper/001/000013.ovb FAILED.
   job unitigging/1-overlapper/001/000014.ovb FAILED.
   job unitigging/1-overlapper/001/000015.ovb FAILED.
   job unitigging/1-overlapper/001/000016.ovb FAILED.
   job unitigging/1-overlapper/001/000017.ovb FAILED.
   job unitigging/1-overlapper/001/000018.ovb FAILED.
   job unitigging/1-overlapper/001/000019.ovb FAILED.
   job unitigging/1-overlapper/001/000020.ovb FAILED.
   job unitigging/1-overlapper/001/000021.ovb FAILED.
   job unitigging/1-overlapper/001/000022.ovb FAILED.
   job unitigging/1-overlapper/001/000023.ovb FAILED.
   job unitigging/1-overlapper/001/000024.ovb FAILED.
   job unitigging/1-overlapper/001/000025.ovb FAILED.
   job unitigging/1-overlapper/001/000026.ovb FAILED.
   job unitigging/1-overlapper/001/000027.ovb FAILED.
   job unitigging/1-overlapper/001/000028.ovb FAILED.
   job unitigging/1-overlapper/001/000029.ovb FAILED.
   job unitigging/1-overlapper/001/000030.ovb FAILED.
   job unitigging/1-overlapper/001/000031.ovb FAILED.
   job unitigging/1-overlapper/001/000032.ovb FAILED.
   job unitigging/1-overlapper/001/000033.ovb FAILED.
   job unitigging/1-overlapper/001/000034.ovb FAILED.
   job unitigging/1-overlapper/001/000035.ovb FAILED.
   job unitigging/1-overlapper/001/000036.ovb FAILED.
   job unitigging/1-overlapper/001/000037.ovb FAILED.
   job unitigging/1-overlapper/001/000038.ovb FAILED.
   job unitigging/1-overlapper/001/000039.ovb FAILED.
   job unitigging/1-overlapper/001/000040.ovb FAILED.
   job unitigging/1-overlapper/001/000041.ovb FAILED.
   job unitigging/1-overlapper/001/000042.ovb FAILED.
   job unitigging/1-overlapper/001/000043.ovb FAILED.
   job unitigging/1-overlapper/001/000044.ovb FAILED.
   job unitigging/1-overlapper/001/000045.ovb FAILED.
   job unitigging/1-overlapper/001/000046.ovb FAILED.
   job unitigging/1-overlapper/001/000047.ovb FAILED.
   job unitigging/1-overlapper/001/000049.ovb FAILED.
   job unitigging/1-overlapper/001/000051.ovb FAILED.
   job unitigging/1-overlapper/001/000052.ovb FAILED.
   job unitigging/1-overlapper/001/000053.ovb FAILED.
   job unitigging/1-overlapper/001/000055.ovb FAILED.
   job unitigging/1-overlapper/001/000056.ovb FAILED.
   job unitigging/1-overlapper/001/000057.ovb FAILED.
   job unitigging/1-overlapper/001/000058.ovb FAILED.
   job unitigging/1-overlapper/001/000059.ovb FAILED.
   job unitigging/1-overlapper/001/000060.ovb FAILED.
   job unitigging/1-overlapper/001/000061.ovb FAILED.
   job unitigging/1-overlapper/001/000062.ovb FAILED.
   job unitigging/1-overlapper/001/000063.ovb FAILED.
   job unitigging/1-overlapper/001/000064.ovb FAILED.
   job unitigging/1-overlapper/001/000065.ovb FAILED.
   job unitigging/1-overlapper/001/000066.ovb FAILED.
   job unitigging/1-overlapper/001/000067.ovb FAILED.
   job unitigging/1-overlapper/001/000068.ovb FAILED.
   job unitigging/1-overlapper/001/000069.ovb FAILED.
   job unitigging/1-overlapper/001/000070.ovb FAILED.
   job unitigging/1-overlapper/001/000071.ovb FAILED.
   job unitigging/1-overlapper/001/000072.ovb FAILED.
   job unitigging/1-overlapper/001/000073.ovb FAILED.
   job unitigging/1-overlapper/001/000074.ovb FAILED.
   job unitigging/1-overlapper/001/000075.ovb FAILED.
   job unitigging/1-overlapper/001/000076.ovb FAILED.
   job unitigging/1-overlapper/001/000077.ovb FAILED.
   job unitigging/1-overlapper/001/000078.ovb FAILED.
   job unitigging/1-overlapper/001/000079.ovb FAILED.
   job unitigging/1-overlapper/001/000080.ovb FAILED.
   job unitigging/1-overlapper/001/000081.ovb FAILED.
   job unitigging/1-overlapper/001/000082.ovb FAILED.
   job unitigging/1-overlapper/001/000083.ovb FAILED.
   job unitigging/1-overlapper/001/000084.ovb FAILED.
   job unitigging/1-overlapper/001/000085.ovb FAILED.
   job unitigging/1-overlapper/001/000086.ovb FAILED.
   job unitigging/1-overlapper/001/000087.ovb FAILED.
   job unitigging/1-overlapper/001/000088.ovb FAILED.

 Running jobs.  Second attempt out of 2.

 'overlap.jobSubmit-01.sh' -> job 1664060[].mgt2-ib.local tasks 5-47.
 'overlap.jobSubmit-02.sh' -> job 1664061.mgt2-ib.local task 49.
 'overlap.jobSubmit-03.sh' -> job 1664062[].mgt2-ib.local tasks 51-53.
 'overlap.jobSubmit-04.sh' -> job 1664063[].mgt2-ib.local tasks 55-88.

 Starting command on Mon Jun 24 14:45:17 2024 with 45138.439 GB free disk space

    cd /home/CanuOutputs/Caddisfly
    qsub \
      -j oe \
      -d `pwd` \
      -W depend=afteranyarray:1664060[].mgt2-ib.local:1664061.mgt2-ib.local:1664062[].mgt2-ib.local:1664063[].mgt2-ib.local \
      -l nodes=1:ppn=1,mem=5g   \
      -N 'canu_Nibea' \
      -o canu-scripts/canu.04.out  canu-scripts/canu.04.sh
qsub: submit error (Invalid Job Dependency)

 Finished on Mon Jun 24 14:45:18 2024 (one second) with 45138.439 GB free disk space

ERROR:
ERROR:  Failed with exit code 208.  (rc=53248)
ERROR:
 Failed to submit Canu executive.  Delay 10 seconds and try again.

 Starting command on Mon Jun 24 14:45:28 2024 with 45138.439 GB free disk space

    cd /home/CanuOutputs/Caddisfly
    qsub \
      -j oe \
      -d `pwd` \
      -W depend=afteranyarray:1664060[].mgt2-ib.local:1664061.mgt2-ib.local:1664062[].mgt2-ib.local:1664063[].mgt2-ib.local \
      -l nodes=1:ppn=1,mem=5g   \
      -N 'canu_Nibea' \
      -o canu-scripts/canu.04.out  canu-scripts/canu.04.sh
qsub: submit error (Invalid Job Dependency)

 Finished on Mon Jun 24 14:45:29 2024 (one second) with 45138.439 GB free disk space

ERROR:
ERROR:  Failed with exit code 208.  (rc=53248)
ERROR:
 Failed to submit Canu executive.  Giving up after two tries.

Is it a memory issue or something else?
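The retry in the log above resubmits the failed overlap jobs as four separate submissions (tasks 5-47, 49, 51-53 and 55-88), and the final qsub then depends on all four of them. As a rough illustration of how that grouping follows from the FAILED list, the sketch below extracts the job numbers and collapses them into contiguous ranges. The function names and the regular expression are hypothetical, written only for this example, and are not taken from Canu.

```python
import re


def failed_job_ids(log_text):
    """Extract numeric job IDs from lines like 'job .../000005.ovb FAILED.'"""
    pattern = re.compile(r"job\s+\S*/(\d+)\.ovb FAILED")
    return sorted({int(m.group(1)) for m in pattern.finditer(log_text)})


def collapse_ranges(ids):
    """Collapse sorted integers into inclusive (start, end) ranges."""
    ranges = []
    for i in ids:
        if ranges and i == ranges[-1][1] + 1:
            # Extends the current run of consecutive IDs.
            ranges[-1] = (ranges[-1][0], i)
        else:
            # A gap in the IDs starts a new range.
            ranges.append((i, i))
    return ranges


# Small made-up sample in the same format as the log lines above.
sample = """
   job unitigging/1-overlapper/001/000005.ovb FAILED.
   job unitigging/1-overlapper/001/000006.ovb FAILED.
   job unitigging/1-overlapper/001/000007.ovb FAILED.
   job unitigging/1-overlapper/001/000009.ovb FAILED.
"""
for start, end in collapse_ranges(failed_job_ids(sample)):
    print(start if start == end else f"{start}-{end}")
```

Each gap in the failed IDs (48, 50 and 54 in the log) starts a new range, which is why a single retry produces several job IDs in the dependency list.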

skoren commented 3 months ago

This looks the same as #2013. The PBS scheduler is very flaky between versions and we don't have access to a test system. It seems some of your overlap jobs failed, and the retry was then rejected because of its dependency, likely because multiple job arrays were submitted at the same time. The real question is why those jobs failed in the first place and whether the re-runs succeeded. Check the logs for these jobs (e.g. unitigging/1-overlapper/000005* files) and look for an error message, or post them here. If they completed on the second try, you can just re-run the original command to let the assembly continue.
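One way to carry out the log check suggested above is to scan the tail of each overlap job log for lines that look like an error. This is only a sketch: `scan_logs` is a hypothetical helper, the keyword list is a guess at what a failure message might contain, and the directory and file pattern in the usage lines are assumptions based on the paths mentioned in this thread.

```python
from pathlib import Path

# Guessed keywords; adjust to whatever the scheduler or Canu actually prints.
ERROR_HINTS = ("error", "failed", "killed", "exceeded", "cannot allocate", "segmentation")


def scan_logs(directory, pattern="*", hints=ERROR_HINTS, tail=20):
    """Return {file name: [suspicious lines]} from the last `tail` lines of each log."""
    root = Path(directory)
    if not root.is_dir():
        return {}
    hits = {}
    for path in sorted(root.glob(pattern)):
        if not path.is_file():
            continue
        # Only the end of a log is inspected, since failures usually appear last.
        lines = path.read_text(errors="replace").splitlines()[-tail:]
        matched = [ln for ln in lines if any(h in ln.lower() for h in hints)]
        if matched:
            hits[path.name] = matched
    return hits


if __name__ == "__main__":
    # Assumed location and naming of the per-job logs.
    for name, lines in scan_logs("unitigging/1-overlapper", "utgovl_*").items():
        print(name)
        for ln in lines:
            print("   ", ln)
```

Logs that produce no hits either finished cleanly or were cut off without a message, so it is still worth opening one or two of them by hand.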

Oieswarya commented 3 months ago

Yes, I am indeed running these on PBS nodes.

I looked at the unitigging/1-overlapper/ folder. It contains 88 log files with names like 'utgovl_Nibea.o166456-88', with contents like this:

Found perl:
   /usr/bin/perl

Found java:
   /usr/bin/java
   openjdk version "1.8.0_272"

Found canu:
   /home/canu-2.2/bin/canu
   canu 2.2

Running job 88 based on PBS_ARRAYID=88 and offset=0.
sqCache: found 1919461 corrected-compressed-trimmed reads with 21528575538 bases.
Initializing 12 work areas.
Loading reference reads 1335571-1911189 inclusive.
Loading 575619 reads and 6438438738 bases from range 1335571-1911189 inclusive.
Loading  1335571 <  1338131 <  1911189 -    0.44% - 0.01 GB
Loading  1335571 <  1342698 <  1911189 -    1.24% - 0.03 GB
Loading  1335571 <  1347265 <  1911189 -    2.03% - 0.04 GB
Loading  1335571 <  1351832 <  1911189 -    2.82% - 0.06 GB
Loading  1335571 <  1356399 <  1911189 -    3.62% - 0.07 GB
Loading  1335571 <  1360966 <  1911189 -    4.41% - 0.09 GB
Loading  1335571 <  1365533 <  1911189 -    5.21% - 0.11 GB
Loading  1335571 <  1370100 <  1911189 -    6.00% - 0.12 GB
Loading  1335571 <  1374667 <  1911189 -    6.79% - 0.14 GB
Loading  1335571 <  1379234 <  1911189 -    7.59% - 0.16 GB
Loading  1335571 <  1383801 <  1911189 -    8.38% - 0.17 GB
Loading  1335571 <  1388368 <  1911189 -    9.17% - 0.19 GB
Loading  1335571 <  1392935 <  1911189 -    9.97% - 0.21 GB
Loading  1335571 <  1397502 <  1911189 -   10.76% - 0.22 GB
Loading  1335571 <  1402069 <  1911189 -   11.55% - 0.24 GB
Loading  1335571 <  1406636 <  1911189 -   12.35% - 0.25 GB
Loading  1335571 <  1411203 <  1911189 -   13.14% - 0.27 GB
Loading  1335571 <  1415770 <  1911189 -   13.93% - 0.29 GB
Loading  1335571 <  1420337 <  1911189 -   14.73% - 0.30 GB
Loading  1335571 <  1424904 <  1911189 -   15.52% - 0.32 GB
Loading  1335571 <  1429471 <  1911189 -   16.31% - 0.34 GB
Loading  1335571 <  1434038 <  1911189 -   17.11% - 0.35 GB
Loading  1335571 <  1438605 <  1911189 -   17.90% - 0.37 GB
Loading  1335571 <  1443172 <  1911189 -   18.69% - 0.38 GB
Loading  1335571 <  1447739 <  1911189 -   19.49% - 0.40 GB
Loading  1335571 <  1452306 <  1911189 -   20.28% - 0.42 GB
Loading  1335571 <  1456873 <  1911189 -   21.07% - 0.43 GB
Loading  1335571 <  1461440 <  1911189 -   21.87% - 0.45 GB
Loading  1335571 <  1466007 <  1911189 -   22.66% - 0.47 GB
Loading  1335571 <  1470574 <  1911189 -   23.45% - 0.48 GB
Loading  1335571 <  1475141 <  1911189 -   24.25% - 0.50 GB
Loading  1335571 <  1479708 <  1911189 -   25.04% - 0.52 GB
Loading  1335571 <  1484275 <  1911189 -   25.83% - 0.53 GB
Loading  1335571 <  1488842 <  1911189 -   26.63% - 0.55 GB
Loading  1335571 <  1493409 <  1911189 -   27.42% - 0.56 GB
Loading  1335571 <  1497976 <  1911189 -   28.21% - 0.58 GB
Loading  1335571 <  1502543 <  1911189 -   29.01% - 0.60 GB
Loading  1335571 <  1507110 <  1911189 -   29.80% - 0.61 GB
Loading  1335571 <  1511677 <  1911189 -   30.59% - 0.63 GB
Loading  1335571 <  1516244 <  1911189 -   31.39% - 0.65 GB
Loading  1335571 <  1520811 <  1911189 -   32.18% - 0.66 GB
Loading  1335571 <  1525378 <  1911189 -   32.97% - 0.68 GB
Loading  1335571 <  1529945 <  1911189 -   33.77% - 0.69 GB
Loading  1335571 <  1534512 <  1911189 -   34.56% - 0.71 GB
Loading  1335571 <  1539079 <  1911189 -   35.35% - 0.73 GB
Loading  1335571 <  1543646 <  1911189 -   36.15% - 0.74 GB
Loading  1335571 <  1548213 <  1911189 -   36.94% - 0.76 GB
Loading  1335571 <  1552780 <  1911189 -   37.73% - 0.78 GB
Loading  1335571 <  1557347 <  1911189 -   38.53% - 0.79 GB
Loading  1335571 <  1561914 <  1911189 -   39.32% - 0.81 GB
Loading  1335571 <  1566481 <  1911189 -   40.12% - 0.83 GB
Loading  1335571 <  1571048 <  1911189 -   40.91% - 0.84 GB
Loading  1335571 <  1575615 <  1911189 -   41.70% - 0.86 GB
Loading  1335571 <  1580182 <  1911189 -   42.50% - 0.87 GB
Loading  1335571 <  1584749 <  1911189 -   43.29% - 0.89 GB
Loading  1335571 <  1589316 <  1911189 -   44.08% - 0.91 GB
Loading  1335571 <  1593883 <  1911189 -   44.88% - 0.92 GB
Loading  1335571 <  1598450 <  1911189 -   45.67% - 0.94 GB
Loading  1335571 <  1603017 <  1911189 -   46.46% - 0.96 GB
Loading  1335571 <  1607584 <  1911189 -   47.26% - 0.97 GB
Loading  1335571 <  1612151 <  1911189 -   48.05% - 0.99 GB
Loading  1335571 <  1616718 <  1911189 -   48.84% - 1.00 GB
Loading  1335571 <  1621285 <  1911189 -   49.64% - 1.02 GB
Loading  1335571 <  1625852 <  1911189 -   50.43% - 1.04 GB
Loading  1335571 <  1630419 <  1911189 -   51.22% - 1.05 GB
Loading  1335571 <  1634986 <  1911189 -   52.02% - 1.07 GB
Loading  1335571 <  1639553 <  1911189 -   52.81% - 1.09 GB
Loading  1335571 <  1644120 <  1911189 -   53.60% - 1.10 GB
Loading  1335571 <  1648687 <  1911189 -   54.40% - 1.12 GB
Loading  1335571 <  1653254 <  1911189 -   55.19% - 1.13 GB
Loading  1335571 <  1657821 <  1911189 -   55.98% - 1.15 GB
Loading  1335571 <  1662388 <  1911189 -   56.78% - 1.17 GB
Loading  1335571 <  1666955 <  1911189 -   57.57% - 1.18 GB
Loading  1335571 <  1671522 <  1911189 -   58.36% - 1.20 GB
Loading  1335571 <  1676089 <  1911189 -   59.16% - 1.21 GB
Loading  1335571 <  1680656 <  1911189 -   59.95% - 1.23 GB
Loading  1335571 <  1685223 <  1911189 -   60.74% - 1.25 GB
Loading  1335571 <  1689790 <  1911189 -   61.54% - 1.26 GB
Loading  1335571 <  1694357 <  1911189 -   62.33% - 1.28 GB
Loading  1335571 <  1698924 <  1911189 -   63.12% - 1.30 GB
Loading  1335571 <  1703491 <  1911189 -   63.92% - 1.31 GB
Loading  1335571 <  1708058 <  1911189 -   64.71% - 1.33 GB
Loading  1335571 <  1712625 <  1911189 -   65.50% - 1.35 GB
Loading  1335571 <  1717192 <  1911189 -   66.30% - 1.36 GB
Loading  1335571 <  1721759 <  1911189 -   67.09% - 1.38 GB
Loading  1335571 <  1726326 <  1911189 -   67.88% - 1.39 GB
Loading  1335571 <  1730893 <  1911189 -   68.68% - 1.41 GB
Loading  1335571 <  1735460 <  1911189 -   69.47% - 1.43 GB
Loading  1335571 <  1740027 <  1911189 -   70.26% - 1.44 GB
Loading  1335571 <  1744594 <  1911189 -   71.06% - 1.46 GB
Loading  1335571 <  1749161 <  1911189 -   71.85% - 1.48 GB
Loading  1335571 <  1753728 <  1911189 -   72.64% - 1.49 GB
Loading  1335571 <  1758295 <  1911189 -   73.44% - 1.51 GB
Loading  1335571 <  1762862 <  1911189 -   74.23% - 1.52 GB
Loading  1335571 <  1767429 <  1911189 -   75.03% - 1.54 GB
Loading  1335571 <  1771996 <  1911189 -   75.82% - 1.56 GB
Loading  1335571 <  1776563 <  1911189 -   76.61% - 1.57 GB
Loading  1335571 <  1781130 <  1911189 -   77.41% - 1.59 GB
Loading  1335571 <  1785697 <  1911189 -   78.20% - 1.60 GB
Loading  1335571 <  1790264 <  1911189 -   78.99% - 1.62 GB
Loading  1335571 <  1794831 <  1911189 -   79.79% - 1.64 GB
Loading  1335571 <  1799398 <  1911189 -   80.58% - 1.65 GB
Loading  1335571 <  1803965 <  1911189 -   81.37% - 1.67 GB
Loading  1335571 <  1808532 <  1911189 -   82.17% - 1.69 GB
Loading  1335571 <  1813099 <  1911189 -   82.96% - 1.70 GB
Loading  1335571 <  1817666 <  1911189 -   83.75% - 1.72 GB
Loading  1335571 <  1822233 <  1911189 -   84.55% - 1.73 GB
Loading  1335571 <  1826800 <  1911189 -   85.34% - 1.75 GB
Loading  1335571 <  1831367 <  1911189 -   86.13% - 1.77 GB
Loading  1335571 <  1835934 <  1911189 -   86.93% - 1.78 GB
Loading  1335571 <  1840501 <  1911189 -   87.72% - 1.80 GB
Loading  1335571 <  1845068 <  1911189 -   88.51% - 1.81 GB
Loading  1335571 <  1849635 <  1911189 -   89.31% - 1.83 GB
Loading  1335571 <  1854202 <  1911189 -   90.10% - 1.85 GB
Loading  1335571 <  1858769 <  1911189 -   90.89% - 1.86 GB
Loading  1335571 <  1863336 <  1911189 -   91.69% - 1.88 GB
Loading  1335571 <  1867903 <  1911189 -   92.48% - 1.89 GB
Loading  1335571 <  1872470 <  1911189 -   93.27% - 1.91 GB
Loading  1335571 <  1877037 <  1911189 -   94.07% - 1.92 GB
Loading  1335571 <  1881604 <  1911189 -   94.86% - 1.94 GB
Loading  1335571 <  1886171 <  1911189 -   95.65% - 1.96 GB
Loading  1335571 <  1890738 <  1911189 -   96.45% - 1.97 GB
Loading  1335571 <  1895305 <  1911189 -   97.24% - 1.99 GB
Loading  1335571 <  1899872 <  1911189 -   98.03% - 2.00 GB
Loading  1335571 <  1904439 <  1911189 -   98.83% - 2.02 GB
Loading  1335571 <  1909006 <  1911189 -   99.62% - 2.04 GB
Loading  1335571 <  1911189 <  1911189 -  100.00% - 2.04 GB
Build_Hash_Index from 1881876 to 1911189

Found 29314 reads with length 320005357 to load; 0 skipped by being too short; 0 skipped per library restriction
String_Ct:           0/       29314  totalLen:        8697/   320005357  Hash_Entries:        6962/   281857228  Load: 0.00%
HASH LOADING STOPPED: curID         1911189 out of      1911189
HASH LOADING STOPPED: length      320005357 out of    320005357 max.
HASH LOADING STOPPED: entries     239681128 out of    281857228 max (load 68.03).

Read 390948 kmers to mark to skip

Range: 1335571-1911189.  Store has 1919461 reads.
Chunk: 5997 reads/thread -- (G.endRefID=1911189 - G.bgnRefID=1335571) / G.Num_PThreads=12 / 8

Starting 1335571-1911189 with 5997 per thread

Thread 00 processes reads 1335571-1341567
Thread 06 processes reads 1371553-1377549
Thread 04 processes reads 1359559-1365555
Thread 03 processes reads 1353562-1359558
Thread 02 processes reads 1347565-1353561
Thread 08 processes reads 1383547-1389543
Thread 10 processes reads 1395541-1401537
Thread 05 processes reads 1365556-1371552
Thread 09 processes reads 1389544-1395540
Thread 07 processes reads 1377550-1383546
Thread 11 processes reads 1401538-1407534
Thread 01 processes reads 1341568-1347564
Thread 05 writes    reads 1365556-1371552 (8310 overlaps 8310/4880847/0 kmer hits with/without overlap/skipped)
Thread 05 processes reads 1407535-1413531
Thread 07 writes    reads 1377550-1383546 (8145 overlaps 8145/4942632/0 kmer hits with/without overlap/skipped)
Thread 07 processes reads 1413532-1419528
Thread 01 writes    reads 1341568-1347564 (8132 overlaps 8132/4879816/0 kmer hits with/without overlap/skipped)
Thread 01 processes reads 1419529-1425525
Thread 02 writes    reads 1347565-1353561 (8079 overlaps 8079/4897256/0 kmer hits with/without overlap/skipped)
Thread 02 processes reads 1425526-1431522
Thread 08 writes    reads 1383547-1389543 (8305 overlaps 8305/4949791/0 kmer hits with/without overlap/skipped)
Thread 08 processes reads 1431523-1437519
Thread 10 writes    reads 1395541-1401537 (8233 overlaps 8233/4914756/0 kmer hits with/without overlap/skipped)
Thread 10 processes reads 1437520-1443516
Thread 03 writes    reads 1353562-1359558 (8258 overlaps 8258/4917783/0 kmer hits with/without overlap/skipped)
Thread 03 processes reads 1443517-1449513
Thread 00 writes    reads 1335571-1341567 (8112 overlaps 8112/4923062/0 kmer hits with/without overlap/skipped)
Thread 00 processes reads 1449514-1455510
Thread 11 writes    reads 1401538-1407534 (8096 overlaps 8096/4935199/0 kmer hits with/without overlap/skipped)
Thread 11 processes reads 1455511-1461507
Thread 04 writes    reads 1359559-1365555 (8186 overlaps 8186/4872632/0 kmer hits with/without overlap/skipped)
Thread 04 processes reads 1461508-1467504
Thread 06 writes    reads 1371553-1377549 (8028 overlaps 8028/4939864/0 kmer hits with/without overlap/skipped)
Thread 06 processes reads 1467505-1473501
Thread 09 writes    reads 1389544-1395540 (8184 overlaps 8184/4952636/0 kmer hits with/without overlap/skipped)
Thread 09 processes reads 1473502-1479498
=>> PBS: job killed: walltime 3643 exceeded limit 3600

I am guessing this is because the walltime limit is being exceeded. I submitted my original job script with a 6-hour time limit for the complete assembly pipeline.

skoren commented 3 months ago

The time for the initial job doesn't matter: canu runs by submitting array jobs, and then itself, to the grid, so the initial job quits as soon as it has submitted the array job. You should provide an increased runtime for canu to use for all the jobs it submits; I think you need gridOptions="-l walltime=24:00:00" or whatever runtime your partition allows. You can also add any other scheduler options this way, such as the partition, but not specific resources, as those are requested automatically by canu.
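As a rough sketch of the suggestion above, the original command from this issue could be rerun with the walltime request added. The 24-hour value is only the example from the comment, not a tested setting, and the variable names (`walltime`, `grid_options`, `canu_cmd`) are hypothetical helpers for illustration. The snippet only prints the command so it can be checked before anything is submitted:

```shell
# Example value only; use whatever limit your queue or partition allows.
walltime="24:00:00"

# Scheduler options that canu passes along to every job it submits.
grid_options="-l walltime=${walltime}"

# Dry run: build and print the command instead of launching it.
canu_cmd="~/bin/canu -assemble -p Nibea -d /home/CanuOutputs/Nibea/ genomeSize=615m gridOptions=\"${grid_options}\" -pacbio-hifi /home/SRAFiles/NibeaCoibor.fa"
echo "$canu_cmd"
```

Running the printed command in the same assembly directory should let canu pick up from the existing output rather than start over, assuming the earlier stages finished cleanly.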

skoren commented 2 months ago

Idle, missing grid time request.