Closed Oieswarya closed 4 months ago
This looks the same as #2013. The PBS scheduler behaves inconsistently between versions and we don't have access to a test system. It seems some of your overlap jobs failed and the retry then rejected the dependency, likely because multiple job arrays were submitted at the same time. The real question is why those jobs failed in the first place and whether the re-runs succeeded. Check the logs for these jobs (e.g. the unitigging/1-overlapper/000005* files) and look for an error message, or post them here. If they completed on the second try, you could just re-run the original command to let the assembly continue.
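A minimal sketch of one way to scan those logs, assuming the failure might be a scheduler walltime kill. The search string is the message PBS printed in the log posted in this thread; other PBS versions may word it differently, and other failure causes need a different pattern. The function name and the default directory are illustrative only.

```shell
# List log files in which PBS reported a walltime kill.
# Usage: find_walltime_kills [log_dir]
# The default directory is relative to the canu assembly directory (an assumption).
find_walltime_kills() {
  log_dir="${1:-unitigging/1-overlapper}"
  # -l prints only the names of matching files; errors for
  # subdirectories or unreadable files are discarded.
  grep -l 'PBS: job killed: walltime' "$log_dir"/* 2>/dev/null
}
```

Running `find_walltime_kills` from the assembly directory prints one path per affected log, and prints nothing if no log contains the message.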
Yes, I am indeed running these on PBS nodes.
I looked at the unitigging/1-overlapper/ folder. It contains 88 log files with names like 'utgovl_Nibea.o166456-88', and this one contains:
Found perl:
/usr/bin/perl
Found java:
/usr/bin/java
openjdk version "1.8.0_272"
Found canu:
/home/canu-2.2/bin/canu
canu 2.2
Running job 88 based on PBS_ARRAYID=88 and offset=0.
sqCache: found 1919461 corrected-compressed-trimmed reads with 21528575538 bases.
Initializing 12 work areas.
Loading reference reads 1335571-1911189 inclusive.
Loading 575619 reads and 6438438738 bases from range 1335571-1911189 inclusive.
Loading 1335571 < 1338131 < 1911189 - 0.44% - 0.01 GB
Loading 1335571 < 1342698 < 1911189 - 1.24% - 0.03 GB
Loading 1335571 < 1347265 < 1911189 - 2.03% - 0.04 GB
Loading 1335571 < 1351832 < 1911189 - 2.82% - 0.06 GB
Loading 1335571 < 1356399 < 1911189 - 3.62% - 0.07 GB
Loading 1335571 < 1360966 < 1911189 - 4.41% - 0.09 GB
Loading 1335571 < 1365533 < 1911189 - 5.21% - 0.11 GB
Loading 1335571 < 1370100 < 1911189 - 6.00% - 0.12 GB
Loading 1335571 < 1374667 < 1911189 - 6.79% - 0.14 GB
Loading 1335571 < 1379234 < 1911189 - 7.59% - 0.16 GB
Loading 1335571 < 1383801 < 1911189 - 8.38% - 0.17 GB
Loading 1335571 < 1388368 < 1911189 - 9.17% - 0.19 GB
Loading 1335571 < 1392935 < 1911189 - 9.97% - 0.21 GB
Loading 1335571 < 1397502 < 1911189 - 10.76% - 0.22 GB
Loading 1335571 < 1402069 < 1911189 - 11.55% - 0.24 GB
Loading 1335571 < 1406636 < 1911189 - 12.35% - 0.25 GB
Loading 1335571 < 1411203 < 1911189 - 13.14% - 0.27 GB
Loading 1335571 < 1415770 < 1911189 - 13.93% - 0.29 GB
Loading 1335571 < 1420337 < 1911189 - 14.73% - 0.30 GB
Loading 1335571 < 1424904 < 1911189 - 15.52% - 0.32 GB
Loading 1335571 < 1429471 < 1911189 - 16.31% - 0.34 GB
Loading 1335571 < 1434038 < 1911189 - 17.11% - 0.35 GB
Loading 1335571 < 1438605 < 1911189 - 17.90% - 0.37 GB
Loading 1335571 < 1443172 < 1911189 - 18.69% - 0.38 GB
Loading 1335571 < 1447739 < 1911189 - 19.49% - 0.40 GB
Loading 1335571 < 1452306 < 1911189 - 20.28% - 0.42 GB
Loading 1335571 < 1456873 < 1911189 - 21.07% - 0.43 GB
Loading 1335571 < 1461440 < 1911189 - 21.87% - 0.45 GB
Loading 1335571 < 1466007 < 1911189 - 22.66% - 0.47 GB
Loading 1335571 < 1470574 < 1911189 - 23.45% - 0.48 GB
Loading 1335571 < 1475141 < 1911189 - 24.25% - 0.50 GB
Loading 1335571 < 1479708 < 1911189 - 25.04% - 0.52 GB
Loading 1335571 < 1484275 < 1911189 - 25.83% - 0.53 GB
Loading 1335571 < 1488842 < 1911189 - 26.63% - 0.55 GB
Loading 1335571 < 1493409 < 1911189 - 27.42% - 0.56 GB
Loading 1335571 < 1497976 < 1911189 - 28.21% - 0.58 GB
Loading 1335571 < 1502543 < 1911189 - 29.01% - 0.60 GB
Loading 1335571 < 1507110 < 1911189 - 29.80% - 0.61 GB
Loading 1335571 < 1511677 < 1911189 - 30.59% - 0.63 GB
Loading 1335571 < 1516244 < 1911189 - 31.39% - 0.65 GB
Loading 1335571 < 1520811 < 1911189 - 32.18% - 0.66 GB
Loading 1335571 < 1525378 < 1911189 - 32.97% - 0.68 GB
Loading 1335571 < 1529945 < 1911189 - 33.77% - 0.69 GB
Loading 1335571 < 1534512 < 1911189 - 34.56% - 0.71 GB
Loading 1335571 < 1539079 < 1911189 - 35.35% - 0.73 GB
Loading 1335571 < 1543646 < 1911189 - 36.15% - 0.74 GB
Loading 1335571 < 1548213 < 1911189 - 36.94% - 0.76 GB
Loading 1335571 < 1552780 < 1911189 - 37.73% - 0.78 GB
Loading 1335571 < 1557347 < 1911189 - 38.53% - 0.79 GB
Loading 1335571 < 1561914 < 1911189 - 39.32% - 0.81 GB
Loading 1335571 < 1566481 < 1911189 - 40.12% - 0.83 GB
Loading 1335571 < 1571048 < 1911189 - 40.91% - 0.84 GB
Loading 1335571 < 1575615 < 1911189 - 41.70% - 0.86 GB
Loading 1335571 < 1580182 < 1911189 - 42.50% - 0.87 GB
Loading 1335571 < 1584749 < 1911189 - 43.29% - 0.89 GB
Loading 1335571 < 1589316 < 1911189 - 44.08% - 0.91 GB
Loading 1335571 < 1593883 < 1911189 - 44.88% - 0.92 GB
Loading 1335571 < 1598450 < 1911189 - 45.67% - 0.94 GB
Loading 1335571 < 1603017 < 1911189 - 46.46% - 0.96 GB
Loading 1335571 < 1607584 < 1911189 - 47.26% - 0.97 GB
Loading 1335571 < 1612151 < 1911189 - 48.05% - 0.99 GB
Loading 1335571 < 1616718 < 1911189 - 48.84% - 1.00 GB
Loading 1335571 < 1621285 < 1911189 - 49.64% - 1.02 GB
Loading 1335571 < 1625852 < 1911189 - 50.43% - 1.04 GB
Loading 1335571 < 1630419 < 1911189 - 51.22% - 1.05 GB
Loading 1335571 < 1634986 < 1911189 - 52.02% - 1.07 GB
Loading 1335571 < 1639553 < 1911189 - 52.81% - 1.09 GB
Loading 1335571 < 1644120 < 1911189 - 53.60% - 1.10 GB
Loading 1335571 < 1648687 < 1911189 - 54.40% - 1.12 GB
Loading 1335571 < 1653254 < 1911189 - 55.19% - 1.13 GB
Loading 1335571 < 1657821 < 1911189 - 55.98% - 1.15 GB
Loading 1335571 < 1662388 < 1911189 - 56.78% - 1.17 GB
Loading 1335571 < 1666955 < 1911189 - 57.57% - 1.18 GB
Loading 1335571 < 1671522 < 1911189 - 58.36% - 1.20 GB
Loading 1335571 < 1676089 < 1911189 - 59.16% - 1.21 GB
Loading 1335571 < 1680656 < 1911189 - 59.95% - 1.23 GB
Loading 1335571 < 1685223 < 1911189 - 60.74% - 1.25 GB
Loading 1335571 < 1689790 < 1911189 - 61.54% - 1.26 GB
Loading 1335571 < 1694357 < 1911189 - 62.33% - 1.28 GB
Loading 1335571 < 1698924 < 1911189 - 63.12% - 1.30 GB
Loading 1335571 < 1703491 < 1911189 - 63.92% - 1.31 GB
Loading 1335571 < 1708058 < 1911189 - 64.71% - 1.33 GB
Loading 1335571 < 1712625 < 1911189 - 65.50% - 1.35 GB
Loading 1335571 < 1717192 < 1911189 - 66.30% - 1.36 GB
Loading 1335571 < 1721759 < 1911189 - 67.09% - 1.38 GB
Loading 1335571 < 1726326 < 1911189 - 67.88% - 1.39 GB
Loading 1335571 < 1730893 < 1911189 - 68.68% - 1.41 GB
Loading 1335571 < 1735460 < 1911189 - 69.47% - 1.43 GB
Loading 1335571 < 1740027 < 1911189 - 70.26% - 1.44 GB
Loading 1335571 < 1744594 < 1911189 - 71.06% - 1.46 GB
Loading 1335571 < 1749161 < 1911189 - 71.85% - 1.48 GB
Loading 1335571 < 1753728 < 1911189 - 72.64% - 1.49 GB
Loading 1335571 < 1758295 < 1911189 - 73.44% - 1.51 GB
Loading 1335571 < 1762862 < 1911189 - 74.23% - 1.52 GB
Loading 1335571 < 1767429 < 1911189 - 75.03% - 1.54 GB
Loading 1335571 < 1771996 < 1911189 - 75.82% - 1.56 GB
Loading 1335571 < 1776563 < 1911189 - 76.61% - 1.57 GB
Loading 1335571 < 1781130 < 1911189 - 77.41% - 1.59 GB
Loading 1335571 < 1785697 < 1911189 - 78.20% - 1.60 GB
Loading 1335571 < 1790264 < 1911189 - 78.99% - 1.62 GB
Loading 1335571 < 1794831 < 1911189 - 79.79% - 1.64 GB
Loading 1335571 < 1799398 < 1911189 - 80.58% - 1.65 GB
Loading 1335571 < 1803965 < 1911189 - 81.37% - 1.67 GB
Loading 1335571 < 1808532 < 1911189 - 82.17% - 1.69 GB
Loading 1335571 < 1813099 < 1911189 - 82.96% - 1.70 GB
Loading 1335571 < 1817666 < 1911189 - 83.75% - 1.72 GB
Loading 1335571 < 1822233 < 1911189 - 84.55% - 1.73 GB
Loading 1335571 < 1826800 < 1911189 - 85.34% - 1.75 GB
Loading 1335571 < 1831367 < 1911189 - 86.13% - 1.77 GB
Loading 1335571 < 1835934 < 1911189 - 86.93% - 1.78 GB
Loading 1335571 < 1840501 < 1911189 - 87.72% - 1.80 GB
Loading 1335571 < 1845068 < 1911189 - 88.51% - 1.81 GB
Loading 1335571 < 1849635 < 1911189 - 89.31% - 1.83 GB
Loading 1335571 < 1854202 < 1911189 - 90.10% - 1.85 GB
Loading 1335571 < 1858769 < 1911189 - 90.89% - 1.86 GB
Loading 1335571 < 1863336 < 1911189 - 91.69% - 1.88 GB
Loading 1335571 < 1867903 < 1911189 - 92.48% - 1.89 GB
Loading 1335571 < 1872470 < 1911189 - 93.27% - 1.91 GB
Loading 1335571 < 1877037 < 1911189 - 94.07% - 1.92 GB
Loading 1335571 < 1881604 < 1911189 - 94.86% - 1.94 GB
Loading 1335571 < 1886171 < 1911189 - 95.65% - 1.96 GB
Loading 1335571 < 1890738 < 1911189 - 96.45% - 1.97 GB
Loading 1335571 < 1895305 < 1911189 - 97.24% - 1.99 GB
Loading 1335571 < 1899872 < 1911189 - 98.03% - 2.00 GB
Loading 1335571 < 1904439 < 1911189 - 98.83% - 2.02 GB
Loading 1335571 < 1909006 < 1911189 - 99.62% - 2.04 GB
Loading 1335571 < 1911189 < 1911189 - 100.00% - 2.04 GB
Build_Hash_Index from 1881876 to 1911189
Found 29314 reads with length 320005357 to load; 0 skipped by being too short; 0 skipped per library restriction
String_Ct: 0/ 29314 totalLen: 8697/ 320005357 Hash_Entries: 6962/ 281857228 Load: 0.00%
HASH LOADING STOPPED: curID 1911189 out of 1911189
HASH LOADING STOPPED: length 320005357 out of 320005357 max.
HASH LOADING STOPPED: entries 239681128 out of 281857228 max (load 68.03).
Read 390948 kmers to mark to skip
Range: 1335571-1911189. Store has 1919461 reads.
Chunk: 5997 reads/thread -- (G.endRefID=1911189 - G.bgnRefID=1335571) / G.Num_PThreads=12 / 8
Starting 1335571-1911189 with 5997 per thread
Thread 00 processes reads 1335571-1341567
Thread 06 processes reads 1371553-1377549
Thread 04 processes reads 1359559-1365555
Thread 03 processes reads 1353562-1359558
Thread 02 processes reads 1347565-1353561
Thread 08 processes reads 1383547-1389543
Thread 10 processes reads 1395541-1401537
Thread 05 processes reads 1365556-1371552
Thread 09 processes reads 1389544-1395540
Thread 07 processes reads 1377550-1383546
Thread 11 processes reads 1401538-1407534
Thread 01 processes reads 1341568-1347564
Thread 05 writes reads 1365556-1371552 (8310 overlaps 8310/4880847/0 kmer hits with/without overlap/skipped)
Thread 05 processes reads 1407535-1413531
Thread 07 writes reads 1377550-1383546 (8145 overlaps 8145/4942632/0 kmer hits with/without overlap/skipped)
Thread 07 processes reads 1413532-1419528
Thread 01 writes reads 1341568-1347564 (8132 overlaps 8132/4879816/0 kmer hits with/without overlap/skipped)
Thread 01 processes reads 1419529-1425525
Thread 02 writes reads 1347565-1353561 (8079 overlaps 8079/4897256/0 kmer hits with/without overlap/skipped)
Thread 02 processes reads 1425526-1431522
Thread 08 writes reads 1383547-1389543 (8305 overlaps 8305/4949791/0 kmer hits with/without overlap/skipped)
Thread 08 processes reads 1431523-1437519
Thread 10 writes reads 1395541-1401537 (8233 overlaps 8233/4914756/0 kmer hits with/without overlap/skipped)
Thread 10 processes reads 1437520-1443516
Thread 03 writes reads 1353562-1359558 (8258 overlaps 8258/4917783/0 kmer hits with/without overlap/skipped)
Thread 03 processes reads 1443517-1449513
Thread 00 writes reads 1335571-1341567 (8112 overlaps 8112/4923062/0 kmer hits with/without overlap/skipped)
Thread 00 processes reads 1449514-1455510
Thread 11 writes reads 1401538-1407534 (8096 overlaps 8096/4935199/0 kmer hits with/without overlap/skipped)
Thread 11 processes reads 1455511-1461507
Thread 04 writes reads 1359559-1365555 (8186 overlaps 8186/4872632/0 kmer hits with/without overlap/skipped)
Thread 04 processes reads 1461508-1467504
Thread 06 writes reads 1371553-1377549 (8028 overlaps 8028/4939864/0 kmer hits with/without overlap/skipped)
Thread 06 processes reads 1467505-1473501
Thread 09 writes reads 1389544-1395540 (8184 overlaps 8184/4952636/0 kmer hits with/without overlap/skipped)
Thread 09 processes reads 1473502-1479498
=>> PBS: job killed: walltime 3643 exceeded limit 3600
I am guessing this is because the walltime limit was exceeded. I submitted my original job script with 6 hours of runtime for the complete assembly pipeline.
The time for the initial job doesn't matter: canu runs by submitting array jobs, and then itself, to the grid, so the initial job quits as soon as it submits the array job. You should provide an increased runtime for canu to use for all jobs it submits. I think you need gridOptions="-l walltime=24:00:00" or whatever runtime your partition allows. You can also add any other scheduler options this way, such as the partition, but not specific resources, as those are requested automatically by canu.
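As a rough sketch, the original command from this thread with the walltime request added could look like the following. `run_canu`, `CANU`, and `DRY_RUN` are hypothetical helper names introduced here for illustration, not canu options; the paths and the 24-hour value are the ones quoted in this thread and should be adjusted to your system and queue limits.

```shell
# Sketch: re-run the original command with a walltime request for every job canu submits.
# CANU and DRY_RUN are hypothetical helper variables, not part of canu.
run_canu() {
  walltime="${1:-24:00:00}"   # value suggested in the thread; use what your queue allows
  # With DRY_RUN set, the command is only printed (without shell quoting), not executed.
  ${DRY_RUN:+echo} "${CANU:-$HOME/bin/canu}" -assemble \
    -p Nibea -d /home/CanuOutputs/Nibea/ \
    genomeSize=615m \
    gridOptions="-l walltime=${walltime}" \
    -pacbio-hifi /home/SRAFiles/NibeaCoibor.fa
}
```

Calling `DRY_RUN=1 run_canu 12:00:00` prints the command for review. Calling `run_canu` with no arguments launches canu with a 24-hour request.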
Idle, missing grid time request.
I am trying to use HiCanu on some real-world HiFi data that I downloaded from NCBI, but for both datasets it gives me no output. I have already used HiCanu successfully on my simulated data (all with 10x coverage).
This is the command I used for running the real world data: ~/bin/canu -assemble -p Nibea -d /home/CanuOutputs/Nibea/ genomeSize=615m -pacbio-hifi /home/SRAFiles/NibeaCoibor.fa
This is my job.e log file:
Is it a memory issue or something else?