GeoMicroSoares closed this issue 5 years ago.
That option is attempting to request memory from SGE; the option should be something like "-l memory=4g", but you're missing the name, somehow.
Near the top of the canu screen output (just after the CITATIONS) canu will report what it thinks your grid is using, something like:
-- Detected Sun Grid Engine in '/usr/local/sge/grid'.
-- Detected Sun Grid Engine parallel environment 'thread'.
-- Detected Sun Grid Engine memory resource 'memory'.
The 'memory' name is coming from the output of qconf -sc:
> qconf -sc | grep MEMORY
[...]
mem_used mu MEMORY >= YES NO 0 0
memory memory MEMORY <= YES YES 0 0
[...]
(the line with YES in the 6th column is used).
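For example (a sketch only; substitute whatever names your own detection lines and qconf -sc report), the detected parallel environment and memory resource names plug into the resource option like this:
# hypothetical values based on the detection lines above; canu substitutes the
# literal THREADS and MEMORY placeholders per job at submit time
gridEngineResourceOption="-pe thread THREADS -l memory=MEMORY"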
What do your canu and qconf -sc report?
Hi @brianwalenz,
Yep, things run after checking that out and changing the command to:
./canu/Linux-amd64/bin/canu -p metagenome -d /scratch/metagenome_folder genomeSize=1g -nanopore-raw metagenome_folder/metagenome.fastq java='/cm/shared/apps/java/jdk1.8.0_40/bin/java' gridEngineResourceOption="-pe multithread THREADS -l mem_free=MEMORY"
The job gets submitted, but about 5 minutes later it crashes. Checking it out, I get a Java-related error:
Complete documentation at http://canu.readthedocs.org/en/latest/
ERROR: mhap overlapper requires java version at least 1.8.0; you have unknown (from '/cm/shared/apps/java/jdk1.8.0_40/bin/java').
ERROR: '/cm/shared/apps/java/jdk1.8.0_40/bin/java -Xmx1g -showversion' reports:
ERROR: 'Error occurred during initialization of VM'
ERROR: 'Unable to allocate 32768KB bitmaps for parallel garbage collection for the requested 1048576KB heap.'
ERROR: 'Error: Could not create the Java Virtual Machine.'
ERROR: 'Error: A fatal exception has occurred. Program will exit.'
Testing the java version:
$ /cm/shared/apps/java/jdk1.8.0_40/bin/java -version
java version "1.8.0_40"
Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
Not sure what's going on here, since this happens even after I point canu at the right java and it is the correct version...
Hi again,
Do let me know if you'd rather I open another issue for this, but I managed to get past that hurdle and am now getting Mhap precompute job failures with:
./canu/Linux-amd64/bin/canu -p metagenome -d /scratch/metagenome_folder genomeSize=0.5g -nanopore-raw metagenome_folder/metagenome.fastq java="/cm/shared/apps/java/jdk1.8.0_40/bin/java" gridEngineResourceOption="-pe multithread THREADS -l h_vmem=MEMORY" merylMemory=10 merylThreads=1 cormhapMemory=10 cormhapThreads=1 gridOptionsExecutive="-l h_vmem=100g" gridEngineArrayMaxJobs=5
Bottom of error log here:
$ tail /scratch/metagenome_folder/canu.out -n100
-- WARNING: If jobs fail to start, modify the above option to use a valid shell
-- WARNING: and supply it directly to canu.
-- WARNING:
--
-- Found 1 host with 32 cores and 504 GB memory under Sun Grid Engine control.
-- Found 3 hosts with 64 cores and 252 GB memory under Sun Grid Engine control.
-- Found 1 host with 16 cores and 1009 GB memory under Sun Grid Engine control.
-- Found 2 hosts with 8 cores and 189 GB memory under Sun Grid Engine control.
-- Found 1 host with 64 cores and 236 GB memory under Sun Grid Engine control.
-- Found 1 host with 8 cores and 23 GB memory under Sun Grid Engine control.
-- Found 3 hosts with 32 cores and 94 GB memory under Sun Grid Engine control.
-- Found 1 host with 64 cores and 504 GB memory under Sun Grid Engine control.
--
-- (tag)Threads
-- (tag)Memory |
-- (tag) | | algorithm
-- ------- ------ -------- -----------------------------
-- Grid: meryl 10 GB 1 CPU (k-mer counting)
-- Grid: hap 12 GB 8 CPUs (read-to-haplotype assignment)
-- Grid: cormhap 10 GB 1 CPU (overlap detection with mhap)
-- Grid: obtovl 16 GB 8 CPUs (overlap detection)
-- Grid: utgovl 16 GB 8 CPUs (overlap detection)
-- Grid: cor --- GB 4 CPUs (read correction)
-- Grid: ovb 4 GB 1 CPU (overlap store bucketizer)
-- Grid: ovs 16 GB 1 CPU (overlap store sorting)
-- Grid: red 12 GB 8 CPUs (read error detection)
-- Grid: oea 4 GB 1 CPU (overlap error adjustment)
-- Grid: bat 256 GB 16 CPUs (contig construction with bogart)
-- Grid: cns --- GB 8 CPUs (consensus)
-- Grid: gfa 16 GB 16 CPUs (GFA alignment and processing)
--
-- In 'metagenome.seqStore', found Nanopore reads:
-- Raw: 459052
-- Corrected: 0
-- Trimmed: 0
--
-- Generating assembly 'metagenome' in '/scratch/metagenome_folder'
--
-- Parameters:
--
-- genomeSize 500000000
--
-- Overlap Generation Limits:
-- corOvlErrorRate 0.3200 ( 32.00%)
-- obtOvlErrorRate 0.1200 ( 12.00%)
-- utgOvlErrorRate 0.1200 ( 12.00%)
--
-- Overlap Processing Limits:
-- corErrorRate 0.5000 ( 50.00%)
-- obtErrorRate 0.1200 ( 12.00%)
-- utgErrorRate 0.1200 ( 12.00%)
-- cnsErrorRate 0.2000 ( 20.00%)
--
--
-- BEGIN CORRECTION
--
--
-- Mhap precompute jobs failed, tried 2 times, giving up.
-- job correction/1-overlapper/blocks/000001.dat FAILED.
-- job correction/1-overlapper/blocks/000002.dat FAILED.
-- job correction/1-overlapper/blocks/000003.dat FAILED.
-- job correction/1-overlapper/blocks/000004.dat FAILED.
-- job correction/1-overlapper/blocks/000005.dat FAILED.
-- job correction/1-overlapper/blocks/000006.dat FAILED.
-- job correction/1-overlapper/blocks/000007.dat FAILED.
-- job correction/1-overlapper/blocks/000008.dat FAILED.
-- job correction/1-overlapper/blocks/000009.dat FAILED.
-- job correction/1-overlapper/blocks/000010.dat FAILED.
-- job correction/1-overlapper/blocks/000011.dat FAILED.
-- job correction/1-overlapper/blocks/000012.dat FAILED.
-- job correction/1-overlapper/blocks/000013.dat FAILED.
-- job correction/1-overlapper/blocks/000014.dat FAILED.
-- job correction/1-overlapper/blocks/000015.dat FAILED.
-- job correction/1-overlapper/blocks/000016.dat FAILED.
-- job correction/1-overlapper/blocks/000017.dat FAILED.
-- job correction/1-overlapper/blocks/000018.dat FAILED.
-- job correction/1-overlapper/blocks/000019.dat FAILED.
-- job correction/1-overlapper/blocks/000020.dat FAILED.
-- job correction/1-overlapper/blocks/000021.dat FAILED.
-- job correction/1-overlapper/blocks/000022.dat FAILED.
-- job correction/1-overlapper/blocks/000023.dat FAILED.
-- job correction/1-overlapper/blocks/000024.dat FAILED.
-- job correction/1-overlapper/blocks/000025.dat FAILED.
-- job correction/1-overlapper/blocks/000026.dat FAILED.
-- job correction/1-overlapper/blocks/000027.dat FAILED.
-- job correction/1-overlapper/blocks/000028.dat FAILED.
-- job correction/1-overlapper/blocks/000029.dat FAILED.
-- job correction/1-overlapper/blocks/000030.dat FAILED.
-- job correction/1-overlapper/blocks/000031.dat FAILED.
-- job correction/1-overlapper/blocks/000032.dat FAILED.
-- job correction/1-overlapper/blocks/000033.dat FAILED.
-- job correction/1-overlapper/blocks/000034.dat FAILED.
-- job correction/1-overlapper/blocks/000035.dat FAILED.
--
ABORT:
ABORT: Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
Tried re-running and tweaking the command based on previous issues, but without success. Help?
I'm going to guess it's still a java problem. Look in the correction/1-overlapper/mhap*out for errors.
It could also be the 'start method' that the logging is complaining about at the start of your paste. Two lines before that is a suggestion to set gridOptions="-S /bin/bash" (I'm guessing on '/bin/bash'); maybe we're picking an invalid shell.
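One way to check what shell the queue is actually configured with (a sketch; substitute your queue name):
$ qconf -sq <your_queue> | grep shell
If the shell listed there doesn't exist on the compute nodes, forcing -S /bin/bash through gridOptions should work around it.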
Ok then, from cat /ibers/ernie/scratch/correction/1-overlapper/precompute.1.out I'm getting mixed messages. In some instances I get java failure errors like this:
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 1 based on SGE_TASK_ID=1 and offset=0.
Dumping reads from 1 to 13500 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Whilst in other instances it seems java is found but the VM still can't be initialized.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Found perl:
on "1.8.0_40"
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linu
Found perl:
./lib/site_perl/canu/Grid_Cloud.pm line 74.
/usr/bin/perl
v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Found java:
ased on SGE_TASK_ID=1 and offset=30.
Dumping reads from 405001 to 418500 (inclusiv java version "1.8.0_40"
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
enough space for 9437184KB object heap
Mhap failed.
java version "1.8.0_40"
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 6 based on SGE_TASK_ID=1 and offset=5.
Dumping reads from 67501 to 81000 (inclusive).
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 1 based on SGE_TASK_ID=1 and offset=0.
Dumping reads from 1 to 13500 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
In all cases it looks like your JVM is failing to start up. It is trying to reserve 10gb of memory and failing; that is what you configured it to use via cormhapMemory=10 cormhapThreads=1. The version check explicitly requests only 1gb of memory and it still seems to fail on your nodes.
I noticed you also switched from mem_free to h_vmem, why is that? h_vmem is usually not the right option as it doesn't hold the memory (qconf -sc will say it is not consumable). What does your qconf -sc report for mem_free and h_vmem? Using a non-consumable resource will let too many jobs get scheduled on a single node and overload its memory. How much memory does the node the precompute jobs land on have? How many jobs are getting scheduled on each node simultaneously?
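For example (a sketch), the consumable column for both resources can be pulled out with:
$ qconf -sc | egrep 'mem_free|h_vmem'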
Hi @skoren ,
The switch from mem_free to h_vmem was due to a re-analysis of the qconf -sc output. As you see below, h_vmem is the consumable option.
$ qconf -sc
#name shortcut type relop requestable consumable default urgency
#----------------------------------------------------------------------------------------
...
h_vmem h_vmem MEMORY <= YES JOB 2G 0
mem_free mf MEMORY <= YES NO 0 0
...
# >#< starts a comment but comments are not saved across edits --------
Not too sure if I can answer the rest of your questions properly, but when canu starts running I can see the following:
$ qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
3192050 0.58333 cormhap_mo ans74 r 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 1
3192050 0.54167 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 2
3192050 0.52778 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 3
3192050 0.52083 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 4
3192050 0.51667 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 5
3192050 0.51389 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 6
3192050 0.51190 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 7
3192050 0.51042 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 8
3192050 0.50926 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 9
3192050 0.50833 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 10
3192051 0.50758 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 1
3192051 0.50694 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 2
3192051 0.50641 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 3
3192051 0.50595 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 4
3192051 0.50556 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 5
3192051 0.50521 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 6
3192051 0.50490 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node009.cm.cluster 1 7
3192051 0.50463 cormhap_mo ans74 r 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 8
3192051 0.50439 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 9
3192051 0.50417 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 10
3192052 0.50397 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 1
3192052 0.50379 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 2
3192052 0.50362 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 3
3192052 0.50347 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 4
3192052 0.50333 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 5
3192052 0.50321 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 6
3192052 0.50309 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 7
3192052 0.50298 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 8
3192052 0.50287 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 9
3192052 0.50278 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 10
3192053 0.50269 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 1
3192053 0.50260 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 2
3192053 0.50253 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 3
3192053 0.50245 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 4
3192053 0.50238 cormhap_mo ans74 t 03/12/2019 17:20:17 amd.q@node008.cm.cluster 1 5
3192054 0.00000 canu_moncl ans74 hqw 03/12/2019 17:20:03 1
And these nodes have:
node008 (amd.q) - 64 cores, 4 x AMD Opteron(tm) Processor 6376, 256 GB RAM
node009 (amd.q) - 64 cores, 4 x AMD Opteron(tm) Processor 6376, 256 GB RAM
Hope this helps!
I vaguely remember encountering something like this when I was trying to run many many little assemblies at the same time. I think that even though, like you've done, I was within the memory limits of the machine, I couldn't get java to initialize.
As a test, try submitting only one or two jobs to the grid. The submit commands are saved in the *jobSubmit*sh scripts; just change the -t parameter to a single digit, or 1-2, or something like that.
If that works, change src/pipelines/canu/Execution.pm at line 908 from:
if (uc(getGlobal("gridEngine")) eq "SGE") {
$m /= $t;
}
to
#if (uc(getGlobal("gridEngine")) eq "SGE") {
# $m /= $t;
#}
This changes the way canu requests memory, from 'memory per slot' (a consumable of YES) to 'memory per job' (a consumable of JOB, which your grid is using). You can now increase mhapThreads to 8, which will limit you to at most 8 jobs per machine, instead of the ~20 you're getting now.
The other thing that could be happening is that other jobs on the same nodes are sucking up all the memory without accounting for it. Increasing the thread count will help here too, but just because it reduces the number of jobs per machine.
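As a rough back-of-envelope (assuming the 64-core amd.q nodes listed above):
# 64 cores / 8 threads per job = at most 8 concurrent mhap jobs per node
$ echo $((64 / 8))
8
compared with the ~20 single-slot jobs per node you're getting now.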
Hi @brianwalenz
Ran ./canu/Linux-amd64/bin/canu -p metagenomes -d /scratch/ genomeSize=0.5g -nanopore-raw metagenomes_folder/metagenomes.fastq java="/cm/shared/apps/java/jdk1.8.0_40/bin/java" gridEngineResourceOption="-pe multithread THREADS -l h_vmem=MEMORY" merylMemory=10 merylThreads=1 cormhapMemory=10 cormhapThreads=1 mhapThreads=8 gridOptionsExecutive="-l h_vmem=100g" gridEngineArrayMaxJobs=5 gridOptions="-S /bin/bash"
And still got the same error:
...
-- Mhap precompute jobs failed, tried 2 times, giving up.
-- job correction/1-overlapper/blocks/000001.dat FAILED.
-- job correction/1-overlapper/blocks/000002.dat FAILED.
-- job correction/1-overlapper/blocks/000003.dat FAILED.
-- job correction/1-overlapper/blocks/000004.dat FAILED.
-- job correction/1-overlapper/blocks/000005.dat FAILED.
-- job correction/1-overlapper/blocks/000006.dat FAILED.
-- job correction/1-overlapper/blocks/000007.dat FAILED.
-- job correction/1-overlapper/blocks/000008.dat FAILED.
-- job correction/1-overlapper/blocks/000009.dat FAILED.
-- job correction/1-overlapper/blocks/000010.dat FAILED.
-- job correction/1-overlapper/blocks/000011.dat FAILED.
-- job correction/1-overlapper/blocks/000012.dat FAILED.
-- job correction/1-overlapper/blocks/000013.dat FAILED.
-- job correction/1-overlapper/blocks/000014.dat FAILED.
-- job correction/1-overlapper/blocks/000015.dat FAILED.
-- job correction/1-overlapper/blocks/000016.dat FAILED.
-- job correction/1-overlapper/blocks/000017.dat FAILED.
-- job correction/1-overlapper/blocks/000018.dat FAILED.
-- job correction/1-overlapper/blocks/000019.dat FAILED.
-- job correction/1-overlapper/blocks/000020.dat FAILED.
-- job correction/1-overlapper/blocks/000021.dat FAILED.
-- job correction/1-overlapper/blocks/000022.dat FAILED.
-- job correction/1-overlapper/blocks/000023.dat FAILED.
-- job correction/1-overlapper/blocks/000024.dat FAILED.
-- job correction/1-overlapper/blocks/000025.dat FAILED.
-- job correction/1-overlapper/blocks/000026.dat FAILED.
-- job correction/1-overlapper/blocks/000027.dat FAILED.
-- job correction/1-overlapper/blocks/000028.dat FAILED.
-- job correction/1-overlapper/blocks/000029.dat FAILED.
-- job correction/1-overlapper/blocks/000030.dat FAILED.
-- job correction/1-overlapper/blocks/000031.dat FAILED.
-- job correction/1-overlapper/blocks/000032.dat FAILED.
-- job correction/1-overlapper/blocks/000033.dat FAILED.
-- job correction/1-overlapper/blocks/000034.dat FAILED.
-- job correction/1-overlapper/blocks/000035.dat FAILED.
--
ABORT:
ABORT: Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
I don't think the jobs picked up the new parameters; remove the 1-overlapper folder completely and re-run the command. If it still fails, post the precompute.sh and precompute.jobSubmit-01.sh files.
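Something like this (a sketch; adjust the path to your run's -d directory):
$ rm -rf /ibers/ernie/scratch/ans74/correction/1-overlapper
then re-run the exact same canu command so the scripts are regenerated with the new options.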
Hi again @skoren, it did fail again; here are the files you asked for:
$ cat /ibers/ernie/scratch/ans74/correction/1-overlapper/precompute.sh
#!/bin/sh
# Path to Canu.
syst=`uname -s`
arch=`uname -m | sed s/x86_64/amd64/`
bin="/ibers/ernie/home/ans74/canu/$syst-$arch/bin"
if [ ! -d "$bin" ] ; then
bin="/ibers/ernie/home/ans74/canu"
fi
# Report paths.
echo ""
echo "Found perl:"
echo " " `which perl`
echo " " `perl --version | grep version`
echo ""
echo "Found java:"
echo " " `which /cm/shared/apps/java/jdk1.8.0_40/bin/java`
echo " " `/cm/shared/apps/java/jdk1.8.0_40/bin/java -showversion 2>&1 | head -n 1`
echo ""
echo "Found canu:"
echo " " $bin/canu
echo " " `$bin/canu -version`
echo ""
# Environment for any object storage.
export CANU_OBJECT_STORE_CLIENT=
export CANU_OBJECT_STORE_CLIENT_UA=
export CANU_OBJECT_STORE_CLIENT_DA=
export CANU_OBJECT_STORE_NAMESPACE=moncler
export CANU_OBJECT_STORE_PROJECT=
# Discover the job ID to run, from either a grid environment variable and a
# command line offset, or directly from the command line.
#
if [ x$SGE_TASK_ID = x -o x$SGE_TASK_ID = xundefined -o x$SGE_TASK_ID = x0 ]; then
baseid=$1
offset=0
else
baseid=$SGE_TASK_ID
offset=$1
fi
if [ x$offset = x ]; then
offset=0
fi
if [ x$baseid = x ]; then
echo Error: I need SGE_TASK_ID set, or a job index on the command line.
exit
fi
jobid=`expr $baseid + $offset`
if [ x$SGE_TASK_ID = x ]; then
echo Running job $jobid based on command line options.
else
echo Running job $jobid based on SGE_TASK_ID=$SGE_TASK_ID and offset=$offset.
fi
if [ $jobid -eq 1 ] ; then
rge="-b 1 -e 13500"
job="000001"
fi
if [ $jobid -eq 2 ] ; then
rge="-b 13501 -e 27000"
job="000002"
fi
if [ $jobid -eq 3 ] ; then
rge="-b 27001 -e 40500"
job="000003"
fi
if [ $jobid -eq 4 ] ; then
rge="-b 40501 -e 54000"
job="000004"
fi
if [ $jobid -eq 5 ] ; then
rge="-b 54001 -e 67500"
job="000005"
fi
if [ $jobid -eq 6 ] ; then
rge="-b 67501 -e 81000"
job="000006"
fi
if [ $jobid -eq 7 ] ; then
rge="-b 81001 -e 94500"
job="000007"
fi
if [ $jobid -eq 8 ] ; then
rge="-b 94501 -e 108000"
job="000008"
fi
if [ $jobid -eq 9 ] ; then
rge="-b 108001 -e 121500"
job="000009"
fi
if [ $jobid -eq 10 ] ; then
rge="-b 121501 -e 135000"
job="000010"
fi
if [ $jobid -eq 11 ] ; then
rge="-b 135001 -e 148500"
job="000011"
fi
if [ $jobid -eq 12 ] ; then
rge="-b 148501 -e 162000"
job="000012"
fi
if [ $jobid -eq 13 ] ; then
rge="-b 162001 -e 175500"
job="000013"
fi
if [ $jobid -eq 14 ] ; then
rge="-b 175501 -e 189000"
job="000014"
fi
if [ $jobid -eq 15 ] ; then
rge="-b 189001 -e 202500"
job="000015"
fi
if [ $jobid -eq 16 ] ; then
rge="-b 202501 -e 216000"
job="000016"
fi
if [ $jobid -eq 17 ] ; then
rge="-b 216001 -e 229500"
job="000017"
fi
if [ $jobid -eq 18 ] ; then
rge="-b 229501 -e 243000"
job="000018"
fi
if [ $jobid -eq 19 ] ; then
rge="-b 243001 -e 256500"
job="000019"
fi
if [ $jobid -eq 20 ] ; then
rge="-b 256501 -e 270000"
job="000020"
fi
if [ $jobid -eq 21 ] ; then
rge="-b 270001 -e 283500"
job="000021"
fi
if [ $jobid -eq 22 ] ; then
rge="-b 283501 -e 297000"
job="000022"
fi
if [ $jobid -eq 23 ] ; then
rge="-b 297001 -e 310500"
job="000023"
fi
if [ $jobid -eq 24 ] ; then
rge="-b 310501 -e 324000"
job="000024"
fi
if [ $jobid -eq 25 ] ; then
rge="-b 324001 -e 337500"
job="000025"
fi
if [ $jobid -eq 26 ] ; then
rge="-b 337501 -e 351000"
job="000026"
fi
if [ $jobid -eq 27 ] ; then
rge="-b 351001 -e 364500"
job="000027"
fi
if [ $jobid -eq 28 ] ; then
rge="-b 364501 -e 378000"
job="000028"
fi
if [ $jobid -eq 29 ] ; then
rge="-b 378001 -e 391500"
job="000029"
fi
if [ $jobid -eq 30 ] ; then
rge="-b 391501 -e 405000"
job="000030"
fi
if [ $jobid -eq 31 ] ; then
rge="-b 405001 -e 418500"
job="000031"
fi
if [ $jobid -eq 32 ] ; then
rge="-b 418501 -e 432000"
job="000032"
fi
if [ $jobid -eq 33 ] ; then
rge="-b 432001 -e 445500"
job="000033"
fi
if [ $jobid -eq 34 ] ; then
rge="-b 445501 -e 459000"
job="000034"
fi
if [ $jobid -eq 35 ] ; then
rge="-b 459001 -e 459052"
job="000035"
fi
if [ x$job = x ] ; then
echo Job partitioning error. jobid $jobid is invalid.
exit 1
fi
if [ ! -d ./blocks ]; then
mkdir -p ./blocks
fi
if [ -e blocks/$job.dat ]; then
exists=true
else
exists=false
fi
if [ $exists = true ] ; then
echo Job previously completed successfully.
exit
fi
# Remove any previous result.
rm -f ./blocks/$job.input.dat
$bin/sqStoreDumpFASTQ \
-S ../../moncler.seqStore \
$rge \
-nolibname \
-noreadname \
-fasta \
-o ./blocks/$job.input \
|| \
mv -f ./blocks/$job.input.fasta ./blocks/$job.input.fasta.FAILED
if [ ! -e ./blocks/$job.input.fasta ] ; then
echo Failed to extract fasta.
exit 1
fi
echo ""
echo Starting mhap precompute.
echo ""
# So mhap writes its output in the correct spot.
cd ./blocks
/cm/shared/apps/java/jdk1.8.0_40/bin/java -d64 -XX:ParallelGCThreads=8 -server -Xms9216m -Xmx9216m \
-jar $bin/../share/java/classes/mhap-2.1.3.jar \
--repeat-weight 0.9 --repeat-idf-scale 10 -k 16 \
--store-full-id \
--num-hashes 768 \
--num-min-matches 2 \
--ordered-sketch-size 1536 \
--ordered-kmer-size 12 \
--threshold 0.78 \
--filter-threshold 0.0000001 \
--min-olap-length 500 \
--num-threads 8 \
-f ../../0-mercounts/moncler.ms16.ignore.gz \
-p ./$job.input.fasta \
-q . \
&& \
mv -f ./$job.input.dat ./$job.dat
if [ ! -e ./$job.dat ] ; then
echo Mhap failed.
exit 1
fi
# Clean up, remove the fasta input
rm -f ./$job.input.fasta
exit 0
$ cat /ibers/ernie/scratch/ans74/correction/1-overlapper/precompute.jobSubmit-01.sh
#!/bin/sh
qsub \
-pe multithread 8 -l h_vmem=1280m -S /bin/bash -j y -o precompute.\$TASK_ID.out \
-cwd -N "cormhap_moncler" \
-t 1-5 \
./precompute.sh 0 \
> ./precompute.jobSubmit-01.out 2>&1
The parameters canu is passing are correct but I think your grid is still not handling the resource request correctly. How many jobs end up on the same node now?
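A quick way to count them, if useful (a sketch; the 8th qstat column is the queue@hostname instance for running tasks):
$ qstat | grep cormhap | awk '{print $8}' | sort | uniq -c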
@skoren yeah, not too many I think, which is what we'd ideally expect, no?
$ qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
3192224 1.01667 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node006.cm.cluster 8 2
3192224 0.80833 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node006.cm.cluster 8 3
3192224 0.73889 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node006.cm.cluster 8 4
3192224 0.70417 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node007.cm.cluster 8 5
3192225 0.68333 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node007.cm.cluster 8 1
3192225 0.66944 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node008.cm.cluster 8 2
3192225 0.65952 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node008.cm.cluster 8 3
3192225 0.65208 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node008.cm.cluster 8 4
3192225 0.64630 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node008.cm.cluster 8 5
3192226 0.64167 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node008.cm.cluster 8 1
3192226 0.63472 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node011.cm.cluster 8 3
3192226 0.63205 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node011.cm.cluster 8 4
3192226 0.62976 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node011.cm.cluster 8 5
3192227 0.62778 cormhap_mo ans74 r 03/13/2019 15:40:02 amd.q@node011.cm.cluster 8 1
3192227 0.62451 cormhap_mo ans74 qw 03/13/2019 15:39:37 8 3-5:1
3192228 0.00000 canu_moncl ans74 hqw 03/13/2019 15:39:37 1
It still fails after this btw, in the same way as before, even after removing the correction folder.
Yes, the number per machine seems OK. Does your grid enforce memory limits on submitted jobs and kill them if h_vmem is exceeded? What does qacct report for one of the failed jobs, and is the output in the precompute.*.out still the same?
Hi again,
The grid does cancel jobs if memory limits are exceeded.
Checked qacct on the last cormhap job running before the whole thing got cancelled:
$ qacct -j 3192235
==============================================================
qname amd.q
hostname node008.cm.cluster
group users
owner ans74
project NONE
department defaultdepartment
jobname cormhap_moncler
jobnumber 3192235
taskid 1
account sge
priority 0
qsub_time Wed Mar 13 15:40:38 2019
start_time Wed Mar 13 15:41:02 2019
end_time Wed Mar 13 15:41:02 2019
granted_pe multithread
slots 8
failed 0
exit_status 1
ru_wallclock 0
ru_utime 0.244
ru_stime 0.159
ru_maxrss 19496
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 15051
ru_majflt 0
ru_nswap 0
ru_inblock 0
ru_oublock 82976
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 1624
ru_nivcsw 1316
cpu 0.403
mem 0.000
io 0.000
iow 0.000
maxvmem 0.000
arid undefined
==============================================================
qname amd.q
hostname node008.cm.cluster
group users
owner ans74
project NONE
department defaultdepartment
jobname cormhap_moncler
jobnumber 3192235
taskid 2
account sge
priority 0
qsub_time Wed Mar 13 15:40:38 2019
start_time Wed Mar 13 15:41:02 2019
end_time Wed Mar 13 15:41:03 2019
granted_pe multithread
slots 8
failed 0
exit_status 1
ru_wallclock 1
ru_utime 0.263
ru_stime 0.142
ru_maxrss 19564
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 15068
ru_majflt 0
ru_nswap 0
ru_inblock 0
ru_oublock 82984
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 2180
ru_nivcsw 1319
cpu 0.405
mem 0.003
io 0.070
iow 0.000
maxvmem 44.680M
arid undefined
==============================================================
qname amd.q
hostname node011.cm.cluster
group users
owner ans74
project NONE
department defaultdepartment
jobname cormhap_moncler
jobnumber 3192235
taskid 3
account sge
priority 0
qsub_time Wed Mar 13 15:40:38 2019
start_time Wed Mar 13 15:41:17 2019
end_time Wed Mar 13 15:41:17 2019
granted_pe multithread
slots 8
failed 0
exit_status 1
ru_wallclock 0
ru_utime 0.251
ru_stime 0.143
ru_maxrss 19540
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 15062
ru_majflt 0
ru_nswap 0
ru_inblock 0
ru_oublock 82872
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 1671
ru_nivcsw 1316
cpu 0.394
mem 0.000
io 0.000
iow 0.000
maxvmem 0.000
arid undefined
==============================================================
qname amd.q
hostname node011.cm.cluster
group users
owner ans74
project NONE
department defaultdepartment
jobname cormhap_moncler
jobnumber 3192235
taskid 4
account sge
priority 0
qsub_time Wed Mar 13 15:40:38 2019
start_time Wed Mar 13 15:41:17 2019
end_time Wed Mar 13 15:41:17 2019
granted_pe multithread
slots 8
failed 0
exit_status 1
ru_wallclock 0
ru_utime 0.254
ru_stime 0.144
ru_maxrss 19552
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 15065
ru_majflt 0
ru_nswap 0
ru_inblock 0
ru_oublock 80728
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 1911
ru_nivcsw 1285
cpu 0.398
mem 0.000
io 0.000
iow 0.000
maxvmem 0.000
arid undefined
==============================================================
qname amd.q
hostname node011.cm.cluster
group users
owner ans74
project NONE
department defaultdepartment
jobname cormhap_moncler
jobnumber 3192235
taskid 5
account sge
priority 0
qsub_time Wed Mar 13 15:40:38 2019
start_time Wed Mar 13 15:41:17 2019
end_time Wed Mar 13 15:41:17 2019
granted_pe multithread
slots 8
failed 0
exit_status 1
ru_wallclock 0
ru_utime 0.138
ru_stime 0.088
ru_maxrss 19432
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 15035
ru_majflt 0
ru_nswap 0
ru_inblock 0
ru_oublock 576
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 975
ru_nivcsw 32
cpu 0.226
mem 0.000
io 0.000
iow 0.000
maxvmem 0.000
arid undefined
And yes, the error in precompute.*.out is still the same:
...
==> /ibers/ernie/scratch/ans74/correction/1-overlapper/precompute.5.out <==
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 35 based on SGE_TASK_ID=5 and offset=30.
Dumping reads from 459001 to 459052 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
...
I think the issue is the h_vmem request, per @brianwalenz's previous comment. Canu is requesting memory per core but your grid expects it to be per job, so as soon as it tries to allocate 10gb it gets killed, since that's more than the ~1gb it requested. That doesn't explain why it wasn't running with the previous setting of 1 core and 10gb memory, though; I suspect your JVM is also configured to start up with too much overhead memory on top of the user-specified request. Edit the precompute.jobSubmit-01.sh file to be:
#!/bin/sh
qsub \
-pe multithread 8 -l h_vmem=20800m -S /bin/bash -j y -o precompute.\$TASK_ID.out \
-cwd -N "cormhap_moncler" \
-t 1 \
./precompute.sh 0 \
> ./precompute.jobSubmit-01.out 2>&1
and run it to see if that job runs successfully. If that works, you need to make the code change Brian suggested earlier, otherwise all your jobs will get killed for incorrect memory requests (or you can run everything with 1 core). You also need to explicitly request more memory for the java processes to compensate for the JVM configuration (gridOptionscormhap="-l h_vmem=20g").
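For reference, the arithmetic behind that (numbers taken from this run; canu divides the job memory by the slot count for SGE):
$ echo "$((10 * 1024 / 8)) MB per slot for cormhapMemory=10 with 8 slots"
1280 MB per slot for cormhapMemory=10 with 8 slots
That matches the -l h_vmem=1280m in the generated precompute.jobSubmit-01.sh; but since h_vmem is consumable=JOB, 1280m caps the whole job, while precompute.sh starts the JVM with -Xms9216m -Xmx9216m (about 9 GB), so it is killed immediately.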
Ok then, I modified precompute.jobSubmit-01.sh and ran it, and it seems that, despite the java errors in the log, it still ran?
$ cat precompute.1.out
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
M
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
M
Found canu:
/ibers/ernie/home/ans74Running job 6 based on SGE_Use of implicit split toDumping reads from 67501 to 81000 (inclusive).
nu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
_TASK_ID=1 and offset=0.
Dumping reads from 1 to 13500 (inclusive).
a
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
TASK_ID=1 and offset=5.
Dumping reads from 67501 to 81000 (inclusive).
ng initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
iled.
Mhap failed.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 1 based on SGE_TASK_ID=1 and offset=0.
Dumping reads from 1 to 13500 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Found perl:
rred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
java
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Grid_Cloud.pm line 74.
5.
Dumping reads from 202501 to 216000 (inclusive).
26004b26f21686f0c324ab69d37c5a885)
Running job 11 based on SGE_TASK_ID=1 and offset=10.
Dumping reads from 135001 to 148500 (inclusive).
rs/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
iled.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 11 based on SGE_TASK_ID=1 and offset=10.
Dumping reads from 135001 to 148500 (inclusive).
initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Found perl:
rred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linu Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
ccurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/LiDumping reads from 202501 to 216000 (inclusive).
deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 21 based on SGE_TASK_ID=1 and offset=20.
Dumping reads from 270001 to 283500 (inclusive).
rs/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
iled.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
rve enough space for 9437184KB object heap
Mhap failed.
Running job 21 based on SGE_TASK_ID=1 and offset=20.
Dumping reads from 270001 to 283500 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 26 based on SGE_TASK_ID=1 and offset=25.
Dumping reads from 337501 to 351000 (inclusive).
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 31 based on SGE_TASK_ID=1 and offset=30.
Dumping reads from 405001 to 418500 (inclusive).
25.
Dumping reads from 337501 to 351000 (inclusive).
initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 31 based on SGE_TASK_ID=1 and offset=30.
Dumping reads from 405001 to 418500 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/.. Error occurred during initialization of
Found perl:
u:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
-amd64/bin/canu
Running job 6 based on SGE_TASK_ID=1 and offset=5.
Dumping reads from 67501 to 81000 (inclusive).
bers/ernie/home/ans74/canu/Linu Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
26f21686f0c324ab69d37c5a885)
Running job 1 based on SGE_TASK_ID=1 and offset=0.
Dumping reads from 1 to 13500 (inclusive).
fset=5.
Dumping reads from 67501 to 81000 (inclusive).
ng initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
iled.
Mhap failed.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 1 based on SGE_TASK_ID=1 and offset=0.
Dumping reads from 1 to 13500 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
/usr/bin/perl
during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/ Error occurred durUse of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
+151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 16 based on SGE_TASK_ID=1 and offset=15.
Dumping reads from 202501 to 216000 (inclusive).
10.
Dumping reads from 135001 to 148500 (inclusive).
rs/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
iled.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
rve enough space for 9437184KB object heap
Mhap failed.
Running job 11 based on SGE_TASK_ID=1 and offset=10.
Dumping reads from 135001 to 148500 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Clou Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
6 based on SGE_TASK_ID=1 and offset=15.
Dumping reads from 202501 to 216000 (inclusive).
rs/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 21 based on SGE_TASK_ID=1 and offset=20.
Dumping reads from 270001 to 283500 (inclusive).
rs/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
iled.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
rve enough space for 9437184KB object heap
Mhap failed.
Running job 21 based on SGE_TASK_ID=1 and offset=20.
Dumping reads from 270001 to 283500 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/. Error occurred during initialization of VM
Found canu:
/ibers/ernie/hom Error occurred during initializUse of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_peDumping reads from 337501 to 351000 (inclusive).
rs/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 26 based on SGE_TASK_ID=1 and offset=25.
Dumping reads from 337501 to 351000 (inclusive).
26004b26f21686f0c324ab69d37c5a885)
Running job 31 based on SGE_TASK_ID=1 and offset=30.
Dumping reads from 405001 to 418500 (inclusive).
initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
Error occurred during initialization of VM
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 31 based on SGE_TASK_ID=1 and offset=30.
Dumping reads from 405001 to 418500 (inclusive).
Starting mhap precompute.
Error occurred during initialization of VM
Could not reserve enough space for 9437184KB object heap
Mhap failed.
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
java version "1.8.0_40"
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 1 based on SGE_TASK_ID=1 and offset=0.
Dumping reads from 1 to 13500 (inclusive).
Starting mhap precompute.
Running with these settings:
--filter-threshold = 1.0E-7
--help = false
--max-shift = 0.2
--min-olap-length = 500
--min-store-length = 0
--no-rc = false
--no-self = false
--no-tf = false
--num-hashes = 768
--num-min-matches = 2
--num-threads = 8
--ordered-kmer-size = 12
--ordered-sketch-size = 1536
--repeat-idf-scale = 10.0
--repeat-weight = 0.9
--settings = 0
--store-full-id = true
--supress-noise = 0
--threshold = 0.78
--version = false
-f = ../../0-mercounts/moncler.ms16.ignore.gz
-h = false
-k = 16
-p = ./000001.input.fasta
-q = .
-s =
Reading in filter file ../../0-mercounts/moncler.ms16.ignore.gz.
Read in values for repeat 74668 and 0
Warning, k-mer filter file has zero elements.
Initializing
Trying size 74668 and it was successfull
Initialized
Time (s) to read filter file: 2.398091821
Read in k-mer filter for sizes: [16]
Processing FASTA files for binary compression...
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
Current # sequences loaded and processed from file: 15000...
Current # sequences loaded and processed from file: 20000...
Current # sequences loaded and processed from file: 25000...
Processed 27000 sequences (fwd and rev).
Read, hashed, and stored file ./000001.input.fasta to ./000001.input.dat.
Time (s): 482.01313936300005
Total time (s): 482.01364922600004
Changed the code as @brianwalenz suggested and re-ran my canu command about 50 minutes ago, and things seem to be going fine for now! Will let you know if anything else fails. Cheers!
SGE keeps appending to the log rather than overwriting it, so all those earlier errors are from previous trials. The last one was OK and ran without issue. Check qacct for the job to see what memory it actually used, which will tell you what the JVM overhead is.
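For example (a sketch), with the job id of the successful precompute task:
$ qacct -j <jobid> | grep maxvmem
The maxvmem line is the peak memory the task actually used.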
@skoren the job did run, but it got to the Mhap overlap stage and failed again. Any ideas? canu.out output below:
...
-- BEGIN CORRECTION
--
--
-- Mhap overlap jobs failed, tried 2 times, giving up.
-- job correction/1-overlapper/results/000001.ovb FAILED.
-- job correction/1-overlapper/results/000002.ovb FAILED.
-- job correction/1-overlapper/results/000003.ovb FAILED.
-- job correction/1-overlapper/results/000004.ovb FAILED.
-- job correction/1-overlapper/results/000005.ovb FAILED.
-- job correction/1-overlapper/results/000006.ovb FAILED.
-- job correction/1-overlapper/results/000007.ovb FAILED.
-- job correction/1-overlapper/results/000008.ovb FAILED.
-- job correction/1-overlapper/results/000009.ovb FAILED.
-- job correction/1-overlapper/results/000010.ovb FAILED.
-- job correction/1-overlapper/results/000011.ovb FAILED.
-- job correction/1-overlapper/results/000012.ovb FAILED.
-- job correction/1-overlapper/results/000013.ovb FAILED.
-- job correction/1-overlapper/results/000014.ovb FAILED.
-- job correction/1-overlapper/results/000015.ovb FAILED.
-- job correction/1-overlapper/results/000016.ovb FAILED.
-- job correction/1-overlapper/results/000017.ovb FAILED.
-- job correction/1-overlapper/results/000018.ovb FAILED.
-- job correction/1-overlapper/results/000019.ovb FAILED.
-- job correction/1-overlapper/results/000020.ovb FAILED.
-- job correction/1-overlapper/results/000021.ovb FAILED.
-- job correction/1-overlapper/results/000022.ovb FAILED.
-- job correction/1-overlapper/results/000023.ovb FAILED.
-- job correction/1-overlapper/results/000024.ovb FAILED.
-- job correction/1-overlapper/results/000025.ovb FAILED.
-- job correction/1-overlapper/results/000026.ovb FAILED.
-- job correction/1-overlapper/results/000027.ovb FAILED.
-- job correction/1-overlapper/results/000028.ovb FAILED.
-- job correction/1-overlapper/results/000029.ovb FAILED.
-- job correction/1-overlapper/results/000030.ovb FAILED.
-- job correction/1-overlapper/results/000031.ovb FAILED.
-- job correction/1-overlapper/results/000032.ovb FAILED.
-- job correction/1-overlapper/results/000033.ovb FAILED.
-- job correction/1-overlapper/results/000034.ovb FAILED.
-- job correction/1-overlapper/results/000035.ovb FAILED.
-- job correction/1-overlapper/results/000036.ovb FAILED.
-- job correction/1-overlapper/results/000037.ovb FAILED.
-- job correction/1-overlapper/results/000038.ovb FAILED.
-- job correction/1-overlapper/results/000039.ovb FAILED.
-- job correction/1-overlapper/results/000040.ovb FAILED.
-- job correction/1-overlapper/results/000041.ovb FAILED.
-- job correction/1-overlapper/results/000042.ovb FAILED.
-- job correction/1-overlapper/results/000043.ovb FAILED.
-- job correction/1-overlapper/results/000044.ovb FAILED.
-- job correction/1-overlapper/results/000045.ovb FAILED.
-- job correction/1-overlapper/results/000046.ovb FAILED.
-- job correction/1-overlapper/results/000047.ovb FAILED.
-- job correction/1-overlapper/results/000048.ovb FAILED.
-- job correction/1-overlapper/results/000049.ovb FAILED.
-- job correction/1-overlapper/results/000050.ovb FAILED.
-- job correction/1-overlapper/results/000051.ovb FAILED.
-- job correction/1-overlapper/results/000052.ovb FAILED.
-- job correction/1-overlapper/results/000053.ovb FAILED.
-- job correction/1-overlapper/results/000054.ovb FAILED.
-- job correction/1-overlapper/results/000055.ovb FAILED.
-- job correction/1-overlapper/results/000056.ovb FAILED.
-- job correction/1-overlapper/results/000057.ovb FAILED.
-- job correction/1-overlapper/results/000058.ovb FAILED.
-- job correction/1-overlapper/results/000059.ovb FAILED.
-- job correction/1-overlapper/results/000060.ovb FAILED.
-- job correction/1-overlapper/results/000061.ovb FAILED.
-- job correction/1-overlapper/results/000062.ovb FAILED.
-- job correction/1-overlapper/results/000063.ovb FAILED.
-- job correction/1-overlapper/results/000064.ovb FAILED.
-- job correction/1-overlapper/results/000065.ovb FAILED.
-- job correction/1-overlapper/results/000066.ovb FAILED.
-- job correction/1-overlapper/results/000067.ovb FAILED.
-- job correction/1-overlapper/results/000068.ovb FAILED.
-- job correction/1-overlapper/results/000069.ovb FAILED.
-- job correction/1-overlapper/results/000070.ovb FAILED.
-- job correction/1-overlapper/results/000071.ovb FAILED.
-- job correction/1-overlapper/results/000072.ovb FAILED.
-- job correction/1-overlapper/results/000073.ovb FAILED.
-- job correction/1-overlapper/results/000074.ovb FAILED.
-- job correction/1-overlapper/results/000075.ovb FAILED.
-- job correction/1-overlapper/results/000076.ovb FAILED.
-- job correction/1-overlapper/results/000077.ovb FAILED.
-- job correction/1-overlapper/results/000078.ovb FAILED.
-- job correction/1-overlapper/results/000079.ovb FAILED.
-- job correction/1-overlapper/results/000080.ovb FAILED.
-- job correction/1-overlapper/results/000081.ovb FAILED.
-- job correction/1-overlapper/results/000082.ovb FAILED.
-- job correction/1-overlapper/results/000083.ovb FAILED.
--
ABORT:
ABORT: Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
Yes, it would fail because, as I said in my previous comment, your JVM seems to be configured with a larger GC overhead than the default.
Did you check via qacct what memory the precompute jobs actually needed? I know that 20gb is enough for mhap, but less might be sufficient if you check qacct. You'll be able to run by adding the parameter I listed earlier: gridOptionscormhap="-l h_vmem=20g".
Hi again @skoren, the job keeps failing with gridOptionscormhap="-l h_vmem=20g". Tried adding gridOptionsovb="-l h_vmem=20g" as well after deleting the correction folder and restarting the job, but I keep getting the same output as above.
Current command I'm using: ./canu/Linux-amd64/bin/canu -p moncler -d /ibers/ernie/scratch/ans74/ genomeSize=0.5g -nanopore-raw montcler_metagenomes/moncler_surface.fastq java="/cm/shared/apps/java/jdk1.8.0_40/bin/java" gridEngineResourceOption="-pe multithread THREADS -l h_vmem=MEMORY" merylMemory=10 merylThreads=1 cormhapMemory=10 cormhapThreads=1 mhapThreads=8 gridOptionsExecutive="-l h_vmem=100g" gridEngineArrayMaxJobs=5 gridOptions="-S /bin/bash" gridOptionscormhap="-l h_vmem=20g" gridOptionsovb="-l h_vmem=20g"
Interesting, is it the same memory allocation error in the mhap.*.out files? Does the mhap.jobSubmit-*.sh file have the correct memory request of 20g? If not, you can remove the mhap.*sh files to have Canu re-generate them. You could also try increasing the memory from 20 to 30 or 40gb.
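For example (a sketch, using the paths from this run):
$ grep h_vmem /ibers/ernie/scratch/ans74/correction/1-overlapper/mhap.jobSubmit-*.sh
$ rm /ibers/ernie/scratch/ans74/correction/1-overlapper/mhap.*sh
Re-running the same canu command should then rewrite mhap.sh and the mhap.jobSubmit-*.sh scripts with the new memory request.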
Hello again,
Seems the memory is being allocated properly now:
$ cat /ibers/ernie/scratch/ans74/correction/1-overlapper/mhap.1.out
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
java version "1.8.0_40"
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/can java version "1.8.0_40"
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/can Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 6 based on SGE_TASK_ID=1 and offset=5.
Fetch blocks/000012.dat
Fetch blocks/000013.dat
Fetch blocks/000014.dat
Fetch blocks/000015.dat
Fetch blocks/000016.dat
Fetch blocks/000017.dat
Fetch blocks/000018.dat
Fetch blocks/000019.dat
Fetch blocks/000020.dat
Running block 000002 in query 000006
Running with these settings:
--filter-threshold = 1.0E-7
--help = false
--max-shift = 0.2
--min-olap-length = 500
--min-store-length = 0
--no-rc = false
--no-self = true
--no-tf = false
--num-hashes = 768
--num-min-matches = 2
--num-threads = 8
--ordered-kmer-size = 12
--ordered-sketch-size = 1536
--repeat-idf-scale = 10.0
--repeat-weight = 0.9
--settings = 0
--store-full-id = true
--suprRunning with these settings:
--filter-threshold = 1.0E-7
--help = false
--max-shift = 0.2
--min-olap-length = 500
--min-store-length = 0
--no-rc = false
--no-self = false
--no-tf = false
--num-hashes = 768
--num-min-matches = 2
--num-threads = 8
--ordered-kmer-size = 12
--ordered-sketch-size = 1536
--repeat-idf-scale = 10.0
--repeat-weight = 0.9
--settings = 0
--store-full-id = true
--supreRunning with these settings:
--filter-threshold = 1.0E-7
--help = false
--max-shift = 0.2
--min-olap-length = 500
--min-store-length = 0
--no-rc = false
--no-self = true
--no-tf = false
--num-hashes = 768
--num-min-matches = 2
--num-threads = 8
--ordered-kmer-size = 12
--ordered-sketch-size = 1536
--repeat-idf-scale = 10.0
--repeat-weight = 0.9
--settings = 0
--store-full-id = true
--supress-noise = 0
--threshold = 0.78
--version = false
-f =
-h = false
-k = 16
-p =
-q = queries/000011
-s = ./blocks/000003.dat
Processing files for storage in reverse index...
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
Current # sequences loaded and processed from file: 15000...
Current # sequences loaded and processed from file: 20000...
Current # sequences loaded and processed from file: 25000...
Current # sequences stored: 5000...
Current # sequences stored: 10000...
Current # sequences stored: 15000...
Current # sequences stored: 20000...
Current # sequences stored: 25000...
Stored 27000 sequences in the index.
Processed 27000 unique sequences (fwd and rev).
Time (s) to read and hash from file: 9.368463321
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000022.dat.
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
ta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000002.dat.
Current # sequences loaded and processed from file: 10000...
pps/java/jdk1.8.0_40/bin/java
java version "1.8.0_40"
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 16 based on SGE_TASK_ID=1 and offset=15.
Fetch blocks/000032.dat
Fetch blocks/000033.dat
Fetch blocks/000034.dat
Fetch blocks/000035.dat
Running block 000004 in query 000016
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 388.81848018700003
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000023.dat.
lse
--num-hashes = 768
--num-min-matches = 2
--num-threads = 8
--ordered-kmer-size = 12
--ordered-sketch-size = 1536
--repeat-idf-scale = 10.0
--repeat-weight = 0.9
--settings = 0
--store-full-id = true
--supress-noise = 0
--threshold = 0.78
--version = false
-f =
-h = false
-k = 16
-p =
-q = queries/000016
-s = ./blocks/000004.dat
Processing files for storage in reverse index...
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
Current # sequences loaded and processed from file: 15000...
Current # sequences loaded and processed from file: 20000...
Current # sequences loaded and processed from file: 25000...
Current # sequences stored: 5000...
Current # sequences stored: 10000...
Current # sequences stored: 15000...
Current # sequences stored: 20000...
Current # sequences stored: 25000...
Stored 27000 sequences in the index.
Processed 27000 unique sequences (fwd and rev).
Time (s) to read and hash from file: 71.597211915
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000032.dat.
Current # sequences loaded and processed from file: 5000...
e, and output: 446.66281878700005
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000013.dat.
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
, and output: 442.119920625
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000003.dat.
Current # sequences loaded and processed from file: 10000...
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 396.61825338100005
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000014.dat.
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 452.641465598
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000033.dat.
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 525.7606683160001
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000024.dat.
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
, and output: 436.716022251
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000004.dat.
Current # sequences loaded and processed from file: 10000...
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 460.181400133
Opened fasta file /ibers/ernie/sProcessed 13500 to sequences.
Time (s) to score, hash to-file, and output: 434.45696625700003
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000034.dat.
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 478.818457217
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000025.dat.
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 5000...
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 451.241507886
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000005.dat.
Current # sequences loaded and processed from file: 10000...
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 391.620046946
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000016.dat.
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 334.040386351
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000026.dat.
Current # sequences loaded and processed from file: 5000...
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 481.10990021000003
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000035.dat.
Processed 52 to sequences.
Time (s) to score, hash to-file, and output: 2.126424492
Total scoring time (s): 1370.3603578260002
Total time (s): 1441.957862721
MinHash search time (s): 114.861784725
Total matches found: 57094315
Average number of matches per lookup: 1407.9284622213454
Average number of table elements processed per lookup: 6239.98387255869
Average number of table elements processed per match: 4.432031910707748
Average % of hashed sequences hit per lookup: 12.945912701022191
Average % of hashed sequences hit that are matches: 40.27950736657852
Average % of hashed sequences fully compared that are matches: 91.47507885824785
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/mhapConvert: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /ibers/ernie/home/ans74/canu/Linux-amd64/bin/mhapConvert)
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 265.41871118
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000027.dat.
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 429.50884629
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000006.dat.
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 387.03043724400004
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000017.dat.
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
Current # sequences loaded and processed from file: 10000...
, and output: 363.21103995600004
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000028.dat.
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 422.926341099
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000007.dat.
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 5000...
Current # sequences loaded and processed from file: 10000...
Current # sequences loaded and processed from file: 10000...
Current # sequences loaded and processed from file: 10000...
, and output: 403.628449631
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000029.dat.
Processed 13500 to sequences.
Time (s) to score, hash to-file, and output: 357.499472409
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000019.dat.
Current # sequences loaded and processed from file: 5000...
e, and output: 425.095174063
Opened fasta file /ibers/ernie/scratch/ans74/correction/1-overlapper/blocks/000008.dat.
...
And it seems memory is being correctly requested as well:
$ cat /ibers/ernie/scratch/ans74/correction/1-overlapper/mhap.jobSubmit-01.sh
#!/bin/sh
qsub \
-pe multithread 8 -l h_vmem=1280m -S /bin/bash -l h_vmem=20g -j y -o mhap.\$TASK_ID.out \
-cwd -N "cormhap_moncler" \
-t 1-5 \
./mhap.sh 0 \
> ./mhap.jobSubmit-01.out 2>&1
Will try with 40gb now and get back to you.
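In case it helps, the request that actually got attached to a submitted job can be double-checked with qstat (the job id below is just a placeholder):
$ qstat -j 1234567 | grep -E 'hard resource_list|parallel environment'
The hard resource_list line should show the h_vmem value and the parallel environment line the slot count that the job was submitted with.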
If it's running with 20g, then increasing it won't change anything. There is no error in that posted log. What was the error you got from Canu? Is that the complete log? Is there a default time limit?
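One way to check for a queue-level run time limit yourself under SGE (the queue name 'all.q' below is only a placeholder; qconf -sql lists the real ones):
$ qconf -sql
$ qconf -sq all.q | grep -E '^[sh]_rt'
A value of INFINITY for s_rt/h_rt means the queue imposes no soft/hard run time limit.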
The error from canu is still the same:
$ tail /ibers/ernie/scratch/ans74/canu.out -n90
--
-- BEGIN CORRECTION
--
--
-- Mhap overlap jobs failed, tried 2 times, giving up.
-- job correction/1-overlapper/results/000001.ovb FAILED.
-- job correction/1-overlapper/results/000002.ovb FAILED.
-- job correction/1-overlapper/results/000003.ovb FAILED.
-- job correction/1-overlapper/results/000004.ovb FAILED.
-- job correction/1-overlapper/results/000005.ovb FAILED.
-- job correction/1-overlapper/results/000006.ovb FAILED.
-- job correction/1-overlapper/results/000007.ovb FAILED.
-- job correction/1-overlapper/results/000008.ovb FAILED.
-- job correction/1-overlapper/results/000009.ovb FAILED.
-- job correction/1-overlapper/results/000010.ovb FAILED.
-- job correction/1-overlapper/results/000011.ovb FAILED.
-- job correction/1-overlapper/results/000012.ovb FAILED.
-- job correction/1-overlapper/results/000013.ovb FAILED.
-- job correction/1-overlapper/results/000014.ovb FAILED.
-- job correction/1-overlapper/results/000015.ovb FAILED.
-- job correction/1-overlapper/results/000016.ovb FAILED.
-- job correction/1-overlapper/results/000017.ovb FAILED.
-- job correction/1-overlapper/results/000018.ovb FAILED.
-- job correction/1-overlapper/results/000019.ovb FAILED.
-- job correction/1-overlapper/results/000020.ovb FAILED.
-- job correction/1-overlapper/results/000021.ovb FAILED.
-- job correction/1-overlapper/results/000022.ovb FAILED.
-- job correction/1-overlapper/results/000023.ovb FAILED.
-- job correction/1-overlapper/results/000024.ovb FAILED.
-- job correction/1-overlapper/results/000025.ovb FAILED.
-- job correction/1-overlapper/results/000026.ovb FAILED.
-- job correction/1-overlapper/results/000027.ovb FAILED.
-- job correction/1-overlapper/results/000028.ovb FAILED.
-- job correction/1-overlapper/results/000029.ovb FAILED.
-- job correction/1-overlapper/results/000030.ovb FAILED.
-- job correction/1-overlapper/results/000031.ovb FAILED.
-- job correction/1-overlapper/results/000032.ovb FAILED.
-- job correction/1-overlapper/results/000033.ovb FAILED.
-- job correction/1-overlapper/results/000034.ovb FAILED.
-- job correction/1-overlapper/results/000035.ovb FAILED.
-- job correction/1-overlapper/results/000036.ovb FAILED.
-- job correction/1-overlapper/results/000037.ovb FAILED.
-- job correction/1-overlapper/results/000038.ovb FAILED.
-- job correction/1-overlapper/results/000039.ovb FAILED.
-- job correction/1-overlapper/results/000040.ovb FAILED.
-- job correction/1-overlapper/results/000041.ovb FAILED.
-- job correction/1-overlapper/results/000042.ovb FAILED.
-- job correction/1-overlapper/results/000043.ovb FAILED.
-- job correction/1-overlapper/results/000044.ovb FAILED.
-- job correction/1-overlapper/results/000045.ovb FAILED.
-- job correction/1-overlapper/results/000046.ovb FAILED.
-- job correction/1-overlapper/results/000047.ovb FAILED.
-- job correction/1-overlapper/results/000048.ovb FAILED.
-- job correction/1-overlapper/results/000049.ovb FAILED.
-- job correction/1-overlapper/results/000050.ovb FAILED.
-- job correction/1-overlapper/results/000051.ovb FAILED.
-- job correction/1-overlapper/results/000052.ovb FAILED.
-- job correction/1-overlapper/results/000053.ovb FAILED.
-- job correction/1-overlapper/results/000054.ovb FAILED.
-- job correction/1-overlapper/results/000055.ovb FAILED.
-- job correction/1-overlapper/results/000056.ovb FAILED.
-- job correction/1-overlapper/results/000057.ovb FAILED.
-- job correction/1-overlapper/results/000058.ovb FAILED.
-- job correction/1-overlapper/results/000059.ovb FAILED.
-- job correction/1-overlapper/results/000060.ovb FAILED.
-- job correction/1-overlapper/results/000061.ovb FAILED.
-- job correction/1-overlapper/results/000062.ovb FAILED.
-- job correction/1-overlapper/results/000063.ovb FAILED.
-- job correction/1-overlapper/results/000064.ovb FAILED.
-- job correction/1-overlapper/results/000065.ovb FAILED.
-- job correction/1-overlapper/results/000066.ovb FAILED.
-- job correction/1-overlapper/results/000067.ovb FAILED.
-- job correction/1-overlapper/results/000068.ovb FAILED.
-- job correction/1-overlapper/results/000069.ovb FAILED.
-- job correction/1-overlapper/results/000070.ovb FAILED.
-- job correction/1-overlapper/results/000071.ovb FAILED.
-- job correction/1-overlapper/results/000072.ovb FAILED.
-- job correction/1-overlapper/results/000073.ovb FAILED.
-- job correction/1-overlapper/results/000074.ovb FAILED.
-- job correction/1-overlapper/results/000075.ovb FAILED.
-- job correction/1-overlapper/results/000076.ovb FAILED.
-- job correction/1-overlapper/results/000077.ovb FAILED.
-- job correction/1-overlapper/results/000078.ovb FAILED.
-- job correction/1-overlapper/results/000079.ovb FAILED.
-- job correction/1-overlapper/results/000080.ovb FAILED.
-- job correction/1-overlapper/results/000081.ovb FAILED.
-- job correction/1-overlapper/results/000082.ovb FAILED.
-- job correction/1-overlapper/results/000083.ovb FAILED.
--
ABORT:
ABORT: Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
I don't think there's a default time limit, but I can ask.
That was not the complete log; here's the last of it:
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
java version "1.8.0_40"
Found canu:
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/can Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/.Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 74.
g job 81 based on SGE_TASK_ID=1 and offset=80.
Fetch blocks/000034.dat
Fetch blocks/000035.dat
Running block 000033 in query 000081
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/mhapConvert: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /ibers/ernie/home/ans74/canu/Linux-amd64/bin/mhapConvert)
Canu snapshot v1.8 +151 changes (r9361 db9bfcc26004b26f21686f0c324ab69d37c5a885)
Running job 76 based on SGE_TASK_ID=1 and offset=75.
Fetch blocks/000029.dat
Fetch blocks/000030.dat
Fetch blocks/000031.dat
Fetch blocks/000032.dat
Fetch blocks/000033.dat
Fetch blocks/000034.dat
Fetch blocks/000035.dat
Running block 000028 in query 000076
/ibers/ernie/home/ans74/canu/Linux-amd64/bin/mhapConvert: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /ibers/ernie/home/ans74/canu/Linux-amd64/bin/mhapConvert)
Does this mean I should load module glib?
The JVM issues are solved and that's running. However, it seems the machine where you compiled Canu doesn't have the same OS version/gcc libraries as the machines where you're running it. Some of your cluster nodes have a different environment. You need to build Canu on the oldest system you want it to run on.
You could also use the pre-built release binary rather than building it yourself; just make the same change to canu-1.8/Linux-amd64/lib/site_perl/canu/Execution.pm that you made before.
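As a quick sanity check on the library mismatch (not something Canu does for you), you can see which GLIBCXX versions a node's libstdc++ actually provides; the mhapConvert error above wants 3.4.21:
$ strings /usr/lib64/libstdc++.so.6 | grep GLIBCXX
$ gcc -dumpversion
Running those on the node where the job failed and on the machine where Canu was built should show the difference.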
@skoren done, but I'm getting a weird error:
$ ./canu-1.8/Linux-amd64/bin/canu -p moncler -d /ibers/ernie/scratch/ans74/ genomeSize=0.5g -nanopore-raw montcler_metagenomes/moncler_surface.fastq java="/cm/shared/apps/java/jdk1.8.0_40/bin/java" gridEngineResourceOption="-pe multithread THREADS -l h_vmem=MEMORY" merylMemory=10 merylThreads=1 cormhapMemory=10 cormhapThreads=1 mhapThreads=8 gridOptionsExecutive="-l h_vmem=30g" gridEngineArrayMaxJobs=5 gridOptions="-S /bin/bash" gridOptionscormhap="-l h_vmem=30g"
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu-1.8/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 73.
usage: canu [-version] [-citation] \
[-correct | -trim | -assemble | -trim-assemble] \
[-s <assembly-specifications-file>] \
-p <assembly-prefix> \
-d <assembly-directory> \
genomeSize=<number>[g|m|k] \
[other-options] \
[-pacbio-raw |
-pacbio-corrected |
-nanopore-raw |
-nanopore-corrected] file1 file2 ...
example: canu -d run1 -p godzilla genomeSize=1g -nanopore-raw reads/*.fasta.gz
To restrict canu to only a specific stage, use:
-correct - generate corrected reads
-trim - generate trimmed reads
-assemble - generate an assembly
-trim-assemble - generate trimmed reads and then assemble them
The assembly is computed in the -d <assembly-directory>, with output files named
using the -p <assembly-prefix>. This directory is created if needed. It is not
possible to run multiple assemblies in the same directory.
The genome size should be your best guess of the haploid genome size of what is being
assembled. It is used primarily to estimate coverage in reads, NOT as the desired
assembly size. Fractional values are allowed: '4.7m' equals '4700k' equals '4700000'
Some common options:
useGrid=string
- Run under grid control (true), locally (false), or set up for grid control
but don't submit any jobs (remote)
rawErrorRate=fraction-error
- The allowed difference in an overlap between two raw uncorrected reads. For lower
quality reads, use a higher number. The defaults are 0.300 for PacBio reads and
0.500 for Nanopore reads.
correctedErrorRate=fraction-error
- The allowed difference in an overlap between two corrected reads. Assemblies of
low coverage or data with biological differences will benefit from a slight increase
in this. Defaults are 0.045 for PacBio reads and 0.144 for Nanopore reads.
gridOptions=string
- Pass string to the command used to submit jobs to the grid. Can be used to set
maximum run time limits. Should NOT be used to set memory limits; Canu will do
that for you.
minReadLength=number
- Ignore reads shorter than 'number' bases long. Default: 1000.
minOverlapLength=number
- Ignore read-to-read overlaps shorter than 'number' bases long. Default: 500.
A full list of options can be printed with '-options'. All options can be supplied in
an optional spec file with the -s option.
Reads can be either FASTA or FASTQ format, uncompressed, or compressed with gz, bz2 or xz.
Reads are specified by the technology they were generated with, and any processing performed:
-pacbio-raw <files> Reads are straight off the machine.
-pacbio-corrected <files> Reads have been corrected.
-nanopore-raw <files>
-nanopore-corrected <files>
Complete documentation at http://canu.readthedocs.org/en/latest/
ERROR: Parameter 'gridEngineResourceOption' is not known.
That option was added after the 1.8 release; you need to set gridEngineThreadsOption for the threads and gridEngineMemoryOption for the h_vmem. You can always run canu -options to get a full list of supported options.
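For example, to see which grid-related options this copy accepts (the grep is only a filter):
$ ./canu-1.8/Linux-amd64/bin/canu -options | grep -i grid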
Thanks for that, I'll let it run and come back to you with news ASAP.
Update on this @skoren : Got a new error:
...
-- job 2-correction/results/0042.cns FAILED.
-- job 2-correction/results/0043.cns FAILED.
-- job 2-correction/results/0044.cns FAILED.
-- job 2-correction/results/0045.cns FAILED.
-- job 2-correction/results/0046.cns FAILED.
--
ABORT:
ABORT: Canu 1.8
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
So I re-ran with gridOptionscns="-l h_vmem=30g" gridOptionscor="-l h_vmem=30g" added to the command. Now waiting for news.
Hi again, the new error goes as follows:
--
-- BEGIN TRIMMING
--
--
-- Overlap jobs failed, tried 2 times, giving up.
-- job trimming/1-overlapper/001/000001.ovb FAILED.
-- job trimming/1-overlapper/001/000002.ovb FAILED.
-- job trimming/1-overlapper/001/000003.ovb FAILED.
-- job trimming/1-overlapper/001/000004.ovb FAILED.
-- job trimming/1-overlapper/001/000005.ovb FAILED.
--
ABORT:
ABORT: Canu 1.8
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
This came after deleting all files from previous runs and running the following command (I tried giving memory to the other overlap and trimming jobs):
./canu-1.8/Linux-amd64/bin/canu -p moncler -d /ibers/ernie/scratch/ans74/ genomeSize=0.5g -nanopore-raw montcler_metagenomes/moncler_surface.fastq java="/cm/shared/apps/java/jdk1.8.0_40/bin/java" gridEngineThreadsOption="-pe multithread THREADS" gridEngineMemoryOption="-l h_vmem=MEMORY" merylMemory=10 merylThreads=1 cormhapMemory=10 cormhapThreads=1 mhapThreads=8 gridOptionsExecutive="-l h_vmem=30g" gridEngineArrayMaxJobs=5 gridOptions="-S /bin/bash" gridOptionscormhap="-l h_vmem=30g" gridOptionscns="-l h_vmem=30g" gridOptionscor="-l h_vmem=30g" gridOptionscorovl="-l h_vmem=30g" gridOptionsobtovl="-l h_vmem=30g" gridOptionsobtmhap="-l h_vmem=30g" gridOptionsobtmmap="-l h_vmem=30g" stopOnLowCoverage=2
Any ideas, @brianwalenz @skoren? Thanks again!
Update:
$ cat /ibers/ernie/scratch/ans74/trimming/1-overlapper/overlap.1.out
...
Running job 1 based on SGE_TASK_ID=1 and offset=0.
STRING_NUM_BITS 31
OFFSET_BITS 31
STRING_NUM_MASK 2147483647
OFFSET_MASK 2147483647
MAX_STRING_NUM 2147483647
Hash_Mask_Bits 24
bgnHashID 1
bgnHashID 104801
Max_Hash_Data_Len 320002960
Max_Hash_Load 0.800000
Kmer Length 22
Min Overlap Length 500
Max Error Rate 0.120000
Min Kmer Matches 0
Num_PThreads 8
sizeof(Hash_Bucket_t) 216
sizeof(Check_Vector_t) 4
sizeof(Hash_Frag_Info_t) 4
HASH_TABLE_SIZE 16777216
hash table size: 3456 MB
hash check array 64 MB
string info 0 MB
string start 0 MB
Initializing 8 work areas.
libgomp: Thread creation failed: Resource temporarily unavailable
Current command: ./canu-1.8/Linux-amd64/bin/canu -p moncler -d /ibers/ernie/scratch/ans74/ genomeSize=0.5g -nanopore-raw montcler_metagenomes/moncler_surface.fastq java="/cm/shared/apps/java/jdk1.8.0_40/bin/java" gridEngineThreadsOption="-pe multithread THREADS" gridEngineMemoryOption="-l h_vmem=MEMORY" merylMemory=10 merylThreads=1 cormhapMemory=10 cormhapThreads=1 mhapThreads=8 gridOptionsExecutive="-l h_vmem=30g" gridEngineArrayMaxJobs=5 gridOptions="-S /bin/bash" gridOptionscormhap="-l h_vmem=30g" gridOptionscns="-l h_vmem=30g" gridOptionscor="-l h_vmem=30g" gridOptionscorovl="-l h_vmem=30g" gridOptionsobtovl="-l h_vmem=30g" gridOptionsobtmhap="-l h_vmem=30g" gridOptionsobtmmap="-l h_vmem=30g" gridOptionsovb="-l h_vmem=40g" gridOptionsovs="-l h_vmem=30g" ovbThreads=1 ovbMemory=30 ovsThreads=1 ovsMemory=30 stopOnLowCoverage=2
You could be hitting system/admin-imposed limits on the number of processes or threads per user (e.g., via ulimit). Googling the error suggests that it could also be a limit on the stack size.
Edit overlap.sh to add something like this to the start (anywhere works, but there is 'space' left for you after paths are reported and environment variables are set).
echo "Before."
ulimit -a
ulimit -s $(ulimit -Hs)   # try to raise the soft stack-size limit to the hard limit
ulimit -u $(ulimit -Hu)   # try to raise the soft process-count limit to the hard limit
echo "After."
ulimit -a
ulimit -a just reports the current limits; the -s and -u lines try to raise the soft limits on stack size and number of processes up to their hard limits.
Docs on ulimit: https://ss64.com/bash/ulimit.html
The program hasn't done any significant memory allocations yet, so I don't think that's the problem, but do check the submit commands just in case.
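If it helps, the limits in effect on the compute nodes (which can differ from the head node) can be checked with a tiny throwaway job; SGE's qsub reads the job script from stdin when no script file is given, and the output/job names below are arbitrary:
$ echo 'ulimit -a' | qsub -S /bin/bash -cwd -j y -o node_limits.out -N checklimits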
Hi there @brianwalenz, I managed to get past this with:
./canu-1.8/Linux-amd64/bin/canu -p moncler -d /ibers/ernie/scratch/ans74/ \
genomeSize=0.5g stopOnLowCoverage=0 \
-nanopore-raw montcler_metagenomes/moncler_surface.fastq \
corMinCoverage=0 corOutCoverage=all corMhapSensitivity=high \
'correctedErrorRate=0.105' corMaxEvidenceCoverageLocal=10 corMaxEvidenceCoverageGlobal=10 oeaMemory=32 \
'redMemory=32' batMemory=200 \
java="/cm/shared/apps/java/jdk1.8.0_40/bin/java" \
gridEngineThreadsOption="-pe multithread THREADS" gridEngineMemoryOption="-l h_vmem=MEMORY" \
gridOptionsExecutive="-l h_vmem=30g" gridEngineArrayMaxJobs=5 gridOptions="-S /bin/bash" \
merylMemory=10 merylThreads=1 \
cormhapMemory=10 cormhapThreads=1 gridOptionscormhap="-l h_vmem=30g" \
mhapThreads=8 \
gridOptionscns="-l h_vmem=30g" \
gridOptionscor="-l h_vmem=30g" \
gridOptionscorovl="-l h_vmem=0.5g" corovlThreads=1 corovlMemory=0.5 \
gridOptionsobtovl="-l h_vmem=10g" obtovlThreads=1 obtovlMemory=10 \
gridOptionsutgovl="-l h_vmem=10g" utgovlThreads=1 utgovlMemory=10 \
gridOptionsobtmhap="-l h_vmem=30g" \
gridOptionsobtmmap="-l h_vmem=30g" \
gridOptionsovb="-l h_vmem=1g" ovbThreads=1 ovbMemory=1 \
gridOptionsovs="-l h_vmem=1g" ovsThreads=1 ovsMemory=1 \
gfaMemory=50g
I am now getting the following out of canu.out, though, without much explanation:
-- BEGIN ASSEMBLY
--
--
-- Graph alignment jobs failed, tried 2 times, giving up.
--
ABORT:
ABORT: Canu 1.8
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
Important files that may be useful for troubleshooting are below:
$ tail /ibers/ernie/scratch/ans74/unitigging/4-unitigger/alignGFA.1.out
Found perl:
/usr/bin/perl
Found java:
/cm/shared/apps/java/jdk1.8.0_40/bin/java
java version "1.8.0_40"
Found canu:
/ibers/ernie/home/ans74/canu-1.8/Linux-amd64/bin/canu
Use of implicit split to @_ is deprecated at /ibers/ernie/home/ans74/canu-1.8/Linux-amd64/bin/../lib/site_perl/canu/Grid_Cloud.pm line 73.
Canu 1.8
GFA alignments failed.
$ tail /ibers/ernie/scratch/ans74/unitigging/4-unitigger/unitigger.err -n50
==> BREAK REPEATS.
computeErrorProfiles()-- Computing error profiles for 30040 tigs, with 16 threads.
computeErrorProfiles()-- Finished.
markRepeatReads()-- Found 979 confused edges.
==> CLEANUP MISTAKES.
splitDiscontinuous()-- Tested 12238 tigs, split 7 tigs into 15 new tigs.
promoteToSingleton()-- Moved 79876 unplaced reads to singleton tigs.
dropDead()-- Modified 281 tigs. Dropped 129 first and 156 last reads, 4 tigs had both reads dropped.
splitDiscontinuous()-- Tested 12257 tigs, split 15 tigs into 31 new tigs.
promoteToSingleton()-- Moved 0 unplaced reads to singleton tigs.
==> CLEANUP GRAPH.
AssemblyGraph()-- rebuilding
AssemblyGraph()-- building reverse edges.
AssemblyGraph()-- rebuild complete.
AssemblyGraph()-- filtering edges
AssemblyGraph()-- 141846 contig edges and 0 unitig edges.
AssemblyGraph()-- 0 bubble edges and 0 repeat edges.
AssemblyGraph()-- 26850 middle contig edges filtered from 15238 reads.
AssemblyGraph()-- 0 repeat end edges filtered from 0 reads.
AssemblyGraph()-- 26850 repeat edges (not output).
AssemblyGraph()-- 2463 bubble edges.
AssemblyGraph()-- 2463 intersecting edges (from the end of a tig to somewhere else).
==> GENERATE OUTPUTS.
AssemblyGraph()-- generating './moncler.final.assembly.gfa'.
AssemblyGraph()-- Found 0 edges to unassembled contigs.
AssemblyGraph()-- 0 bubble placements
AssemblyGraph()-- 0 repeat placements
AssemblyGraph()-- Intratig edges: 0 contained 0 5' 0 3' (in both contig and unitig)
AssemblyGraph()-- Contig only edges: 28453 contained 109482 5' 109286 3'
AssemblyGraph()-- Unitig only edges: 0 contained 0 5' 0 3'
AssemblyGraph()-- Intercontig edges: 0 contained 1237 5' 1226 3' (in neither contig nor unitig)
AssemblyGraph()-- generating './moncler.contigs.gfa'.
==> GENERATE UNITIGS.
computeErrorProfiles()-- Computing error profiles for 110849 tigs, with 16 threads.
computeErrorProfiles()-- Finished.
splitDiscontinuous()-- Tested 8966 tigs, split 17 tigs into 34 new tigs.
AssemblyGraph()-- generating './moncler.unitigs.gfa'.
Bye.
Any thoughts, help? Thanks!
Probably insufficient memory, not related to the log you posted. Either edit alignGFA.jobSubmit-01.sh to increase the memory request and run it by hand, or add gridOptionsgfa='-l h_vmem=30g' to the canu options.
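If you go the manual route, a rough sketch (the path is inferred from the alignGFA.1.out location above): raise the -l h_vmem value inside alignGFA.jobSubmit-01.sh with your editor, then resubmit it by hand:
$ cd /ibers/ernie/scratch/ans74/unitigging/4-unitigger
$ sh alignGFA.jobSubmit-01.sh
Once that grid job finishes, restart the original canu command from the same -d directory and it should continue past the GFA alignment step.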
Thanks so much, I just managed to finish running this! I had to push it up to 200GB RAM, but it finished :) Thanks for all the help! Closing this issue now!
Running
./canu/Linux-amd64/bin/canu -p metagenome -d /scratch/metagenome_folder genomeSize=1g -nanopore-raw metagenome_folder/metagenome.fastq
I'm getting an Error 7 as follows. Any suggestions?
Canu version:
Ran on a Scientific Linux SGE server: