itcarroll closed this issue 5 years ago
This is normal. The executive does some lightweight computation to configure the compute-intensive parallel stages for grid execution, then stops and tells you to submit the jobs.
What isn't normal is:
ABORT: Requested memory 'memory=9' (GB) is more than physical memory 7.80 GB.
The meryl stage is trying to configure for a 9 GB grid job, but it fails because the machine you're on has only 8 GB. A workaround is to set merylMemory=7g. For bacteria, that's far more memory than it needs.
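The abort above comes down to a comparison between the requested memory and the physical memory of the machine. A minimal sketch of that comparison follows; `fits_in_memory` is a hypothetical helper written for illustration, not Canu's actual check, and the two numbers are the ones from the abort message.

```shell
# Hypothetical helper, not part of Canu: succeeds only if the requested
# memory (in GB) does not exceed the physical memory (in GB).
fits_in_memory() {
    requested_gb=$1
    physical_gb=$2
    # awk handles the floating-point comparison; exit status 0 means "fits".
    awk -v r="$requested_gb" -v p="$physical_gb" 'BEGIN { exit !(r <= p) }'
}

# Values taken from the abort message: 9 GB requested, 7.80 GB physical.
if fits_in_memory 9 7.80; then
    echo "9 GB fits"
else
    echo "9 GB does not fit; try a smaller merylMemory"
fi
```

With the suggested merylMemory=7g, the same check (`fits_in_memory 7 7.80`) succeeds.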
Thanks for the rapid response! The workaround does not work, and I have a related question. Say I try this with genomeSize=980m: will the executive know not to do any work locally (we're not supposed to use a head node for long-running jobs)? (Never mind, I re-read your answer.)
The workaround fails with the same error; it does not seem to be honoring the memory request:
icarroll@sshgw02:test$ ../bin/canu genomeSize=4.8m useGrid=remote merylMemory=7g -p ecoli -d ecoli-oxford -nanopore-raw /nfs/icarroll-data/support/dhawthorne/test/oxford.fasta
-- Canu snapshot v1.8 +117 changes (r9327 dc859c7b4d2065d9412d5683e71d289af6ebf7ed)
--
-- CITATIONS
--
-- Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
-- Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
-- Genome Res. 2017 May;27(5):722-736.
-- http://doi.org/10.1101/gr.215087.116
--
-- Koren S, Rhie A, Walenz BP, Dilthey AT, Bickhart DM, Kingan SB, Hiendleder S, Williams JL, Smith TPL, Phillippy AM.
-- De novo assembly of haplotype-resolved genomes with trio binning.
-- Nat Biotechnol. 2018
-- https//doi.org/10.1038/nbt.4277
--
-- Read and contig alignments during correction, consensus and GFA building use:
-- Šošic M, Šikic M.
-- Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
-- Bioinformatics. 2017 May 1;33(9):1394-1395.
-- http://doi.org/10.1093/bioinformatics/btw753
--
-- Overlaps are generated using:
-- Berlin K, et al.
-- Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
-- Nat Biotechnol. 2015 Jun;33(6):623-30.
-- http://doi.org/10.1038/nbt.3238
--
-- Myers EW, et al.
-- A Whole-Genome Assembly of Drosophila.
-- Science. 2000 Mar 24;287(5461):2196-204.
-- http://doi.org/10.1126/science.287.5461.2196
--
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
-- Chin CS, et al.
-- Phased diploid genome assembly with single-molecule real-time sequencing.
-- Nat Methods. 2016 Dec;13(12):1050-1054.
-- http://doi.org/10.1038/nmeth.4035
--
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
-- Chin CS, et al.
-- Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
-- Nat Methods. 2013 Jun;10(6):563-9
-- http://doi.org/10.1038/nmeth.2474
--
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_191' (from 'java') with -d64 support.
--
-- WARNING:
-- WARNING: Failed to run gnuplot using command 'gnuplot'.
-- WARNING: Plots will be disabled.
-- WARNING:
--
-- Detected 2 CPUs and 8 gigabytes of memory.
-- Detected Slurm with 'sinfo' binary in /usr/bin/sinfo.
-- Detected Slurm with 'MaxArraySize' limited to 1000 jobs.
--
-- Found 2 hosts with 4 cores and 9 GB memory under Slurm control.
-- Found 4 hosts with 8 cores and 121 GB memory under Slurm control.
-- Found 20 hosts with 8 cores and 59 GB memory under Slurm control.
--
-- (tag)Threads
-- (tag)Memory |
-- (tag) | | algorithm
-- ------- ------ -------- -----------------------------
-- Grid: meryl 7 GB 4 CPUs (k-mer counting)
-- Grid: hap 8 GB 4 CPUs (read-to-haplotype assignment)
-- Grid: cormhap 6 GB 4 CPUs (overlap detection with mhap)
-- Grid: obtovl 4 GB 4 CPUs (overlap detection)
-- Grid: utgovl 4 GB 4 CPUs (overlap detection)
-- Grid: cor --- GB 4 CPUs (read correction)
-- Grid: ovb 4 GB 1 CPU (overlap store bucketizer)
-- Grid: ovs 8 GB 1 CPU (overlap store sorting)
-- Grid: red 8 GB 4 CPUs (read error detection)
-- Grid: oea 4 GB 1 CPU (overlap error adjustment)
-- Grid: bat 16 GB 4 CPUs (contig construction with bogart)
-- Grid: cns --- GB 4 CPUs (consensus)
-- Grid: gfa 8 GB 4 CPUs (GFA alignment and processing)
--
-- In 'ecoli.seqStore', found Nanopore reads:
-- Raw: 20365
-- Corrected: 0
-- Trimmed: 0
--
-- Generating assembly 'ecoli' in '/research-home/icarroll/support/dhawthorne/test/ecoli-oxford'
--
-- Parameters:
--
-- genomeSize 4800000
--
-- Overlap Generation Limits:
-- corOvlErrorRate 0.3200 ( 32.00%)
-- obtOvlErrorRate 0.1200 ( 12.00%)
-- utgOvlErrorRate 0.1200 ( 12.00%)
--
-- Overlap Processing Limits:
-- corErrorRate 0.5000 ( 50.00%)
-- obtErrorRate 0.1200 ( 12.00%)
-- utgErrorRate 0.1200 ( 12.00%)
-- cnsErrorRate 0.2000 ( 20.00%)
--
--
-- BEGIN CORRECTION
--
-- segments memory batches
-- -------- -------- -------
ABORT:
ABORT: Canu snapshot v1.8 +117 changes (r9327 dc859c7b4d2065d9412d5683e71d289af6ebf7ed)
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
ABORT: failed to parse meryl configure output 'correction/0-mercounts/ecoli.ms16.config.01.out'.
ABORT:
ABORT: Disk space available: 588.775 GB
ABORT:
ABORT: Last 50 lines of the relevant log file (correction/0-mercounts/ecoli.ms16.config.01.out):
ABORT:
ABORT: equal-to N return kmers that occur exactly N times in the input. accepts exactly one input.
ABORT: not-equal-to N return kmers that do not occur exactly N times in the input. accepts exactly one input.
ABORT:
ABORT: increase X add X to the count of each kmer.
ABORT: decrease X subtract X from the count of each kmer.
ABORT: multiply X multiply the count of each kmer by X.
ABORT: divide X divide the count of each kmer by X.
ABORT: modulo X set the count of each kmer to the remainder of the count divided by X.
ABORT:
ABORT: union return kmers that occur in any input, set the count to the number of inputs with this kmer.
ABORT: union-min return kmers that occur in any input, set the count to the minimum count
ABORT: union-max return kmers that occur in any input, set the count to the maximum count
ABORT: union-sum return kmers that occur in any input, set the count to the sum of the counts
ABORT:
ABORT: intersect return kmers that occur in all inputs, set the count to the count in the first input.
ABORT: intersect-min return kmers that occur in all inputs, set the count to the minimum count.
ABORT: intersect-max return kmers that occur in all inputs, set the count to the maximum count.
ABORT: intersect-sum return kmers that occur in all inputs, set the count to the sum of the counts.
ABORT:
ABORT: difference return kmers that occur in the first input, but none of the other inputs
ABORT: symmetric-difference return kmers that occur in exactly one input
ABORT:
ABORT: MODIFIERS:
ABORT:
ABORT: output O write kmers generated by the present command to an output meryl database O
ABORT: mandatory for count operations.
ABORT:
ABORT: EXAMPLES:
ABORT:
ABORT: Example: Report 22-mers present in at least one of input1.fasta and input2.fasta.
ABORT: Kmers from each input are saved in meryl databases 'input1' and 'input2',
ABORT: but the kmers in the union are only reported to the screen.
ABORT:
ABORT: meryl print \
ABORT: union \
ABORT: [count k=22 input1.fasta output input1] \
ABORT: [count k=22 input2.fasta output input2]
ABORT:
ABORT: Example: Find the highest count of each kmer present in both files, save the kmers to
ABORT: database 'maxCount'.
ABORT:
ABORT: meryl intersect-max input1 input2 output maxCount
ABORT:
ABORT: Example: Find unique kmers common to both files. Brackets are necessary
ABORT: on the first 'equal-to' command to prevent the second 'equal-to' from
ABORT: being used as an input to the first 'equal-to'.
ABORT:
ABORT: meryl intersect [equal-to 1 input1] equal-to 1 input2
ABORT:
ABORT: Requested memory 'memory=9' (GB) is more than physical memory 7.80 GB.
ABORT:
First, the failing memory: it found the result from the first run of 'meryl-configure.sh' (using 9 GB of memory). A long-standing problem with Canu is that changing parameters in the middle of an assembly doesn't always work correctly. Removing 0-mercounts/ will let this step be rerun.
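As a sketch of that cleanup, assuming the layout shown in the log (the stale configuration lives under `correction/0-mercounts` inside the assembly directory given with `-d`), something like the following could be used. The function name `reset_meryl_config` is hypothetical and only for illustration.

```shell
# Hypothetical helper: remove the stale meryl configuration so that the
# next Canu run regenerates it with the current merylMemory setting.
reset_meryl_config() {
    asm_dir=$1
    stale="$asm_dir/correction/0-mercounts"
    if [ -d "$stale" ]; then
        rm -r -- "$stale"
        echo "removed $stale"
    else
        echo "nothing to remove at $stale"
    fi
}

# Usage, from the directory where canu was launched with -d ecoli-oxford:
#   reset_meryl_config ecoli-oxford
# then rerun the same canu command with merylMemory=7g.
```

Only the meryl configuration directory is removed; the rest of the assembly directory, including the read store, is left in place.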
As for head nodes and 'remote' - yup, it'll try to run some long-running jobs on the head node. In particular, there are several steps that want to process all reads or all overlaps.
An alternate strategy would be to grab an interactive node and run canu useGrid=remote on there. This would solve both the memory and time problems.
What are you trying to accomplish with useGrid=remote? Maybe we can think up a different solution.
> First, the failing memory: it found the result from the first run of 'meryl-configure.sh' (using 9 GB of memory). A long-standing problem with Canu is that changing parameters in the middle of an assembly doesn't always work correctly. Removing 0-mercounts/ will let this step be rerun.
Yep, should have caught that. Working now. So I'm all set, but leaving open for the bug you flagged.
> As for head nodes and 'remote' - yup, it'll try to run some long-running jobs on the head node. In particular, there are several steps that want to process all reads or all overlaps.
Okay, with the merylMemory config low enough to run on the head node, it actually doesn't take too much time. The reason I am trying useGrid=remote is to understand before submission what the resource request will be.
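One way to read the resource requests off the configuration report is to filter its "Grid:" rows. The sketch below assumes the report has been saved to a file and that the rows look like the ones in the log above; `grid_requests` and `max_grid_memory` are hypothetical helper names, not Canu commands.

```shell
# Hypothetical helper: read Canu's configuration report on stdin and print
# "<tag> <memory GB> <CPUs>" for every "Grid:" row. A memory value of
# "---" means the report showed no fixed value for that stage.
grid_requests() {
    awk '$1 == "--" && $2 == "Grid:" { print $3, $4, $6 }'
}

# Hypothetical helper: largest fixed memory request (GB) in the report.
max_grid_memory() {
    grid_requests |
        awk '$2 != "---" && $2 + 0 > max { max = $2 + 0 } END { print max + 0 }'
}

# Small demonstration using two rows copied from the report above.
grid_requests <<'EOF'
-- Grid: meryl 7 GB 4 CPUs (k-mer counting)
-- Grid: cor --- GB 4 CPUs (read correction)
EOF
```

For a saved report, `max_grid_memory < canu.log` (where `canu.log` is a placeholder file name) would print the largest per-job memory request among the stages with a fixed value.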
I (finally) fixed the meryl configuration problem. You can now happily configure it for memory larger than available on the head node.
The useGrid=remote parameter is ignored (see below). The default behavior (useGrid=true) submits a job as expected.