Batch job submission failed: Access/permission denied

llq0325 commented 2 years ago

Hi there,

I am running Canu 2.2 and having a problem submitting jobs to the grid (Slurm). Could you please let me know whether my grid settings are wrong, and if so, how to fix them?

The command I used:

/dss/dsslegfs01/pr53da/pr53da-dss-0022/nobackup/private/CNV/bin/canu/canu-2.2/bin/canu -p asm -d canu_test genomeSize=0.95g \
gridOptions="--clusters=biohpc_gen --partition=biohpc_gen_production -t 10-00:00:00 --get-user-env" -maxMemory=190g -maxThreads=240 \
-pacbio-hifi /dss/dsslegfs01/pr53da/pr53da-dss-0035/2022/1.Data_QC/raw_fastq/*.ccs.fastq.gz

and the log file looks like this:

   /dss/dsshome1/lrz/sys/spack/release/21.1.1/opt/x86_64/perl/5.30.3-gcc-opnd7zn/bin/perl
   This is perl 5, version 30, subversion 3 (v5.30.3) built for x86_64-linux-thread-multi

Found java:
   /dss/dsslegfs02/pn73se/pn73se-dss-0000/spack/opt/linux-sles15-skylake_avx512/jdk/16.0.2-gcc-8.4.0-2snwgdt/bin/java
   java version "16.0.2" 2021-07-20

Found canu:
   /dss/dsslegfs01/pr53da/pr53da-dss-0022/nobackup/private/CNV/bin/canu/canu-2.2/bin/canu
   canu 2.2

-- canu 2.2
--
-- CITATIONS
--
-- For assemblies of PacBio HiFi reads:
--   Nurk S, Walenz BP, Rhiea A, Vollger MR, Logsdon GA, Grothe R, Miga KH, Eichler EE, Phillippy AM, Koren S.
--   HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.
--   biorXiv. 2020.
--   https://doi.org/10.1101/2020.03.14.992248
-- 
-- Read and contig alignments during correction and consensus use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
-- 
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
-- 
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
-- 
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
-- 
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '16.0.2' (from '/dss/dsslegfs02/pn73se/pn73se-dss-0000/spack/opt/linux-sles15-skylake_avx512/jdk/16.0.2-gcc-8.4.0-2snwgdt/bin/java') without -d64 support.
--
-- WARNING:
-- WARNING:  Failed to run gnuplot using command 'gnuplot'.
-- WARNING:  Plots will be disabled.
-- WARNING:
--
--
-- Detected 1 CPUs and 4096 gigabytes of memory on the local machine.
--
-- WARNING: maxThreads=240 has no effect when only 1 CPUs present.
-- Detected Slurm with task IDs up to 1000 allowed.
-- 
-- Slurm support detected.  Resources available:
--      4 hosts with  80 cores and  687 GB memory.
--      4 hosts with  80 cores and  373 GB memory.
--      3 hosts with  80 cores and 1503 GB memory.
--      2 hosts with  72 cores and 3007 GB memory.
--
-- Job limits:
--    190 gigabytes memory  (maxMemory option).
--    240 CPUs              (maxThreads option).
--
--                         (tag)Threads
--                (tag)Memory         |
--        (tag)             |         |  algorithm
--        -------  ----------  --------  -----------------------------
-- Grid:  meryl     24.000 GB    8 CPUs  (k-mer counting)
-- Grid:  hap       12.000 GB    8 CPUs  (read-to-haplotype assignment)
-- Grid:  cormhap   32.000 GB    8 CPUs  (overlap detection with mhap)
-- Grid:  obtovl    16.000 GB    8 CPUs  (overlap detection)
-- Grid:  utgovl    16.000 GB    8 CPUs  (overlap detection)
-- Grid:  cor        -.--- GB    4 CPUs  (read correction)
-- Grid:  ovb        4.000 GB    1 CPU   (overlap store bucketizer)
-- Grid:  ovs       16.000 GB    1 CPU   (overlap store sorting)
-- Grid:  red       32.000 GB    8 CPUs  (read error detection)
-- Grid:  oea        8.000 GB    1 CPU   (overlap error adjustment)
-- Grid:  bat      190.000 GB   16 CPUs  (contig construction with bogart)
-- Grid:  cns        -.--- GB    8 CPUs  (consensus)
--
-- Found PacBio HiFi reads in 'asm.seqStore':
--   Libraries:
--     PacBio HiFi:           3
--   Reads:
--     Corrected:             102099801218
--     Corrected and Trimmed: 102099801218
--
--
-- Generating assembly 'asm' in '/dss/dsslegfs01/pr53da/pr53da-dss-0035/2022__Coturnix/3.Assemble/canu_test':
--   genomeSize:
--     950000000
--
--   Overlap Generation Limits:
--     corOvlErrorRate 0.0000 (  0.00%)
--     obtOvlErrorRate 0.0250 (  2.50%)
--     utgOvlErrorRate 0.0100 (  1.00%)
--
--   Overlap Processing Limits:
--     corErrorRate    0.0000 (  0.00%)
--     obtErrorRate    0.0250 (  2.50%)
--     utgErrorRate    0.0003 (  0.03%)
--     cnsErrorRate    0.0500 (  5.00%)
--
--   Stages to run:
--     assemble HiFi reads.
--
--
-- Correction skipped; not enabled.
--
-- Trimming skipped; not enabled.
--
-- BEGIN ASSEMBLY
--
-- Running jobs.  First attempt out of 2.
--
-- Failed to submit compute jobs.  Delay 10 seconds and try again.

CRASH:
CRASH: canu 2.2
CRASH: Please panic, this is abnormal.
CRASH:
CRASH:   Failed to submit compute jobs.
CRASH:
CRASH: Failed at /dss/dsslegfs01/pr53da/pr53da-dss-0022/nobackup/private/CNV/bin/canu/canu-2.2/bin/../lib/site_perl/canu/Execution.pm line 1259.
CRASH:  canu::Execution::submitOrRunParallelJob("asm", "meryl", "unitigging/0-mercounts", "meryl-count", 1, 2, 3, 4, ...) called at /dss/dsslegfs01/pr53da/pr53da-dss-0022/nobackup/private/CNV/bin/canu/canu-2.2/bin/../lib/site_perl/canu/Meryl.pm line 847
CRASH:  canu::Meryl::merylCountCheck("asm", "utg") called at /dss/dsslegfs01/pr53da/pr53da-dss-0022/nobackup/private/CNV/bin/canu/canu-2.2/bin/canu line 1117
CRASH: 
CRASH: Last 50 lines of the relevant log file (unitigging/0-mercounts/meryl-count.jobSubmit-01.out):
CRASH:
CRASH: sbatch: error: Batch job submission failed: Access/permission denied
CRASH:

skoren commented 2 years ago

It sounds like your cluster isn't configured to allow compute nodes to submit jobs: https://www.mail-archive.com/slurm-dev@schedmd.com/msg01762.html. This is a requirement to run Canu on the grid: https://canu.readthedocs.io/en/latest/faq.html#my-run-stopped-with-the-error-failed-to-submit-batch-jobs. I assume your cluster admin won't change this setting just to run Canu.
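One way to confirm this independently of Canu is to submit a tiny job whose only payload is a second sbatch call, so the inner submission is attempted from a compute node. The following is a minimal sketch, not something from the Canu docs: the cluster and partition names are copied from the gridOptions in the original command, while the time limits, the job name submit-test, the helper build_submit_test, and the RUN_SUBMIT_TEST switch are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: check whether a compute node is allowed to call sbatch.
# Cluster/partition come from the gridOptions in the original command;
# time limits, job name, and RUN_SUBMIT_TEST are illustrative assumptions.

GRID_OPTS="--clusters=biohpc_gen --partition=biohpc_gen_production"

# Build the command line for the outer job. Its payload is a second sbatch
# call, so the inner submission happens on a compute node.
build_submit_test() {
    local inner="sbatch $GRID_OPTS -t 00:01:00 --wrap hostname"
    printf 'sbatch %s -t 00:05:00 --job-name=submit-test --wrap "%s"\n' "$GRID_OPTS" "$inner"
}

cmd=$(build_submit_test)
echo "Command: $cmd"

# Submit only when explicitly requested and when sbatch is available here.
if [ "${RUN_SUBMIT_TEST:-0}" = 1 ] && command -v sbatch >/dev/null 2>&1; then
    eval "$cmd"
fi
```

Run it with RUN_SUBMIT_TEST=1 from a login node, then look at the outer job's output file (slurm-<jobid>.out by default). If it contains the same "Access/permission denied" error, the restriction is on the cluster side rather than in the Canu options.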

Given that you have HiFi data and a genome that isn't too big, I'd just run on a single node with useGrid=false. Reserve one of the 80-core nodes with more than 600 GB of memory for Canu and let it run there, something like:

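#!/bin/bash

# Request 80 cores and 600 GB of memory on a single node for up to 96 hours.
#SBATCH --job-name=canu
#SBATCH --nodes=1
#SBATCH --cpus-per-task=80
#SBATCH --time=96:00:00
#SBATCH --mem=600GB

# useGrid=false keeps every stage inside this one job, so Canu never calls sbatch itself.
# maxMemory and maxThreads match the allocation requested above.
/dss/dsslegfs01/pr53da/pr53da-dss-0022/nobackup/private/CNV/bin/canu/canu-2.2/bin/canu -p asm -d canu_test genomeSize=0.95g useGrid=false maxMemory=600g -maxThreads=80 \
-pacbio-hifi /dss/dsslegfs01/pr53da/pr53da-dss-0035/2022/1.Data_QC/raw_fastq/*.ccs.fastq.gz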
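
If you would rather not hard-code the limits twice, they can be derived from the allocation inside the job. This is only a sketch: canu_limits is a hypothetical helper, and it assumes the job was submitted with --cpus-per-task and --mem so that Slurm sets SLURM_CPUS_PER_TASK and SLURM_MEM_PER_NODE (the latter in megabytes). The fallback values are there only so the snippet can be dry-run outside Slurm.

```shell
#!/bin/bash
# Sketch: derive Canu's resource options from the Slurm allocation.
# canu_limits is a hypothetical helper, not part of Canu or Slurm.

canu_limits() {
    # Set by Slurm inside a job when --cpus-per-task / --mem were requested.
    local cpus="${SLURM_CPUS_PER_TASK:-1}"
    local mem_mb="${SLURM_MEM_PER_NODE:-4096}"   # megabytes
    local mem_gb=$(( mem_mb / 1024 ))            # integer gigabytes, rounded down
    echo "useGrid=false maxThreads=${cpus} maxMemory=${mem_gb}g"
}

# Print the resulting command line instead of running it; <reads> is a placeholder.
echo "canu -p asm -d canu_test genomeSize=0.95g $(canu_limits) -pacbio-hifi <reads>"
```

With the 80-core, 600 GB request above, the helper should yield the same maxThreads and maxMemory values as the hand-written command, and changing the #SBATCH lines later keeps the two in step.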

llq0325 commented 2 years ago

Thanks for replying!

marbl / canu

Batch job submission failed: Access/permission denied #2092