Nextomics / NextDenovo

Fast and accurate de novo assembler for long reads
GNU General Public License v3.0

slurmstepd: error: *** JOB CANCELLED DUE TO TIME LIMIT *** #114

Open dpaudel opened 3 years ago

dpaudel commented 3 years ago

Describe the bug I am running NextDenovo on a Slurm system. The job runs for around 10 minutes and is then cancelled with the following error:

Error message

[ERROR] 2021-05-11 12:06:27,882 db_stat failed: please check the following logs:
[ERROR] 2021-05-11 12:06:27,910 bberry/14-nextdenovo/01_rundir/01.raw_align/01.db_stat.sh.work/dDenovo.sh.e

~/bberry/14-nextdenovo/01_rundir/01.raw_align/01.db_stat.sh.work/dDenovo.sh.e
hostname
+ hostname
cd /bberry/14-nextdenovo/01_rundir/01.raw_align/01.db_stat.sh.work/db_stat0
+ cd /orange/zdeng/dev.paudel/bberry/14-nextdenovo/01_rundir/01.raw_align/01.db_stat.sh.work/db_stat0
time /apps/nextdenovo/2.4.0/bin/seq_stat -f 3k -g 1g -d 45 -o /bberry/14-nextdenovo/01_rundir/01.raw.reads.stat /bberry/14-nextdenovo/input.fofn
+ /apps/nextdenovo/2.4.0/bin/seq_stat -f 3k -g 1g -d 45 -o /bberry/14-nextdenovo/01_rundir/01.raw.reads.stat /bberry/14-nextdenovo/input.fofn
slurmstepd: error: *** JOB 867816 ON c0700a-s1 CANCELLED AT 2021-05-11T12:06:17 DUE TO TIME LIMIT ***

Genome characteristics 1g, repeat ~ 35%

Input data 160x nanopore data

Config file

[General]
job_type = slurm # local, slurm, sge, pbs, lsf
job_prefix = nextDenovo
task = all # all, correct, assemble
rewrite = yes # yes/no
deltmp = yes
parallel_jobs = 40 # number of tasks used to run in parallel
input_type = raw # raw, corrected
read_type = ont # clr, ont, hifi
input_fofn = input.fofn
workdir = 01_rundir

[correct_option]
read_cutoff = 3k
genome_size = 1g # estimated genome size
sort_options = -m 20g -t 15
minimap2_options_raw = -t 8
pa_correction = 3 # number of correction tasks run in parallel; each task requires ~TOTAL_INPUT_BASES/4 bytes of memory
correction_options = -p 15
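As a sanity check on these settings, the memory note in the config comment can be turned into a quick estimate for this dataset (1g genome at 160x nanopore coverage, per the issue; the divide-by-4 factor comes from the comment above, so treat the result as a rough guide rather than an exact requirement):

```python
# Rough per-task memory estimate for NextDenovo correction,
# using the rule of thumb from the config comment:
# each pa_correction task needs ~TOTAL_INPUT_BASES / 4 bytes.
genome_size_bp = 1e9   # 1g genome (from the config)
coverage = 160         # 160x input data (from the issue)

total_input_bases = genome_size_bp * coverage        # 1.6e11 bases
mem_per_task_gb = total_input_bases / 4 / 1e9        # bytes -> GB

pa_correction = 3      # parallel correction tasks (from the config)
total_mem_gb = mem_per_task_gb * pa_correction

print(f"~{mem_per_task_gb:.0f} GB per correction task")   # ~40 GB
print(f"~{total_mem_gb:.0f} GB if all {pa_correction} tasks run at once")  # ~120 GB
```

With numbers of this size, the jobs are long-running, which is why a short default Slurm wall-time limit cancels them.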

[assemble_option]
minimap2_options_cns = -t 8
nextgraph_options = -a 1

Operating system
LSB Version: :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 7.7 (Maipo)
Release: 7.7
Codename: Maipo

GCC gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC)

Python python/3.8

NextDenovo nextdenovo/2.4.0

Additional context (Optional) slurm-drmaa/1.2.1.20. Is there a -time option that can be included so that the Slurm job is submitted with the given time limit?

moold commented 3 years ago

Hi, see #48. Alternatively, if your Slurm system has a time-limit option (I do not have a Slurm system, so I cannot test this), you can try setting cluster_options=--cpus-per-task={cpu} --mem-per-cpu={vf} time_limited_option.
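If the cluster enforces wall time through sbatch's standard `--time` flag, the suggestion above might look like this in the `[General]` section of the config (the 72-hour value is only an illustrative placeholder; `--cpus-per-task`, `--mem-per-cpu`, and `--time` are standard sbatch options, but whether they are accepted and what limits apply depends on the local Slurm partition policy):

```ini
[General]
job_type = slurm
cluster_options = --cpus-per-task={cpu} --mem-per-cpu={vf} --time=72:00:00
```

NextDenovo substitutes `{cpu}` and `{vf}` per task; anything else in `cluster_options` is passed through to the scheduler as-is.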