mschubert / clustermq

R package to send function calls as jobs on LSF, SGE, Slurm, PBS/Torque, or each of these via SSH
https://mschubert.github.io/clustermq/
Apache License 2.0

WIKI DOCS: Outdated information about PBS #117

Closed: HenrikBengtsson closed this issue 5 years ago

HenrikBengtsson commented 5 years ago

https://github.com/mschubert/clustermq/wiki/PBS mentions `cores` where it should be `n_jobs`:

#PBS -N {{ job_name }}
#PBS -l select=1:ncpus={{ cores | 1 }}
#PBS -l walltime={{ walltime | 1:00:00 }}
#PBS -q default
#PBS -o {{ log_file | /dev/null }}
#PBS -j oe

ulimit -v $(( 1024 * {{ memory | 4096 }} ))
R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'
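
For reference, the `{{ name | default }}` placeholders are filled in at submission time, with the value after the pipe used as the fallback. With hypothetical values (`job_name="myjob"`, `master="tcp://headnode:6000"`, everything else left at its default) the template above would expand to something like:

```shell
# Hypothetical expansion of the wiki template; values after "| " in the
# template are the defaults used when no value is supplied.
#PBS -N myjob
#PBS -l select=1:ncpus=1
#PBS -l walltime=1:00:00
#PBS -q default
#PBS -o /dev/null
#PBS -j oe

ulimit -v $(( 1024 * 4096 ))
R --no-save --no-restore -e 'clustermq:::worker("tcp://headnode:6000")'
```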

mschubert commented 5 years ago

`cores` is correct (though not required; the other templates don't list the number of cores per job), but `n_jobs` should be listed as:

#PBS -l nodes={{ n_jobs }}:ppn=1

Fixed.

HenrikBengtsson commented 5 years ago

I see. So, to be 100% sure I'm on the same page, you did indeed mean:

#PBS -l nodes={{ n_jobs }}:ppn=1

and not

#PBS -l nodes=1:ppn={{ n_jobs }}

correct? I ask because ZeroMQ allows the workers to be distributed across hosts, while still working fine on a single host if the scheduler happens to place them there.

mschubert commented 5 years ago

Err, yes, copy & paste error. Good catch!

mschubert commented 5 years ago

Actually, I'm no longer sure.

#PBS -l nodes={{ n_jobs }}:ppn=1

would request n_jobs nodes with one processor each, and

#PBS -l nodes=1:ppn={{ n_jobs }}

would request 1 node with n_jobs processors.
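
Concretely, with a hypothetical `n_jobs = 4`, the two forms would expand to:

```shell
#PBS -l nodes=4:ppn=1   # 4 nodes, 1 processor per node: workers spread across hosts
#PBS -l nodes=1:ppn=4   # 1 node, 4 processors: all workers on a single host
```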

It depends on how this is implemented in PBS, which I don't know. And I don't think we've had any PBS users yet who could have reported on this.

mschubert commented 5 years ago

I have now merged `n_jobs` and `cores` into one line that I think (according to the documentation) is correct:

#PBS -l nodes={{ n_jobs }}:ppn={{ cores | 1 }}
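
As a rough sketch of the substitution (not clustermq's actual templating code, which fills the `{{ name | default }}` placeholders internally in R), plugging in hypothetical values `n_jobs = 4` and `cores = 2` gives:

```shell
# Sketch only: substitute the two placeholders with sed to show the
# resulting resource request; clustermq does this internally.
template='#PBS -l nodes={{ n_jobs }}:ppn={{ cores | 1 }}'
echo "$template" | sed -e 's/{{ n_jobs }}/4/' -e 's/{{ cores | 1 }}/2/'
# prints: #PBS -l nodes=4:ppn=2
```

i.e. 4 nodes with 2 processors each, which combines the per-job parallelism (`cores`) with the number of workers (`n_jobs`) in a single resource line.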