State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
[ELECTRA/TensorFlow2] Minor: README Invokes Slurm sbatch With Incorrect Parameter? #1325
Related to ELECTRA/TensorFlow2

Describe the bug
The README in the MultiNode section says

    BATCHSIZE=176 LR=6e-3 GRAD_ACCUM_STEPS=1 PHASE=1 STEPS=10000 WARMUP=2000 b1=0.878 b2=0.974 decay=0.5 skip_adaptive=yes end_lr=0.0 sbatch N48 --ntasks-per-node=8 run.sub
    BATCHSIZE=24 LR=4e-3 GRAD_ACCUM_STEPS=3 PHASE=2 STEPS=930 WARMUP=200 b1=0.878 b2=0.974 decay=0.5 skip_adaptive=yes end_lr=0.0 sbatch N48 --ntasks-per-node=8 run.sub

I think that this should be "-N48": the Slurm sbatch manpage has

    sbatch [OPTIONS(0)...] [ : [OPTIONS(N)...]] script(0) [args(0)...]
    :
    -N, --nodes=<minnodes>[-maxnodes]|<size_string>
        Request that a minimum of minnodes nodes be allocated to this job.

With the README command as given, sbatch would take "N48" to be the script name rather than an option, since per the synopsis the first non-option argument is script(0) and whatever follows it is args(0).
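To make the parsing argument concrete, here is a minimal sketch of the "options first, then script, then script arguments" rule from the synopsis. `first_non_option` is a hypothetical helper written for illustration, not part of Slurm, and it assumes every option carries its value inline (as in `-N48` or `--ntasks-per-node=8`); the real sbatch parser handles more cases.

```shell
#!/bin/sh
# Hypothetical helper (not a Slurm command): prints the argument that an
# "options, then script, then args" parser would take as the script name.
# Assumption: each option carries its value inline, e.g. -N48 or --nodes=48.
first_non_option() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --) shift; break ;;   # explicit end of options
            -*) shift ;;          # looks like an option, skip it
            *)  break ;;          # first non-option: treated as the script
        esac
    done
    printf '%s\n' "${1-}"
}

# README form: "N48" has no leading dash, so it is taken as the script.
first_non_option N48 --ntasks-per-node=8 run.sub    # prints N48

# Proposed form: "-N48" is an option, so run.sub is the script.
first_non_option -N48 --ntasks-per-node=8 run.sub   # prints run.sub

# Under that reading, the phase 1 line would become (not run here, needs Slurm):
#   BATCHSIZE=176 LR=6e-3 ... end_lr=0.0 sbatch -N48 --ntasks-per-node=8 run.sub
```

If this reading is right, the README form would most likely fail because no script named N48 exists, rather than silently running on the wrong node count.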
To Reproduce
N/A

Expected behavior
N/A

Environment
N/A