Thank you for your excellent work! I have some trouble with training:
I tried to install Slurm for cluster job scheduling, but unfortunately many attempts failed. What I want to know is: does it have any impact on training if we don't use the `srun` command and instead execute the training script directly (for example, running `./distributed_pretrain.sh 8 '/path/to/your/dataset' ...` in the pre-training stage)?
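For reference, if the script wraps a standard PyTorch distributed entry point, a single-node launch without Slurm usually looks something like the sketch below. This is only a hypothetical example: the entry-point name `main_pretrain.py` and the flags shown are assumptions about a generic PyTorch DDP setup, not this repository's actual interface; `distributed_pretrain.sh` and the dataset path are taken from the question above.

```shell
# Hypothetical single-node launch without slurm/srun (assumed generic
# PyTorch DDP setup; the actual entry point and flags may differ here).
torchrun --nproc_per_node=8 \
         --nnodes=1 \
         --node_rank=0 \
         --master_addr=127.0.0.1 \
         --master_port=29500 \
         main_pretrain.py --data_path '/path/to/your/dataset'
```

With a single node, `torchrun` sets up the same rendezvous environment variables (`RANK`, `WORLD_SIZE`, `MASTER_ADDR`, ...) that `srun` would otherwise provide, so training itself should be unaffected as long as those are populated correctly.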