facebookresearch / deit

Official DeiT repository
Apache License 2.0

Multinode Slurm Training #204

Closed (yazdanimehdi closed this issue 1 year ago)

yazdanimehdi commented 1 year ago

Hello,

I'm trying to use run_with_submitit.py to train the model on a Slurm cluster, but I do not get any output log file showing the training progress. The only logs I have are the ones from each node initializing.

[Screenshot 2022-12-24 at 2 15 59 PM]

Can you please help me with this multinode training?

Best regards,
Mehdi