ghost closed this issue 2 years ago
@mtakamat thanks for your interest in our work. Are you able to run with one GPU?
python main.py --batch-size 72 --epochs 501 --min-lr 5e-6 --lr 1e-3 --training-mode 'SSL' --data-set 'STL10' --output 'checkpoints/SSL/STL10' --validate-every 10
@Sara-Ahmed Thank you for your help!!
I closed this issue.
@Sara-Ahmed Thank you for sharing your wonderful achievements!
When I ran self-supervised pre-training as described, the following subprocess.CalledProcessError was raised. Could you please help me solve this problem?
Command I ran:
python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --batch-size 72 --epochs 501 --min-lr 5e-6 --lr 1e-3 --training-mode 'SSL' --data-set 'STL10' --output 'checkpoints/SSL/STL10' --validate-every 10
Error encountered:
subprocess.CalledProcessError: Command '['/usr/bin/python', '-u', 'main.py', '--batch-size', '72', '--epochs', '501', '--min-lr', '5e-6', '--lr', '1e-3', '--training-mode', 'SSL', '--data-set', 'STL10', '--output', 'checkpoints/SSL/STL10', '--validate-every', '10']' returned non-zero exit status 2.
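A note on reading this error: the CalledProcessError only reports that a worker process exited with status 2; the actual cause is printed by the worker itself further up in the log. One common source of exit status 2 in Python scripts is argparse rejecting the command line, for example an unrecognized flag (Python can also exit with 2 when it cannot open the script file). Nothing in this thread confirms that either was the cause here. The sketch below only illustrates the argparse behaviour, using a hypothetical parser with two of the flags from the command above; it is not the real parser from main.py.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the parser in main.py; the real one
    # defines more arguments and may differ in its details.
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch-size", type=int, default=72)
    parser.add_argument("--epochs", type=int, default=501)
    return parser


def exit_code_for(argv):
    """Return the exit code argparse would produce for argv (0 if parsing succeeds)."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        # argparse prints a usage message to stderr and calls sys.exit
        # when it cannot parse the command line.
        return exc.code
    return 0


if __name__ == "__main__":
    print(exit_code_for(["--batch-size", "72", "--epochs", "501"]))
    print(exit_code_for(["--batch-size", "72", "--unknown-flag", "1"]))
```

Running the same command without the launcher, as a single process, usually makes the underlying message (a usage error, a missing file, or a traceback) easier to find.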