Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GNU General Public License v3.0
torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) #73
Open
jay985735639 opened 1 year ago
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 13077 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 13076) of binary: /home/yyx/anaconda3/envs/yolov7/bin/python
python -m torch.distributed.launch --nproc_per_node 2 --master_port 9527 train.py --workers 8 --device 0,1 --sync-bn --batch-size 8 --data data/cloth_1.6.yaml --img 1280 1280 --cfg cfg/training/yolov7.yaml --weights yolov7.pt --name cloth_1.6 --hyp data/hyp.scratch.custom.yaml
There seems to be a problem with distributed training.
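
The log above only says that the worker with local rank 0 exited with code 1; it does not show why. As a first sanity check, the launch arguments can be tested for internal consistency before a run. The following is a minimal sketch, not part of the YOLOv7 repository: the helper name `check_launch_args` is hypothetical, and the two rules it checks (one device per process, batch size divisible by the process count) are assumptions about what is worth verifying, not a diagnosis of this failure. The command string is the one from this report, shortened to the flags the check reads.

```python
import shlex


def check_launch_args(command):
    """Return a list of inconsistencies found in a multi-GPU launch command.

    Hypothetical helper: it only inspects the command text and never
    imports torch or touches the GPUs.
    """
    tokens = shlex.split(command)

    def value_of(flag):
        # Return the token that follows `flag`, or None if it is absent.
        if flag in tokens and tokens.index(flag) + 1 < len(tokens):
            return tokens[tokens.index(flag) + 1]
        return None

    nproc = value_of("--nproc_per_node")
    if nproc is None:
        return ["--nproc_per_node not found"]
    nproc = int(nproc)

    problems = []
    devices = value_of("--device")
    if devices is not None and len(devices.split(",")) != nproc:
        # Assumed rule: one listed device per launched process.
        problems.append("--device lists a different number of GPUs than --nproc_per_node")

    batch = value_of("--batch-size")
    if batch is not None and int(batch) % nproc != 0:
        # Assumed rule: the total batch is split evenly across processes.
        problems.append("--batch-size is not divisible by --nproc_per_node")

    return problems


command = (
    "python -m torch.distributed.launch --nproc_per_node 2 --master_port 9527 "
    "train.py --workers 8 --device 0,1 --sync-bn --batch-size 8"
)
print(check_launch_args(command))  # prints []
```

The command from this report passes both checks, so the cause is likely elsewhere, for example in the traceback that the failing worker prints before the launcher's summary.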