AILab-CVC / YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection
https://www.yoloworld.cc
GNU General Public License v3.0

torch.distributed.elastic.multiprocessing.errors.ChildFailedError: #205

Open apple32112311 opened 5 months ago

apple32112311 commented 5 months ago

When I ran this command:

./tools/dist_train.sh configs/pretrain/yolo_world_v2_x_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_goldg_train_lvis_minival.py 1 --amp

I got this error: torch.distributed.elastic.multiprocessing.errors.ChildFailedError. How can I debug it? Thanks a lot.

[Screenshot 2024-04-03 015553]

wondervictor commented 5 months ago

Hi @apple32112311, torch.distributed does not support 1 GPU. If you have only one GPU, we suggest running the training script directly instead of the distributed launcher:

python tools/train.py ....
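
As a rough illustration of the suggestion above, the choice between the two entry points can be sketched in a small helper. This is only a sketch: `build_train_command` is a hypothetical name, not part of YOLO-World, and it assumes that `tools/train.py` takes the config path as its first argument and accepts the same extra flags (such as `--amp`) that were passed to `tools/dist_train.sh` in the original command.

```python
import shlex


def build_train_command(config, num_gpus, extra_args=()):
    """Return the argv list for a training run (hypothetical helper).

    Assumption: tools/train.py accepts the config path as its first
    positional argument plus the same extra flags given to dist_train.sh.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        # Single GPU: call the training script directly, no distributed launcher.
        return ["python", "tools/train.py", config, *extra_args]
    # Multiple GPUs: use the distributed launcher as in the original command.
    return ["./tools/dist_train.sh", config, str(num_gpus), *extra_args]


if __name__ == "__main__":
    cmd = build_train_command(
        "configs/pretrain/yolo_world_v2_x_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_goldg_train_lvis_minival.py",
        1,
        ["--amp"],
    )
    # Print the command as a shell-quoted string instead of running it.
    print(shlex.join(cmd))
```

The helper only builds and prints the command, so it can be checked before anything is launched.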