AHupuJR / EFNet

Event-based Fusion for Motion Deblurring with Cross-modal Attention (ECCV'22 Oral) https://ahupujr.github.io/EFNet/

Distributed training error #17

Closed renliao closed 6 months ago

renliao commented 9 months ago

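Distributed training on 4 GPUs fails. My machine has 8 Titan GPUs. The error output is: `ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -11) local_rank: 0` followed by `torch.distributed.elastic.multiprocessing.errors.ChildFailedError`. I used the training command given in the README.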

xjyisok commented 1 month ago

Hi, I ran into this problem too. Did you manage to solve it? Thanks in advance.

xjyisok commented 1 month ago

Oh, you just need to change the training-set and test-set paths in the train/EFNet.yml file to your own paths.
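
Since the fix was pointing the config at dataset paths that actually exist, a quick check before launching distributed training may save a confusing crash. The following is only a minimal sketch: the function name `find_missing_paths`, the `dataroot` key prefix, and the example config layout are assumptions for illustration and may not match the real keys in train/EFNet.yml.

```python
import os


def find_missing_paths(config, key_prefix="dataroot", _trail=""):
    """Recursively collect (key_path, value) pairs whose path does not exist.

    `key_prefix` is an assumed naming convention for dataset-path keys;
    adjust it to whatever keys your config file actually uses.
    """
    missing = []
    if isinstance(config, dict):
        for key, value in config.items():
            trail = f"{_trail}.{key}" if _trail else str(key)
            if isinstance(value, str) and str(key).startswith(key_prefix):
                # A path-like entry: record it if nothing exists at that location.
                if not os.path.exists(value):
                    missing.append((trail, value))
            else:
                # Descend into nested sections of the config.
                missing.extend(find_missing_paths(value, key_prefix, trail))
    elif isinstance(config, list):
        for i, item in enumerate(config):
            missing.extend(find_missing_paths(item, key_prefix, f"{_trail}[{i}]"))
    return missing


# Hypothetical config, shaped like a parsed YAML file (not the real EFNet.yml).
example_config = {
    "datasets": {
        "train": {"dataroot": "/path/that/does/not/exist/train"},
        "val": {"dataroot": os.getcwd()},
    }
}

for key_path, value in find_missing_paths(example_config):
    print(f"Missing path for {key_path}: {value}")
```

To run it against the real file, the dict could be loaded with PyYAML's `yaml.safe_load(open("train/EFNet.yml"))` (requires the `pyyaml` package) and passed to `find_missing_paths` in place of `example_config`.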