zhang-tao-whu / e2ec

E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation

Deadlock at Epoch 20 #21

Closed chenbys closed 1 year ago

chenbys commented 1 year ago

Thanks for the excellent paper and code. My experiments on the KINS dataset (DCN disabled) always deadlock at epoch 20, with both GPU and CPU still busy. I use the command python -m torch.distributed.launch --nproc_per_node 2 train_net_ddp.py --config_file kitti --bs 4 --gpus 2. Any advice on what to check? Thanks in advance.

BTW, what are the minimum batch size and number of epochs for the KINS dataset? A batch size of 64 and 150 epochs seem too large.

zhang-tao-whu commented 1 year ago

The current version of the code may hit this bug when using DDP, and I don't know what causes it. I have rebuilt E2EC on top of mmdetection, using FCOS as the detector, to achieve more robust performance and faster convergence. The code will be released soon.
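
For anyone who wants to see where the processes are stuck in the meantime, here is a minimal sketch of a generic hang watchdog. It is not specific to E2EC and is not a confirmed fix, since the cause is unknown. It uses Python's standard faulthandler module to dump every thread's stack trace when a block of code runs longer than a timeout. The helper name hang_watchdog and the train_step call in the usage comment are hypothetical.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def hang_watchdog(timeout_s, file=sys.stderr):
    """Dump all Python thread stacks to `file` if the body exceeds timeout_s.

    `file` must be a real file object (it needs a file descriptor).
    With repeat=True the dump is written again every timeout_s seconds
    for as long as the body keeps running.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=file)
    try:
        yield
    finally:
        # Disarm the timer once the guarded block has finished.
        faulthandler.cancel_dump_traceback_later()


# Hypothetical usage inside a training loop (train_step is an assumed name):
#
# for batch in data_loader:
#     with hang_watchdog(600):
#         train_step(batch)
```

Running this in every rank shows which line each process is blocked on. If the ranks turn out to be waiting in different places, that would suggest they took different code paths, but this is only something to check, not a diagnosis.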

C-Ll-l commented 1 year ago

Hello, thank you for your research. I am looking forward to your FCOS implementation, since training convergence is currently a little slow. May I ask when it will be uploaded? I can't wait to try it. Thanks!

hannn0919 commented 1 year ago

Hello, thanks for your work. I wonder when the FCOS version will be released? Thanks! :)

zhang-tao-whu commented 1 year ago

I re-implemented E2EC with mmdet, and there are no lockups when using DDP. I hope this helps you: e2ec_mmdet.