Closed: PeterJaq closed this issue 2 months ago
We did not encounter this problem. Did you modify the config or the code?
Solved! The training container was built, and mmcv installed in it, on an RTX 3080. When we moved this container to an A100, the training loss was wrong. We checked the mmcv focal loss API and found that sigmoid focal loss returned 0 for any input. Reinstalling mmcv on the A100 solved the problem. Thank you.
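For anyone who wants to run the same check, here is a minimal sketch of it. `reference_sigmoid_focal_loss` and `check_mmcv_focal_loss` are hypothetical helper names, not part of mmcv. The sketch assumes an mmcv build with compiled ops that exposes `mmcv.ops.sigmoid_focal_loss` with positional arguments `(input, target, gamma, alpha, weight, reduction)`, as in mmcv-full 1.x; other versions may differ.

```python
import math


def reference_sigmoid_focal_loss(logits, labels, gamma=2.0, alpha=0.25):
    """Summed sigmoid focal loss in plain Python (hypothetical helper).

    logits: N x C nested list of floats; labels: N class indices.
    """
    total = 0.0
    for row, label in zip(logits, labels):
        for c, x in enumerate(row):
            p = 1.0 / (1.0 + math.exp(-x))  # sigmoid
            if c == label:
                # Positive class term: -alpha * (1 - p)^gamma * log(p)
                total -= alpha * (1.0 - p) ** gamma * math.log(p)
            else:
                # Negative class term: -(1 - alpha) * p^gamma * log(1 - p)
                total -= (1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return total


def check_mmcv_focal_loss(tol=1e-4):
    """Compare the compiled mmcv op with the reference on the current GPU.

    Assumes torch, a CUDA device, and an mmcv build with compiled ops.
    """
    import torch
    from mmcv.ops import sigmoid_focal_loss

    logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    labels = [0, 2]
    expected = reference_sigmoid_focal_loss(logits, labels)

    x = torch.tensor(logits, device="cuda")
    y = torch.tensor(labels, device="cuda")  # int64 class indices
    # Positional args (assumed): input, target, gamma, alpha, weight, reduction.
    got = sigmoid_focal_loss(x, y, 2.0, 0.25, None, "sum").item()
    return abs(got - expected) < tol, got, expected
```

Run `check_mmcv_focal_loss()` inside the container on the target GPU. For the all-zero logits used here the reference value is 0.875 * ln(2), about 0.6065, so a result of 0 from the mmcv op would match the symptom described above.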
@PeterJaq, I encountered the same problem. Did you solve it by reinstalling a different version of mmcv_full in the training container on the A100?
Hi, I am using the released stage 1 weights to fine-tune the model on nuScenes, but all of the cls losses are equal to 0:
mmdet - INFO - Iter [51/87900] lr: 8.000e-05, eta: 2 days, 5:49:16, time: 2.206, data_time: 0.223, memory: 27280, det_loss_cls_0: 0.0000, det_loss_box_0: 0.9162, det_loss_cns_0: 0.6314, det_loss_yns_0: 0.0632, det_loss_cls_1: 0.0000, det_loss_box_1: 0.5736, det_loss_cns_1: 0.5961, det_loss_yns_1: 0.0322, det_loss_cls_2: 0.0000, det_loss_box_2: 0.5553, det_loss_cns_2: 0.5914, det_loss_yns_2: 0.0306, det_loss_cls_3: 0.0000, det_loss_box_3: 0.5486, det_loss_cns_3: 0.5904, det_loss_yns_3: 0.0299, det_loss_cls_4: 0.0000, det_loss_box_4: 0.5399, det_loss_cns_4: 0.5885, det_loss_yns_4: 0.0281, det_loss_cls_5: 0.0000, det_loss_box_5: 0.5376, det_loss_cns_5: 0.5885, det_loss_yns_5: 0.0285, map_loss_cls_0: 0.0000, map_loss_line_0: 0.6851, map_loss_cls_1: 0.0000, map_loss_line_1: 0.7943, map_loss_cls_2: 0.0000, map_loss_line_2: 0.7175, map_loss_cls_3: 0.0000, map_loss_line_3: 0.9981, map_loss_cls_4: 0.0000, map_loss_line_4: 0.9802, map_loss_cls_5: 0.0000, map_loss_line_5: 0.9968, loss_dense_depth: 0.4719, loss: 13.1137, grad_norm: 12.8054
Do you have any experience with this?