Closed ZhiyingDu closed 3 months ago
Excuse me, I set batchsize=4 and ran the command "scripts/dist_train.sh 8 runner=8gpus_t +exp=rawbox_mv2.0t_0.4.4" on A100. It throws an exception at training step 25. Do you have any suggestions for solving it?
Maybe I can avoid this error by setting ret_dict['kwargs']['bboxes_3d_data'] to {} when bboxes_3d_data is None, but I'm not sure whether the error is caused indirectly by another part of the code or whether something is wrong with my data. Could you give me some suggestions? Sincere thanks!
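The workaround described above could look roughly like the sketch below. It is only an illustration, not the repository's code: the helper name normalize_bboxes and the plain-dict layout of ret_dict are assumptions. Only the key path ret_dict['kwargs']['bboxes_3d_data'] comes from the post.

```python
def normalize_bboxes(ret_dict):
    """Hypothetical helper: replace a missing bboxes_3d_data entry with {}."""
    # Assumption: ret_dict is a plain dict with a nested "kwargs" dict.
    kwargs = ret_dict.setdefault("kwargs", {})
    # Covers both an explicit None value and a missing key.
    if kwargs.get("bboxes_3d_data") is None:
        kwargs["bboxes_3d_data"] = {}
    return ret_dict
```

Existing box data is left untouched, so the guard only changes samples that have no boxes.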
You are correct. We should allow bboxes_3d_data to be None.
But if I set batch_size to 1, no errors occur. This is what confuses me. Why is bboxes_3d_data never None when batchsize is set to 1, but None when I set batchsize to 2 or 4?
It is normal to observe None in very rare cases; it can happen for some samples. If you get None constantly, something may be wrong.
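To tell "very rare" from "constantly", one option is to count how often the entry is None over a pass through the data. The sketch below is a rough diagnostic under assumptions: none_rate is a hypothetical helper, and it assumes each item is a dict shaped like the ret_dict mentioned earlier in this thread.

```python
def none_rate(items):
    """Hypothetical diagnostic: fraction of items whose bboxes_3d_data is None."""
    total = 0
    missing = 0
    for ret_dict in items:
        total += 1
        # Assumption: the entry lives at ret_dict["kwargs"]["bboxes_3d_data"].
        if ret_dict.get("kwargs", {}).get("bboxes_3d_data") is None:
            missing += 1
    # Avoid dividing by zero when nothing was iterated.
    return missing / total if total else 0.0
```

A rate close to zero would match the "rare cases" described above, while a high rate would point to a problem in the data or the loading code.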
I hit the same problem when I set the batchsize to 4. When I follow the method above to fix it, I get a new error, KeyError: cam_param. Have you run into this, @ZhiyingDu?
Yes, I encountered the same problem. When bboxes_3d_data is empty, you need to complete the code to initialize cam_param.
Hi @ZhiyingDu @aipursuing
Could you help me check whether the PR above fixes the issue when bbox is None?
I think it will be fine. I did unconditional initialization, and it also works fine after the change. My ugly code (/doge):
My solution is similar to yours, but I set bbox_3d_data = {} instead of {"cam_params":ret_dict['camera_param']}. I ran your code for almost 1 hour without any error, so I think there is no problem.
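The two fill-in values compared above can be put side by side in a small sketch. This is an assumption-laden illustration rather than the code from the PR: fill_missing_bboxes and its with_cam_params flag are hypothetical, and it assumes camera_param sits at the top level of ret_dict, as written in the post.

```python
def fill_missing_bboxes(ret_dict, with_cam_params=True):
    """Hypothetical helper showing both fill-in values for a None entry."""
    kwargs = ret_dict["kwargs"]
    if kwargs.get("bboxes_3d_data") is None:
        if with_cam_params:
            # Variant that also provides the cam_params key.
            kwargs["bboxes_3d_data"] = {"cam_params": ret_dict["camera_param"]}
        else:
            # Variant that only supplies an empty dict.
            kwargs["bboxes_3d_data"] = {}
    return ret_dict
```

With the empty-dict variant, any later code that indexes cam_params directly would still raise a KeyError, which is consistent with the error reported earlier in this thread.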
@aipursuing Doing so may also solve your issue with KeyError: cam_params. Thank you for your feedback!
@ZhiyingDu I do not think you should allocate parameters here or use the unconditional cam; the null embedding for bboxes should be handled in the bbox embedder according to the mask. Anyway, cam_params is useless in this codebase, so let's not worry about its value.
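As a rough picture of what "handled in the bbox embedder according to the mask" can mean, the sketch below swaps in a shared null embedding wherever the mask marks a box as absent. It is not this codebase's embedder: apply_null_embedding is a hypothetical name, and plain Python lists stand in for the tensors a real model would use.

```python
def apply_null_embedding(box_embeddings, mask, null_embedding):
    """Hypothetical sketch: keep embeddings of valid boxes, use a null one elsewhere."""
    if len(box_embeddings) != len(mask):
        raise ValueError("mask must have one entry per box")
    out = []
    for emb, valid in zip(box_embeddings, mask):
        # A falsy mask entry marks a missing box, which gets the null embedding.
        out.append(list(emb) if valid else list(null_embedding))
    return out
```

With this approach, a sample without boxes only needs an all-false mask, and the data-loading code does not have to invent placeholder parameters.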