[Closed] HaoZhang534 closed this issue 2 years ago
The function `obtain_history_bev` in [BEVFormer](https://github.com/zhiqi-li/BEVFormer/blob/5d42632256c64742f74d8b1c68a3407dd2f81305/projects/mmdet3d_plugin/bevformer/detectors/bevformer.py#L158) seems to conflict with FP16. Setting `find_unused_parameters=True` still doesn't help.

In our original codebase, we use a copy of the BEVFormer model to generate the previous BEV features, but that makes the pipeline ugly and hard to understand. So the open-source code of BEVFormer actually doesn't support FP16. Sorry for that.
Got it. Thank you.
Dear Zhiqi, I added some code to `obtain_history_bev`. The unused parameters are gone, but training becomes very slow. Is this the right way?
Actually, when I used FP16 in our original framework, training was even faster than in this open-source version, so I'm not sure whether your implementation will work correctly that way. Our implementation is more like MoCo: one BEVFormer model A is trained, while a second BEVFormer model B evaluates the previous frames. B copies its weights from A after every step and does not compute gradients. This way, we can train model A with FP16.
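The two-model scheme described above can be sketched in PyTorch. This is a minimal illustration, not BEVFormer's actual code: `TinyEncoder` is a hypothetical stand-in for the BEV encoder, and `sync_b_from_a` plays the role of the per-step weight copy.

```python
import copy
import torch
import torch.nn as nn

# Hypothetical stand-in for a BEVFormer encoder; the real model is far larger.
class TinyEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x)

# Model A is trained; model B only evaluates previous frames (MoCo-style).
model_a = TinyEncoder()
model_b = copy.deepcopy(model_a)
model_b.eval()
for p in model_b.parameters():
    p.requires_grad_(False)  # B never receives gradients

def sync_b_from_a():
    # After every optimizer step, B copies A's weights.
    model_b.load_state_dict(model_a.state_dict())

# One training step:
opt = torch.optim.SGD(model_a.parameters(), lr=0.1)
prev_frame, cur_frame = torch.randn(2, 4), torch.randn(2, 4)

with torch.no_grad():
    # History BEV from B: no graph is built, so DDP sees no unused parameters.
    prev_bev = model_b(prev_frame)

loss = (model_a(cur_frame) - prev_bev).pow(2).mean()
loss.backward()
opt.step()
sync_b_from_a()
```

Because the history features come from B under `torch.no_grad()`, every parameter of A participates in the backward pass, which is what makes the setup compatible with FP16 training.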
Hi, we support FP16 in this way: `./tools/fp16/dist_train.sh ./projects/configs/bevformer_fp16/bevformer_tiny_fp16.py 8`
Dear authors, sorry for bothering you again. When I set `fp16 = dict(loss_scale=512.)`, I found I also have to set `find_unused_parameters=True`, or I get an error. Is this the right way to use FP16 in your code? Setting `find_unused_parameters=True` may significantly slow down training.
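For reference, the two settings discussed above would sit together in an MMDetection-style config file roughly like this (a hedged sketch of the config fragment, not the repository's actual file):

```python
# FP16 training via mmcv's Fp16OptimizerHook; loss_scale guards against
# gradient underflow in half precision.
fp16 = dict(loss_scale=512.)

# Needed here because the history-BEV branch can leave some parameters
# without gradients; DDP then requires this flag, at the cost of an extra
# graph traversal per iteration (hence the slowdown).
find_unused_parameters = True
```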