Closed · fdy61 closed this issue 6 months ago
I used the command given by the author on GitHub,
Have you solved it? What was the solution?
I used bevfusion-det.pth and had no problem.
No problem. I have solved it.
Have you managed to reproduce it? I am reproducing the code, but my training accuracy falls far short of the paper. How should I handle this? Can you give me some suggestions? The training is for BEVFusion det (L+C), which can output both modalities through visualization, but the training accuracy is too low (NDS=0.4665). I earnestly request your reply and help! The system is Ubuntu 20.04 and the GPU is a single A100; although there are four A100s, parallel training is not working, so for now I can only use a single GPU.
It seems that you should train the lidar model first to get lidar-only-det.pth, and then use the lidar-only-det.pth and swint-nuimages-pretrained.pth to train the final model
Thank you for the thoughts you have provided! I will try this method. May I ask whether your reproduction matches the results of the paper?
Yes, but I only did detection, not segmentation. For the BEVFusion detection model, you should train it with your trained lidar model (or you can just use the pretrained/lidar-only-det.pth the author has provided) and the image backbone, pretrained/swint-nuimages-pretrained.pth. You can see the training command in README.md: torchpack dist-run -np 8 python tools/train.py configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth --load_from pretrained/lidar-only-det.pth. Whether you train on a single GPU or on multiple GPUs doesn't affect the final results.
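Roughly, the full order looks like the sketch below; the lidar-only config path is taken from my local checkout, so double-check it against your version of the repo:

```bash
# Stage 1: train the lidar-only detector (or skip this and use the released
# pretrained/lidar-only-det.pth). Config path assumed from my checkout.
torchpack dist-run -np 8 python tools/train.py \
    configs/nuscenes/det/transfusion/secfpn/lidar/voxelnet_0p075.yaml

# Stage 2: train the camera+lidar fusion model, initializing the camera
# backbone from the nuImages-pretrained Swin-T and loading the lidar branch
# from stage 1 (this is the README command quoted above).
torchpack dist-run -np 8 python tools/train.py \
    configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml \
    --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth \
    --load_from pretrained/lidar-only-det.pth
```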
I ran this command exactly as given, with no changes to the code. Reading the terminal output carefully, I noticed it reports a lot of missing keys when loading the checkpoint. Do you get this during training, and what can I do to fix it?
This is just a warning, not an error. The full BEVFusion detection model has the camera branch, but lidar-only-det.pth doesn't, so those weights are loaded from swint-nuimages-pretrained.pth instead. Just run the command and it's OK; you will find that the model trains successfully.
Thank you for your patience. I was able to train it successfully, but the main problem is still the same as at the beginning: the accuracy is low and very different from the results of the paper. On the full nuScenes dataset I first ran three epochs on a single A100, and the NDS stayed at 0.46, while the original paper reports NDS=0.7288, which is a huge gap. I have not been able to solve that.
3 epochs? That is far from enough. You can learn more details from the training strategy.
@wyy032 The author seemed to train for about 6-7 epochs. I don't remember exactly.
I haven't run the full 6 epochs yet, as this will probably take about three days, but over the first three epochs the accuracy stayed at 0.46, and I'm not sure if that's normal? I'm worried it won't go up later.
Training for only three epochs suggests that the model has not yet fit the data well. I'm not sure whether you can resume from epoch_3.pth and continue training up to epoch 6, or whether you will have to restart from 1 to 6. I suggest that you follow the author's training strategy.
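If resuming does work, I would expect it to be passed the same way as load_from, but this is only a guess; verify against tools/train.py and your config before relying on it:

```bash
# Hypothetical resume sketch: assumes resume_from is accepted as a config
# override in the same way load_from is (not verified for the torchpack launcher).
# Replace the checkpoint path with wherever your run wrote epoch_3.pth.
torchpack dist-run -np 8 python tools/train.py \
    configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml \
    --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth \
    --resume_from <your-run-directory>/epoch_3.pth
```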
Ok, thank you very much for your advice, I'll try again.
Hi, I've run the six complete epochs over the last few days, but the accuracy still doesn't go up. I suspect it's a CUDA version issue or the lack of multi-GPU training. Can I ask which CUDA version you are using?
11.3 or 11.1 @wyy032
OK, thank you, I'll try again.
Never mind. I solved it.
How did you solve that error? I have the same one.
How did you solve it? Waiting for your reply. Thank you. @fdy61
@fdy61 Hi, when you test the fusion model using the official command, can you reach the paper's accuracy? torchpack dist-run -np 1 python tools/test.py configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml pretrained/bevfusion-det.pth --eval bbox. When I test the lidar-only and camera-only models, I get mAP=0.6468, NDS=0.6924 and mAP=0.3554, NDS=0.4121, which is very close to the paper results, but when I test the fusion model the results are only mAP=0.6728, NDS=0.7069. What could be the reason?
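For reference, the single-modality numbers above came from commands of the same form; the config and checkpoint names below are from my own setup, so they may differ in yours:

```bash
# Test commands used for the single-modality baselines (config and checkpoint
# names assumed from my local checkout; adjust them to match your setup).
torchpack dist-run -np 1 python tools/test.py \
    configs/nuscenes/det/transfusion/secfpn/lidar/voxelnet_0p075.yaml \
    pretrained/lidar-only-det.pth --eval bbox

torchpack dist-run -np 1 python tools/test.py \
    configs/nuscenes/det/centerhead/lssfpn/camera/256x704/swint/default.yaml \
    pretrained/camera-only-det.pth --eval bbox
```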
When I run the command "torchpack dist-run -np 8 python tools/train.py configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth --load_from pretrained/lidar-only-det.pth", while loading lidar-only-det.pth it says "The model and loaded state dict do not match exactly", and then there is a RuntimeError: Given groups=1, weight of size [8, 1, 1, 1], expected input[24, 6, 256, 704] to have 1 channels, but got 6 channels instead. How can I solve it?