KeyError while evaluating and AttributeError while training.

GDMG99 commented 1 year ago

Dear authors, thank you for your great work. I was trying to use the network for both training and evaluation. I set up the environment using the Dockerfile that the authors provide and followed the required steps to clone and set up the repository. I have also downloaded the required checkpoints. The only thing I did differently was to use the info files provided by mmdetection3d instead of re-generating them with this codebase.

When I run the evaluation command on the detection BEVFusion network

torchpack dist-run -np 1 python tools/test.py configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml pretrained/bevfusion-det.pth --eval bbox

I get the following KeyError:

  File "tools/test.py", line 230, in <module>
    main()
  File "tools/test.py", line 203, in main
    outputs = multi_gpu_test(model, data_loader, args.tmpdir, args.gpu_collect)
  File "/opt/conda/lib/python3.8/site-packages/mmdet/apis/test.py", line 96, in multi_gpu_test
    for i, data in enumerate(data_loader):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/opt/conda/lib/python3.8/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/bevfusion/mmdet3d/datasets/custom_3d.py", line 291, in __getitem__
    return self.prepare_test_data(idx)
  File "/home/bevfusion/mmdet3d/datasets/custom_3d.py", line 180, in prepare_test_data
    input_dict = self.get_data_info(index)
  File "/home/bevfusion/mmdet3d/datasets/nuscenes_dataset.py", line 218, in get_data_info
    location=info["location"],
KeyError: 'location'

On the other hand, when I run the training command

torchpack dist-run -np 1 python tools/train.py configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth --load_from pretrained/lidar-only-det.pth

I get the following AttributeError:

  File "tools/train.py", line 87, in <module>
    main()
  File "tools/train.py", line 76, in main
    train_model(
  File "/home/bevfusion/mmdet3d/apis/train.py", line 126, in train_model
    runner.run(data_loaders, [("train", 1)])
  File "/opt/conda/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 108, in run
    self.call_hook('before_run')
  File "/opt/conda/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/opt/conda/lib/python3.8/site-packages/mmcv/runner/dist_utils.py", line 94, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/mmcv/runner/hooks/logger/tensorboard.py", line 35, in before_run
    from torch.utils.tensorboard import SummaryWriter
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py", line 4, in <module>
    LooseVersion = distutils.version.LooseVersion
AttributeError: module 'distutils' has no attribute 'version'

Does anybody know how to solve this? Is this issue due to the usage of the wrong info files?

Thanks in advance

GDMG99 commented 1 year ago

I managed to solve the AttributeError by downgrading the setuptools package to an earlier version (pip install setuptools==59.5.0). After that, training failed with the same KeyError: 'location' as the evaluation.

GDMG99 commented 1 year ago

The KeyError does not appear when using the info files created with this codebase.
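One way to confirm that the info files are the cause is to check whether each entry carries the 'location' key that nuscenes_dataset.py reads in the traceback above. The following is a minimal sketch, not part of either repository. It assumes the info file is a pickle holding either a bare list of per-sample dicts or a dict that stores that list under "infos" or "data_list"; the helper names and the example file path are hypothetical.

```python
import pickle

# Key named in the KeyError above; extend this tuple to check other fields.
REQUIRED_KEYS = ("location",)


def load_infos(path):
    """Load the per-sample info dicts from a pickle file.

    Assumption: the file holds either a bare list of dicts, or a dict
    that stores that list under "infos" or "data_list".
    """
    with open(path, "rb") as f:
        data = pickle.load(f)
    if isinstance(data, dict):
        for key in ("infos", "data_list"):
            if key in data:
                return list(data[key])
        return []
    return list(data)


def missing_keys(infos, required=REQUIRED_KEYS):
    """Return {sample index: [missing keys]} for entries lacking a required key."""
    report = {}
    for index, info in enumerate(infos):
        absent = [key for key in required if key not in info]
        if absent:
            report[index] = absent
    return report


if __name__ == "__main__":
    # Hypothetical path; point this at the info file named in your config.
    infos = load_infos("data/nuscenes/nuscenes_infos_val.pkl")
    report = missing_keys(infos)
    print(f"{len(report)} of {len(infos)} entries lack a required key")
```

If every entry is reported as missing 'location', the info files were most likely produced by a different converter, and re-generating them with this codebase is the fix reported in this thread.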

hasaikeyQAQ commented 1 year ago

Hello, I have encountered the same problem:

Traceback (most recent call last):
  File "tools/train.py", line 87, in <module>
    main()
  File "tools/train.py", line 76, in main
    train_model(
  File "/public/home/nngallery/test/bevfusion/mmdet3d/apis/train.py", line 126, in train_model
    runner.run(data_loaders, [("train", 1)])
  File "/public/home/nngallery/miniconda3/envs/openmmlab_cuda11.1/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 108, in run
    self.call_hook('before_run')
  File "/public/home/nngallery/miniconda3/envs/openmmlab_cuda11.1/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/public/home/nngallery/miniconda3/envs/openmmlab_cuda11.1/lib/python3.8/site-packages/mmcv/runner/dist_utils.py", line 94, in wrapper
    return func(*args, **kwargs)
  File "/public/home/nngallery/miniconda3/envs/openmmlab_cuda11.1/lib/python3.8/site-packages/mmcv/runner/hooks/logger/tensorboard.py", line 35, in before_run
    from torch.utils.tensorboard import SummaryWriter
  File "/public/home/nngallery/miniconda3/envs/openmmlab_cuda11.1/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py", line 4, in <module>
    LooseVersion = distutils.version.LooseVersion
AttributeError: module 'distutils' has no attribute 'version'

However, I did not understand your comment. Could you explain further how I can solve this problem?

GDMG99 commented 1 year ago

Hello, I used the Dockerfile provided by the authors to work with the repository. When I trained a network, an AttributeError came up. After some research, it turned out that the setuptools library had been updated and some of its behavior had changed. If you downgrade setuptools to 59.5.0, it works fine (at least for me). Just run:

pip install setuptools==59.5.0

Best
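To see whether an environment is likely to be affected before starting a training run, the installed setuptools version can be compared with the 59.5.0 release reported to work in this thread. This is a minimal sketch: the helper names are hypothetical, and treating every release newer than 59.5.0 as suspect is an assumption drawn only from this thread, not a documented cutoff.

```python
import re
from importlib import metadata

# Version reported to work in this thread.
KNOWN_GOOD = (59, 5, 0)


def parse_version(text):
    """Turn a string such as '59.5.0' into a (major, minor, patch) tuple.

    Only the leading digits of each of the first three components are
    used; missing or non-numeric components count as 0.
    """
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_newer_than_known_good(version_text):
    """True if the given version is strictly newer than KNOWN_GOOD."""
    return parse_version(version_text) > KNOWN_GOOD


def installed_setuptools_version():
    """Return the installed setuptools version string, or None if absent."""
    try:
        return metadata.version("setuptools")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_setuptools_version()
    if version is None:
        print("setuptools is not installed")
    elif is_newer_than_known_good(version):
        print(f"setuptools {version} is newer than 59.5.0; "
              "consider: pip install setuptools==59.5.0")
    else:
        print(f"setuptools {version} is not newer than 59.5.0")
```

The check uses importlib.metadata instead of importing setuptools or distutils directly, so it does not trigger the failing import itself.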

hasaikeyQAQ commented 1 year ago

Hello, I have solved this problem using the method you provided. Thank you very much. Best wishes

Qizhi697 commented 10 months ago

GDMG99 wrote: "The problem does not appear using the info files created with this codebase."

Hi, how can I solve the 'KeyError while evaluating' issue, please? I am encountering it now as well and look forward to your reply. Thank you! Best wishes

EpicGilgamesh commented 10 months ago

@Qizhi697 Hi, have you managed to solve your issue?

mit-han-lab / bevfusion

KeyError while evaluating and AttributeError while training. #348