open-mmlab / mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.
https://mmdetection3d.readthedocs.io/en/latest/
Apache License 2.0
5.16k stars 1.52k forks

[Bug] Evaluate multi-view dfm on Waymo dataset #1999

Open YongtaoGe opened 1 year ago

YongtaoGe commented 1 year ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

1.1x branch https://github.com/open-mmlab/mmdetection3d/tree/1.1

Environment

none

Reproduces the problem - code sample

none

Reproduces the problem - command or script

CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./tools/dist_test.sh \
configs/dfm/multiview-dfm_r101-dcn_16xb2_waymoD5-3d-3class.py \
work_dirs/multiview-dfm_r101-dcn_16xb2_waymoD5-3d-3class/epoch_8.pth 4

Reproduces the problem - error message

Traceback (most recent call last):
  File "./tools/test.py", line 121, in <module>
    main()
  File "./tools/test.py", line 117, in main
    runner.test()
  File "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-automl/geyongtao/code/mmengine/mmengine/runner/runner.py", line 1707, in test
    metrics = self.test_loop.run()  # type: ignore
  File "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-automl/geyongtao/code/mmengine/mmengine/runner/loops.py", line 420, in run
    metrics = self.evaluator.evaluate(len(self.dataloader.dataset))
  File "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-automl/geyongtao/code/mmengine/mmengine/evaluator/evaluator.py", line 79, in evaluate
    _results = metric.evaluate(size)
  File "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-automl/geyongtao/code/mmengine/mmengine/evaluator/metric.py", line 110, in evaluate
    _metrics = self.compute_metrics(results)  # type: ignore
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/code/dfm/mmdet3d/evaluation/metrics/waymo_metric.py", line 151, in compute_metrics
    classes=self.classes)
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/code/dfm/mmdet3d/evaluation/metrics/waymo_metric.py", line 318, in format_results
    classes)
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/code/dfm/mmdet3d/evaluation/metrics/kitti_metric.py", line 285, in format_results
    submissionprefix)
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/code/dfm/mmdet3d/evaluation/metrics/waymo_metric.py", line 459, in bbox2result_kitti
    box_dict = self.convert_valid_bboxes(pred_dicts, info)
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/code/dfm/mmdet3d/evaluation/metrics/waymo_metric.py", line 590, in convert_valid_bboxes
    sample_idx = info['sample_id']
KeyError: 'sample_id'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2954) of binary: /mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/anaconda3/envs/dfm/bin/python
Traceback (most recent call last):
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/anaconda3/envs/dfm/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/anaconda3/envs/dfm/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/anaconda3/envs/dfm/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/anaconda3/envs/dfm/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/anaconda3/envs/dfm/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/anaconda3/envs/dfm/lib/python3.7/site-packages/torch/distributed/run.py", line 692, in run
    )(*cmd_args)
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/anaconda3/envs/dfm/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/beegfs/ssd_pool/docker/user/hadoop-automl/geyongtao/anaconda3/envs/dfm/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

Additional information

No response

ZCMax commented 1 year ago

@lianqing01 @Tai-Wang Please have a look at this issue

XYAskWhy commented 1 year ago

@YongtaoGe Hi, I have the same problem as you. Have you solved it? If so, what did you do? It seems like a dataset version problem. The Waymo dataset version I am using is 1.2.

YongtaoGe commented 1 year ago

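@XYAskWhy It is caused by a typo. In convert_valid_bboxes (mmdet3d/evaluation/metrics/waymo_metric.py, line 590 in the traceback above), the info dict is indexed with the key 'sample_id', which does not exist. Changing 'sample_id' to 'sample_idx' fixes the problem.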
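
For anyone who wants to guard against the same KeyError while their local copy is unpatched, here is a minimal sketch of a tolerant lookup. It is not the upstream patch: the helper name get_sample_idx and the example dicts are hypothetical, and the only things taken from this thread are the two key names, 'sample_id' and 'sample_idx'.

```python
from typing import Any, Mapping


def get_sample_idx(info: Mapping[str, Any]) -> Any:
    """Return the sample index from an annotation info dict.

    Hypothetical helper, not part of mmdet3d. It prefers the
    'sample_idx' key reported as correct in this thread and falls
    back to 'sample_id' in case an info file still uses that name.
    """
    for key in ('sample_idx', 'sample_id'):
        if key in info:
            return info[key]
    # Report the keys that are present to make debugging easier.
    raise KeyError(
        f"neither 'sample_idx' nor 'sample_id' found; "
        f"available keys: {sorted(info)}")


# Illustrative info dicts (made-up values).
print(get_sample_idx({'sample_idx': 1000000, 'images': {}}))
print(get_sample_idx({'sample_id': 42}))
```

In the actual metric code, the one-line change described above (indexing info with 'sample_idx') is all that should be needed; the fallback here is only a convenience for mixed info files.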

YongtaoGe commented 1 year ago

By the way, how do I generate cam_gt.bin for the multi-view camera-only Waymo dataset in the dev-1.x branch?