Tsinghua-MARS-Lab / futr3d

Code for paper: FUTR3D: a unified sensor fusion framework for 3d detection
Apache License 2.0

There are many problems when evaluating #7

Closed: backkon closed this issue 1 year ago

backkon commented 2 years ago

There are many problems when evaluating every 2 epochs, such as:

  File "/home/futr3d/tools/train.py", line 260, in main
    meta=meta)
  File "/home/mmdetection3d/mmdet3d/apis/train.py", line 351, in train_model
    meta=meta)
  File "/home/mmdetection3d/mmdet3d/apis/train.py", line 319, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 58, in train
    self.call_hook('after_train_epoch')
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 317, in call_hook
    getattr(hook, fn_name)(self)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/hooks/evaluation.py", line 271, in after_train_epoch
    self._do_evaluate(runner)
  File "/home/mmdetection/mmdet/core/evaluation/eval_hooks.py", line 63, in _do_evaluate
    key_score = self.evaluate(runner, results)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/hooks/evaluation.py", line 368, in evaluate
    results, logger=runner.logger, **self.eval_kwargs)
  File "/home/futr3d/plugin/futr3d/datasets/nuscenes_radar.py", line 465, in evaluate
    result_files, tmp_dir = self.format_results(results, jsonfile_prefix)
  File "/home/futr3d/plugin/futr3d/datasets/nuscenes_radar.py", line 435, in format_results
    {name: self._format_bbox(results_, tmp_file_)})
  File "/home/futr3d/plugin/futr3d/datasets/nuscenes_radar.py", line 306, in _format_bbox
    for i, box in enumerate(boxes):
TypeError: 'NoneType' object is not iterable

Could you please review the code again? Thank you!

TaekHyungCho commented 2 years ago

Check lidar_nusc_box_to_global in your file "/home/futr3d/plugin/futr3d/datasets/nuscenes_radar.py", line 611.

The original code returns nothing there, so you can fix this problem by changing that line to "return box_list".
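
To illustrate why a missing return produces exactly this error, here is a minimal sketch. The function names and the stand-in conversion are hypothetical and are not the repository's code; only the pattern (a helper that builds box_list but never returns it) is taken from the description above.

```python
def to_global_missing_return(boxes):
    """Hypothetical helper that builds a list but forgets to return it."""
    box_list = []
    for box in boxes:
        box_list.append(box)  # stand-in for the real coordinate transform
    # No return statement, so the caller receives None.


def to_global_fixed(boxes):
    """Same hypothetical helper with the one-line fix applied."""
    box_list = []
    for box in boxes:
        box_list.append(box)
    return box_list


def count_boxes(convert, boxes):
    """Iterate over the converted boxes the way a formatting loop would."""
    converted = convert(boxes)
    return sum(1 for _ in enumerate(converted))


try:
    count_boxes(to_global_missing_return, ["a", "b"])
except TypeError as err:
    error_message = str(err)  # 'NoneType' object is not iterable

print(error_message)
print(count_boxes(to_global_fixed, ["a", "b"]))
```

With the return in place, the loop receives a list and the evaluation can continue past the formatting step.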

backkon commented 2 years ago

Thank you for your reply. After I modified it, I verified the trained model (24 epochs), and the results were as follows:

Formating bboxes of pts_bbox
Start to convert detection format...
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 6019/6019, 15.5 task/s, elapsed: 388s, ETA: 0s
Results writes to /tmp/tmp7m884kmw/results/pts_bbox/results_nusc.json
Evaluating bboxes of pts_bbox
mAP: 0.0242
mATE: 1.0889
mASE: 0.5221
mAOE: 0.8317
mAVE: 0.6649
mAAE: 0.2637
NDS: 0.1839
Eval time: 327.0s

Per-class results:
Object Class          AP     ATE    ASE    AOE    AVE    AAE
car                   0.178  1.012  0.186  0.411  0.609  0.163
truck                 0.008  1.188  0.366  0.600  0.608  0.185
bus                   0.005  1.154  0.299  0.200  0.991  0.164
trailer               0.004  1.299  0.264  0.977  0.167  0.019
construction_vehicle  0.000  1.000  1.000  1.000  1.000  1.000
pedestrian            0.047  1.071  0.315  1.065  0.705  0.339
motorcycle            0.001  1.134  0.444  1.054  0.773  0.093
bicycle               0.000  1.032  0.347  1.179  0.466  0.148
traffic_cone          0.000  1.000  1.000  nan    nan    nan
barrier               0.000  1.000  1.000  1.000  nan    nan

I checked the logs of the whole training run and found that the loss did not drop much. The final log line is as follows:

2022-08-14 20:36:56,175 - mmdet - INFO - Epoch [24][4650/4689] lr: 2.000e-06, eta: 0:00:57, time: 1.480, data_time: 0.055, memory: 24682, loss_cls: 0.5038, loss_bbox: 0.6866, d0.loss_cls: 0.6687, d0.loss_bbox: 0.9603, d1.loss_cls: 0.5810, d1.loss_bbox: 0.8191, d2.loss_cls: 0.5539, d2.loss_bbox: 0.7556, d3.loss_cls: 0.5324, d3.loss_bbox: 0.7230, d4.loss_cls: 0.5128, d4.loss_bbox: 0.6994, loss: 7.9966, grad_norm: 44.0604

Is this normal? I followed the guidelines provided by the authors when training.

TaekHyungCho commented 2 years ago

mAP: 0.2964
mATE: 0.8759
mASE: 0.7053
mAOE: 1.5572
mAVE: 0.8988
mAAE: 0.2154
NDS: 0.2787
Eval time: 346.5s

Per-class results:
Object Class          AP     ATE    ASE    AOE    AVE    AAE
car                   0.420  0.835  0.748  1.599  1.052  0.186
truck                 0.248  0.932  0.790  1.606  1.037  0.243
bus                   0.320  0.896  0.849  1.556  2.286  0.374
trailer               0.160  1.061  0.850  1.601  0.595  0.168
construction_vehicle  0.072  0.998  0.689  1.547  0.133  0.372
pedestrian            0.353  0.865  0.321  1.572  0.496  0.205
motorcycle            0.260  0.908  0.788  1.545  1.125  0.156
bicycle               0.253  0.812  0.802  1.754  0.466  0.020
traffic_cone          0.447  0.695  0.327  nan    nan    nan
barrier               0.431  0.758  0.888  1.234  nan    nan

I haven't trained my own model yet, but the author's pretrained model performed well.

Did you train your model with cam_only.py? Can you share your environment settings (e.g. the number of GPUs) and how long training took?

xyaochen commented 2 years ago

Hi, you can try loading the pretrained cam_only model and fine-tuning the cam+radar model from it. Add load_from="(your path)/cam_only.pth" to the config.
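
In an mmcv-style Python config file, that suggestion would look roughly like the fragment below. The path is a placeholder, kept exactly as written in the comment above, and has to be replaced with the location of your own cam_only checkpoint.

```python
# Fragment for the cam+radar config file (placeholder path, adjust to your setup).
# load_from initializes the model weights from this checkpoint before training starts.
load_from = "(your path)/cam_only.pth"
```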

xyaochen commented 2 years ago

The model and code were trained in an old environment, and I didn't check them carefully before releasing. Sorry for the inconvenience! I will check and release the complete new code this week.

backkon commented 2 years ago

No, I did not train with cam_only.py first; I trained the cam+radar model directly. My environment is as follows:

sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0,1,2,3,4,5: Tesla V100-SXM2-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.3, V11.3.109
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.10.0
PyTorch compiling details: PyTorch built with:

TorchVision: 0.11.0
OpenCV: 4.6.0
MMCV: 1.6.0
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.3
MMDetection: 2.25.0
MMSegmentation: 0.26.0
MMDetection3D: 1.0.0rc3+unknown
spconv2.0: False

backkon commented 2 years ago

Ok, I will try.

TaekHyungCho commented 2 years ago

Can you share the results after training?

backkon commented 2 years ago

Give me your email, I will send the log file to your mailbox.

TaekHyungCho commented 2 years ago

My email is thcchhoo@yonsei.ac.kr
Many thanks!

Freder-chen commented 2 years ago

Your environment reports MMDetection3D 1.0.0rc3+unknown.

Are you sure nothing is wrong with your environment? It seems that an inconsistent mmdet3d version is in use. After installing mmdet3d==0.17.3, I found unexpected keys when loading the pre-trained weights, e.g. pts_bbox_head.attr_branches.xx.xx.weight. When I installed version 0.13, as stated in the repository, an interface was missing.

xyaochen commented 2 years ago

What do you mean by a missing interface?

Freder-chen commented 2 years ago

A missing function: mmdet3d==0.13 has no train.py file in https://github.com/open-mmlab/mmdetection3d/blob/v0.13.0/mmdet3d/apis/

xyaochen commented 2 years ago

You can use the train.py in tools (https://github.com/open-mmlab/mmdetection3d/blob/v0.13.0/tools/train.py) and add the code that supports the plugin module, as in https://github.com/Tsinghua-MARS-Lab/futr3d/blob/main/tools/train.py#L107

Freder-chen commented 2 years ago

Many thanks. Also, may I ask which version of mmdet this requires? The DETR head used here is not implemented in the version the guide specifies. I followed this link and installed mmdet==2.11.0.

xyaochen commented 2 years ago

You can follow this guide: https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/faq.md I think it will be fine as long as the versions of mmdet and mmdet3d are compatible.
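
Before comparing against the compatibility table in that FAQ, it helps to see which versions are actually installed. The sketch below is one way to list them. It assumes Python 3.8 or newer (importlib.metadata is not in the standard library on 3.7, where the importlib_metadata backport would be needed), and the distribution names are assumptions, since mmcv is often installed as "mmcv-full".

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them to match your installation.
for name in ("mmcv", "mmcv-full", "mmdet", "mmdet3d", "mmsegmentation"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```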

TaekHyungCho commented 2 years ago

Is it necessary to use mmdetection3d==0.13.0? I installed version 1.0.0rc3. Will this version affect the results, or is it a minor issue?

shb9793 commented 2 years ago

When I ran the train.py from https://github.com/Tsinghua-MARS-Lab/futr3d/blob/main/tools/train.py, the following error occurred:

plugin.futr3d
Traceback (most recent call last):
  File "tools/train.py", line 263, in <module>
    main()
  File "tools/train.py", line 119, in main
    plg_lib = importlib.import_module(_module_path)
  File "/HOME/scz3687/.conda/envs/open-mmlab/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'plugin'

I don't know how to solve this problem.

xyaochen commented 2 years ago

I'm not sure whether using mmdetection3d 1.0.0rc3 instead of 0.13.0 affects the results. It may cause some issues.

xyaochen commented 2 years ago

Regarding "No module named 'plugin'": it seems the command was not run correctly. Can you give me more information?

TaekHyungCho commented 2 years ago

Did you use the tools/dist_train.sh script to run train.py? That script sets PYTHONPATH, for example: export PYTHONPATH="tools/..":$PYTHONPATH If you don't want to use the script, set PYTHONPATH the same way yourself. That may solve the "No module named 'plugin'" problem.
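
The effect of that PYTHONPATH line can be reproduced in isolation. The sketch below builds a throwaway directory containing a package and shows that the import only succeeds once the directory's parent is on the module search path. The package name demo_plugin is a hypothetical stand-in for the repository's plugin directory, and sys.path.insert is used here as the in-process equivalent of exporting PYTHONPATH.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway project that contains demo_plugin/futr3d/__init__.py.
project_root = tempfile.mkdtemp()
package_dir = os.path.join(project_root, "demo_plugin", "futr3d")
os.makedirs(package_dir)
for directory in (os.path.dirname(package_dir), package_dir):
    with open(os.path.join(directory, "__init__.py"), "w") as handle:
        handle.write("")


def can_import(module_name):
    """Return True if module_name can be imported with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False


before = can_import("demo_plugin.futr3d")  # project root is not on sys.path yet
sys.path.insert(0, project_root)           # same effect as adding it to PYTHONPATH
after = can_import("demo_plugin.futr3d")
print(before, after)
```

The same reasoning applies when calling tools/train.py or tools/test.py directly: the directory that contains plugin/ has to be on the search path, either through PYTHONPATH or by running from a location where Python can find it.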

shb9793 commented 2 years ago

I ran into the same problem with the DETR head in mmdet, as shown below. Did you manage to solve it?

plugin.futr3d
Traceback (most recent call last):
  File "tools/train.py", line 236, in <module>
    main()
  File "tools/train.py", line 119, in main
    plg_lib = importlib.import_module(_module_path)
  File "/HOME/scz3687/.conda/envs/open-mmlab0130/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/data/run01/scz3687/openmmlab0130/mmdetection3d/plugin/futr3d/__init__.py", line 5, in <module>
    from .models.dense_head.detr_mdfs_head import DeformableFUTR3DHead
  File "/data/run01/scz3687/openmmlab0130/mmdetection3d/plugin/futr3d/models/dense_head/detr_mdfs_head.py", line 15, in <module>
    from mmdet.models.dense_heads import DeformableDETRHead, DETRHead
ImportError: cannot import name 'DeformableDETRHead' from 'mmdet.models.dense_heads' (/HOME/scz3687/.conda/envs/open-mmlab0130/lib/python3.7/site-packages/mmdet/models/dense_heads/__init__.py)

kkkcx commented 2 years ago

May I ask how you solved the "ModuleNotFoundError: No module named 'plugin'" problem? Thank you so much.

shb9793 commented 2 years ago

To solve the "No module named 'plugin'" error, use a relative path in your config file, as in https://github.com/Tsinghua-MARS-Lab/futr3d/blob/287274e3acda5883853d325e1ed09a76664cc2dc/plugin/futr3d/configs/lidar_cam/res101_01voxel_step_3e.py#L6 Do not use an absolute path.
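
The tracebacks in this thread show train.py printing plugin.futr3d and passing that name to importlib.import_module. One plausible way such a name is derived from a directory path is sketched below. This is an assumption about the script's behavior, not a copy of it, and plugin_dir_to_module and both example paths are hypothetical. It shows why a relative path yields an importable dotted name while an absolute path does not.

```python
import os


def plugin_dir_to_module(plugin_dir):
    """Turn a directory path such as 'plugin/futr3d/' into 'plugin.futr3d'."""
    module_dir = os.path.dirname(plugin_dir)
    return ".".join(module_dir.split("/"))


relative = plugin_dir_to_module("plugin/futr3d/")
absolute = plugin_dir_to_module("/home/user/futr3d/plugin/futr3d/")
print(relative)  # plugin.futr3d
print(absolute)  # .home.user.futr3d.plugin.futr3d
```

The second result starts with a dot and includes every parent directory, so it does not name a module that can be imported from the project root.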

lzzzzzm commented 1 year ago

When I run test.py, I encounter the same problem. My command is:

python tools/test.py plugin/futr3d/configs/cam_radar/res101_radar.py cam_rader.pth --out out.pkl

The following error occurs:

    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'plugin'

How can I solve this problem?