hasanirtiza / Pedestron

[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021
Apache License 2.0

Test/Demo generate blank results using Faster R-CNN trained on ECP, CityPersons. From the other side, the Faster R-CNN hrnet model does not converge. #138

Closed AndyVerne closed 2 years ago

AndyVerne commented 2 years ago

Thanks for your error report and we appreciate it a lot.


  1. I have searched related issues but cannot get the expected help. checked
  2. The bug has not been fixed in the latest version. checked

**Describe the bug**

When I trained the Faster R-CNN model via python tools/train.py configs/elephant/cityperson/faster_rcnn_hrnet.py, the trained model produced blank results (no detections), as shown below: [image] The same thing happened after I switched to ECP as the training dataset via python tools/train.py configs/elephant/eurocity/faster_rcnn_hrnet.py.

Meanwhile, when I train the Cascade Mask R-CNN model via python tools/train.py configs/elephant/cityperson/cascade_hrnet.py, everything works. [image]

I really have no clue why this happens. Any help is appreciated.


  1. What command or script did you run?

training command:

  1. python tools/train.py configs/elephant/cityperson/faster_rcnn_hrnet.py
  2. python tools/train.py configs/elephant/cityperson/cascade_hrnet.py

demo command:

  1. python tools/demo.py configs/elephant/cityperson/faster_rcnn_hrnet.py ./work_dirs/cityperson_faster_rcnn_hrnetv2p_w32/epoch_3.pth.stu demo/ result_demo_faster_r-cnn/
  2. python tools/demo.py configs/elephant/cityperson/cascade_hrnet.py ./work_dirs/cityperson_cascade_rcnn_hrnetv2p_w32/epoch_3.pth.stu demo/ result_demo/


2. Did you make any modifications on the code or config? Did you understand what you have modified?
3. What dataset did you use?
Tested on ECP and CityPersons; neither of the two Faster R-CNN models works.
 - OS: [e.g., Ubuntu 16.04.6] 
   Ubuntu 16.04.6
 - GCC [e.g., 5.4.0]
 - PyTorch version [e.g., 1.1.0]
- How you installed PyTorch [e.g., pip, conda, source]
- GPU model [e.g., 1080Ti, V100] 
- CUDA and CUDNN version

**Error traceback**
If applicable, paste the error traceback here.

Pedestron/tools/../mmdet/apis/inference.py:39: UserWarning: Class names are not saved in the checkpoint's meta data, use COCO classes by default. warnings.warn('Class names are not saved in the checkpoint\'s '
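
The warning above says the checkpoint's metadata carries no class names, so inference falls back to a default label set. The sketch below illustrates that kind of lookup on a plain dictionary. It is a minimal illustration and not the code from mmdet/apis/inference.py: the function name `resolve_class_names`, the `meta`/`CLASSES` layout, and the fallback tuple are assumptions made for the example.

```python
import warnings


def resolve_class_names(checkpoint: dict, fallback: tuple) -> tuple:
    """Return class names stored in the checkpoint, or `fallback` if none are stored.

    Assumes (hypothetically) that class names live under checkpoint["meta"]["CLASSES"].
    """
    meta = checkpoint.get("meta", {})
    if "CLASSES" in meta:
        return tuple(meta["CLASSES"])
    warnings.warn("Class names are not saved in the checkpoint's meta data; using fallback.")
    return tuple(fallback)


# Hypothetical stand-in for a default label set (not the real COCO list).
DEFAULT_CLASSES = ("person", "bicycle", "car")

# Two toy checkpoints: one without class names, one with them.
ckpt_without_names = {"meta": {}, "state_dict": {}}
ckpt_with_names = {"meta": {"CLASSES": ("pedestrian",)}, "state_dict": {}}

print(resolve_class_names(ckpt_with_names, DEFAULT_CLASSES))
print(resolve_class_names(ckpt_without_names, DEFAULT_CLASSES))
```

A fallback label set only affects how detections are named. On its own it would not explain an image with no boxes at all.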

***On the other hand, the model does not converge during training***

**Bug fix**
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

AndyVerne commented 2 years ago

More specifically, the loss_rpn_cls doesn't converge.
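
One way to confirm that a single loss term has stalled is to extract it from the training log and compare early and late values. The sketch below assumes the log is available as JSON lines with a `mode` field and a `loss_rpn_cls` field. That layout, the helper names, and the sample numbers are all assumptions made for illustration, not output from this issue.

```python
import json


def loss_series(log_lines, key="loss_rpn_cls"):
    """Collect the values of `key` from training records in JSON-lines log text."""
    values = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("mode") == "train" and key in record:
            values.append(float(record[key]))
    return values


def is_decreasing(values, window=2):
    """Return True if the mean of the last `window` values is below the mean of the first."""
    if len(values) < 2 * window:
        return False  # not enough points to compare
    head = sum(values[:window]) / window
    tail = sum(values[-window:]) / window
    return tail < head


# Made-up sample records, for illustration only.
sample_log = [
    '{"mode": "train", "epoch": 1, "iter": 50, "loss_rpn_cls": 0.69}',
    '{"mode": "train", "epoch": 1, "iter": 100, "loss_rpn_cls": 0.69}',
    '{"mode": "train", "epoch": 1, "iter": 150, "loss_rpn_cls": 0.70}',
    '{"mode": "train", "epoch": 1, "iter": 200, "loss_rpn_cls": 0.70}',
]

series = loss_series(sample_log)
print("loss_rpn_cls decreasing:", is_decreasing(series))
```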

hasanirtiza commented 2 years ago

Did you try changing the hyperparams? Just to get it straight: you can train Cascade RCNN, but not Faster RCNN?

AndyVerne commented 2 years ago

> Did you try changing the hyperparams? Just to get it straight: you can train Cascade RCNN, but not Faster RCNN?

Thanks for the reply. I didn't change the hyperparams, and the Cascade RCNN is fine. The Faster RCNN with HRNet doesn't work, while the Faster RCNN with ResNet101 does. I have no clue how to deal with it.

hasanirtiza commented 2 years ago

Then it is most probably the hyperparams. Play around with the learning rate; the learning rate in this repo is set for 8 GPUs. If you have fewer GPUs, use the linear scaling rule to adjust the learning rate.
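
The linear scaling rule keeps the learning rate proportional to the total batch size (number of GPUs times images per GPU). A minimal sketch follows, assuming a hypothetical reference setup of 8 GPUs with 2 images per GPU and a learning rate of 0.02. These numbers are illustrative and not taken from this repo's configs, so check the actual values in the config you train with.

```python
def scale_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Linear scaling rule: scale the learning rate in proportion to total batch size."""
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch / base_batch


# Hypothetical reference setup: 8 GPUs x 2 images per GPU, lr = 0.02.
BASE_LR = 0.02
BASE_BATCH = 8 * 2

# Example: training on a single GPU with 2 images per GPU.
single_gpu_lr = scale_lr(BASE_LR, BASE_BATCH, new_batch=1 * 2)
print(f"suggested lr for 1 GPU: {single_gpu_lr:.4f}")
```

The scaled value is a starting point; a model that still diverges may need a lower rate or a longer warmup.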

AndyVerne commented 2 years ago

> Then it is most probably the hyperparams. Play around with the learning rate; the learning rate in this repo is set for 8 GPUs. If you have fewer GPUs, use the linear scaling rule to adjust the learning rate.

Thank you so much. I really appreciate the replies! I will give it a try and post an update soon. :)