lyuwenyu / RT-DETR

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Apache License 2.0

Questions about the performance analysis of `FrozenBatchNorm2d` #205

Open DrRyanHuang opened 7 months ago

DrRyanHuang commented 7 months ago

Hi, Happy Spring Festival, thx for your great work!

I profiled inference with the PyTorch code, and the reshape operations in FrozenBatchNorm2d (src/nn/backbone/common.py) appear to be a bottleneck.

[image: profiler screenshot]

Is there any way to solve this problem?
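
For readers who want to take a similar measurement, here is a minimal sketch using `torch.profiler`. The post does not say which profiler was used, and the small model and input size below are placeholders, not the RT-DETR backbone.

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

# Placeholder model: swap in the real network to profile it instead.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
).eval()
x = torch.randn(1, 3, 64, 64)

# Record CPU-side operator timings over a few forward passes.
with torch.no_grad(), profile(activities=[ProfilerActivity.CPU]) as prof:
    for _ in range(5):
        model(x)

# Aggregate by operator name and list the most expensive ones first.
report = prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10)
print(report)
```

Operators such as `aten::reshape` show up as separate rows in this table, which is one way to see whether they account for a large share of the time.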

lyuwenyu commented 7 months ago

A simple solution is to replace FrozenBatchNorm2d with BatchNorm2d before deployment. You can do this by adding a member function convert_to_deploy to the backbone.

def convert_to_deploy(self):
    # Replace each `FrozenBatchNorm2d` with an equivalent `BatchNorm2d` here.
    ...
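
As a rough illustration of what the body of `convert_to_deploy` could do, the sketch below swaps every frozen layer for an eval-mode `nn.BatchNorm2d` carrying the same statistics. The `FrozenBatchNorm2d` class here is a simplified stand-in written for this example (assumed layout: `weight`, `bias`, `running_mean`, `running_var` buffers plus an `eps` attribute), and the helpers `frozen_to_bn` and `replace_frozen_bn` are hypothetical names, not part of the repository.

```python
import torch
import torch.nn as nn


class FrozenBatchNorm2d(nn.Module):
    """Simplified stand-in for the repo's class (assumed buffer layout)."""

    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.register_buffer("weight", torch.ones(num_features))
        self.register_buffer("bias", torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))
        self.eps = eps

    def forward(self, x):
        # The per-call reshapes are the operations the question is about.
        w = self.weight.reshape(1, -1, 1, 1)
        b = self.bias.reshape(1, -1, 1, 1)
        rv = self.running_var.reshape(1, -1, 1, 1)
        rm = self.running_mean.reshape(1, -1, 1, 1)
        scale = w * (rv + self.eps).rsqrt()
        return x * scale + (b - rm * scale)


def frozen_to_bn(frozen):
    """Build an eval-mode BatchNorm2d with the same affine terms and statistics."""
    bn = nn.BatchNorm2d(frozen.weight.numel(), eps=frozen.eps)
    with torch.no_grad():
        bn.weight.copy_(frozen.weight)
        bn.bias.copy_(frozen.bias)
        bn.running_mean.copy_(frozen.running_mean)
        bn.running_var.copy_(frozen.running_var)
    bn.requires_grad_(False)
    # Eval mode makes BatchNorm2d use the stored running statistics.
    return bn.eval()


def replace_frozen_bn(module):
    """Recursively replace FrozenBatchNorm2d children in place."""
    for name, child in list(module.named_children()):
        if isinstance(child, FrozenBatchNorm2d):
            setattr(module, name, frozen_to_bn(child))
        else:
            replace_frozen_bn(child)
    return module
```

Under these assumptions, `convert_to_deploy` could simply call `replace_frozen_bn(self)`. The converted layers only match the frozen ones while the model stays in eval mode, and whether the swap actually removes the bottleneck would need to be confirmed by profiling again.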

See this call stack for where convert_to_deploy is invoked:

  1. https://github.com/lyuwenyu/RT-DETR/blob/main/rtdetr_pytorch/tools/export_onnx.py#L37
  2. https://github.com/lyuwenyu/RT-DETR/blob/main/rtdetr_pytorch/src/zoo/rtdetr/rtdetr.py#L39

If you get any results, please share them here.