Feobi1999 / TDD

Apache License 2.0
Result without FPN #4

Open Pandaxia8 opened 2 years ago

Pandaxia8 commented 2 years ago

Hi, thanks for your awesome project! When I dug into the details of TDD, I found that ResNet_FPN is used instead of a plain ResNet, which makes the comparison with prior work unfair. As we know, FPN can considerably improve an object detection framework. So, did the authors run any experiments without the FPN structure? Thanks a lot.

In my experiment, even without the TPP and dual-branch modules, I still get 49.5 AP once the FPN structure is added to the backbone.
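To make the backbone difference concrete, here is a rough sketch of which feature maps a detector sees with a plain ResNet-50 versus ResNet-50 + FPN. It assumes the standard ResNet-50 stage strides and channel widths and the common FPN default of 256 output channels; the names `plain_resnet_features` and `fpn_features` are hypothetical helpers for illustration, not functions from the TDD code, and the repo's actual config may differ.

```python
import math

# Standard ResNet-50 stage outputs: name -> (stride, channels).
RESNET50_STAGES = {
    "C2": (4, 256),
    "C3": (8, 512),
    "C4": (16, 1024),
    "C5": (32, 2048),
}

# Common FPN default; an assumption here, not read from the TDD config.
FPN_CHANNELS = 256


def feature_shape(height, width, stride, channels):
    """Return (channels, H, W) of a feature map at the given stride."""
    return (channels, math.ceil(height / stride), math.ceil(width / stride))


def plain_resnet_features(height, width, stage="C4"):
    """Plain backbone: the detector head sees a single feature map."""
    stride, channels = RESNET50_STAGES[stage]
    return {stage: feature_shape(height, width, stride, channels)}


def fpn_features(height, width):
    """FPN backbone: one 256-channel map per stage (P2..P5), all scales."""
    out = {}
    for name, (stride, _) in RESNET50_STAGES.items():
        out["P" + name[1:]] = feature_shape(height, width, stride, FPN_CHANNELS)
    return out


if __name__ == "__main__":
    print(plain_resnet_features(512, 1024))
    print(fpn_features(512, 1024))
```

With a plain backbone the head works on one stride-16 map, while FPN gives it four scales including a stride-4 map, which is why the two settings are hard to compare directly.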

Feobi1999 commented 2 years ago

The VGG-based experiments were run without FPN. Previous literature is not uniform on whether to use FPN, and res50+FPN is a more natural combination, so we did not run an ablation of res50 without FPN.

Pandaxia8 commented 2 years ago

Thanks for your reply. I also noticed the strong performance of the VGG model in your paper, but after reading your code carefully, I see two problems that cannot be ignored.

First, in your code, the VGG model uses BN layers, which again makes the comparison unfair. A similar question is raised here: https://github.com/facebookresearch/adaptive_teacher/issues/16#issue-1261591338
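As a rough illustration of what the BN variant changes, the sketch below expands the standard VGG-16 layer configuration with and without batch normalization. It assumes the usual VGG-16 layout (13 conv layers, 5 max-pools); `vgg_layers` is a hypothetical helper that only lists layer names, and the VGG definition in this repo may be different.

```python
# Standard VGG-16 configuration: numbers are conv output channels,
# "M" marks a 2x2 max-pool.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def vgg_layers(cfg, batch_norm):
    """List the feature-extractor layers, optionally with BN after each conv."""
    layers = []
    for v in cfg:
        if v == "M":
            layers.append("MaxPool")
        else:
            layers.append(f"Conv3x3({v})")
            if batch_norm:
                layers.append(f"BatchNorm({v})")
            layers.append("ReLU")
    return layers


if __name__ == "__main__":
    plain = vgg_layers(VGG16_CFG, batch_norm=False)
    with_bn = vgg_layers(VGG16_CFG, batch_norm=True)
    n_bn = sum(name.startswith("BatchNorm") for name in with_bn)
    print(len(plain), len(with_bn), n_bn)
```

Under this layout the BN variant adds one normalization layer after every convolution, so the two VGG backbones are not the same model.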

Second, in my experiments I still obtained results beyond those in your paper without using the TPP and dual-branch modules. I trained on 4 NVIDIA GeForce 2080 Ti GPUs, with each mini-batch containing 8 images per GPU. The results show that the FPN structure makes a difference of 6-7% in AP50, which should not be overlooked.

In addition, I cannot find previous models that use R50 together with the FPN structure. For example, the adaptive detection frameworks based on the transformer structure use plain R50 models.

Looking forward to your reply. Thanks.

sysuzgg commented 1 year ago

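@Pandaxia8 @Feobi1999 Hello, could I ask how the TRAINL (target-like) dataset in the config file is generated? Although this dataset is set in the config, it does not seem to be used during training. Why is that? Also, what is the training procedure for this model? (Do I just train with Trainer: TDD, BURN_UP_STEP: 10000 set in the config file? The readme says training is split into 2 steps, and I don't quite understand that.)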