THU-MIG / RepViT

RepViT: Revisiting Mobile CNN From ViT Perspective [CVPR 2024] and RepViT-SAM: Towards Real-Time Segmenting Anything
https://arxiv.org/abs/2307.09283
Apache License 2.0

question about training recipe #38

Open CoinCheung opened 7 months ago

CoinCheung commented 7 months ago

Hi,

Thanks for making this work public! I have a question about the experiments in Table 5.

The paper states that RepViT is trained with the same recipe as MobileNetV3-L, which includes many modern training tricks. I assume the RepViT model in Table 5 was trained this way too. The table shows that ResNet18 is slightly faster than RepViT-M1.1, but its downstream-task performance is much worse. Was the ResNet18 model used here also trained with the MobileNetV3-L recipe, or is it the original ResNet18 trained for 100 epochs without the other tricks?

jameslahm commented 7 months ago

Thanks for your interest. The ResNet18 model in Table 5 is also trained under the same recipe as RepViT. We take the ResNet18 results in Table 5 from PVT, EfficientFormer, etc.