Epiphqny / VisTR

[CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers
https://arxiv.org/abs/2011.14503
Apache License 2.0

About the performance difference between versions of the arXiv paper #69

Closed. Scalsol closed this issue 2 years ago

Scalsol commented 2 years ago

Hi authors, thanks for this amazing work! I have a question about the performance difference between versions of this paper on arXiv. In the first version, the performance of R-50 is 34.4, while in the final version it is 36.2. Could you explain which modification led to this improvement? Thanks!

Epiphqny commented 2 years ago

Hi @Scalsol, the difference comes from the data augmentation: https://github.com/Epiphqny/VisTR/blob/445c9e4e787a1fb3c959d7e7bb6ecf809bdac155/datasets/ytvos.py#L157

Scalsol commented 2 years ago

OK, thanks!