hustvl / MIMDet

[ICCV 2023] You Only Look at One Partial Sequence
https://arxiv.org/abs/2204.02964
MIT License
About sincos_pos_embed #15

Closed qiy20 closed 2 years ago

qiy20 commented 2 years ago

(https://github.com/hustvl/MIMDet/blob/main/models/benchmarking.py#L592) I don't understand why we need to do this, since self.pos_embed has already been initialized with the resized pos_embed from the pretrained checkpoint.

vealocia commented 2 years ago

Hi, @qiy20! We have tried using a learnable resized position embedding, and it gives improvements of about 0.6 bbox AP and 0.4 segm AP. Code and results will be updated after more experiments are done. Thanks for your attention!

qiy20 commented 2 years ago

@vealocia Thanks for your reply! I'd like to know which type of pos_embed was used in the reproduction that reaches 48.0 AP (25e).

vealocia commented 2 years ago

The 48.0 AP result was obtained with sincos_pos_embed=True, which means a re-initialized, frozen sin/cos position embedding.
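
For readers unfamiliar with the term, a fixed sin/cos position embedding is computed from patch coordinates rather than learned. The sketch below is a minimal pure-Python illustration of the 2D sin/cos scheme commonly used with ViT backbones. It is not the MIMDet implementation: the function names `sincos_1d` and `sincos_2d`, the base of 10000, and the row-major layout are all assumptions made for the example.

```python
import math


def sincos_1d(embed_dim, positions):
    """Return one embed_dim-long sin/cos vector per position (hypothetical helper)."""
    assert embed_dim % 2 == 0, "need an even width to pair sin with cos"
    half = embed_dim // 2
    # Geometric frequency schedule: omega_i = 1 / 10000^(i / half) (assumed base).
    omega = [1.0 / (10000 ** (i / half)) for i in range(half)]
    rows = []
    for p in positions:
        angles = [p * w for w in omega]
        # First half of the vector holds sines, second half holds cosines.
        rows.append([math.sin(a) for a in angles] + [math.cos(a) for a in angles])
    return rows


def sincos_2d(embed_dim, grid_h, grid_w):
    """Return a fixed table with grid_h * grid_w rows of length embed_dim."""
    assert embed_dim % 4 == 0, "half the channels encode rows, half encode columns"
    half = embed_dim // 2
    row_codes = sincos_1d(half, range(grid_h))
    col_codes = sincos_1d(half, range(grid_w))
    table = []
    # Row-major order: one entry per patch, row code followed by column code.
    for y in range(grid_h):
        for x in range(grid_w):
            table.append(row_codes[y] + col_codes[x])
    return table


table = sincos_2d(embed_dim=8, grid_h=2, grid_w=3)
print(len(table), len(table[0]))  # 6 8
```

Because the table depends only on the grid size and the embedding width, it can be rebuilt for any input resolution and needs no gradient updates, which is the sense of "frozen" here. In PyTorch, one common way to do this is to store such a table in an `nn.Parameter` created with `requires_grad=False`.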

qiy20 commented 2 years ago

Many thanks!

Yuxin-CV commented 2 years ago

I believe the issue at hand has been addressed, so I'm closing this. Feel free to ask if you have further questions.