eladb3 / ORViT

"Object-Region Video Transformers", Herzig et al., CVPR 2022
Apache License 2.0

Object Region Attention block #8

Closed. AshwinRamachandran2002 closed this issue 2 years ago

AshwinRamachandran2002 commented 2 years ago

The code applies the steps after RoIAlign in a different order than the paper describes. The paper places the MLP layer after max pooling, but the code applies the MLP layer first and then does max pooling.
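To illustrate why the order matters, here is a minimal pure-Python sketch of the two orderings. It is not the ORViT code: the helper names, the two-layer MLP, the toy weights, and the 2-position, 2-channel RoI features are all assumptions chosen only to make the difference visible.

```python
# Illustrative sketch only: toy weights and shapes, not the ORViT implementation.

def linear(x, weight, bias):
    """Apply y = W x + b to a single feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def relu(x):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in x]

def mlp(x, params):
    """Hypothetical two-layer MLP: linear -> ReLU -> linear."""
    w1, b1, w2, b2 = params
    return linear(relu(linear(x, w1, b1)), w2, b2)

def max_pool(features):
    """Channel-wise max over the spatial positions of one RoI."""
    return [max(channel) for channel in zip(*features)]

def pool_then_mlp(roi_features, params):
    """Order described in the paper: max pooling, then the MLP."""
    return mlp(max_pool(roi_features), params)

def mlp_then_pool(roi_features, params):
    """Order reported for the code: MLP at each position, then max pooling."""
    return max_pool([mlp(x, params) for x in roi_features])

# Hypothetical RoI-aligned features: 2 spatial positions x 2 channels.
roi = [[1.0, -1.0], [-1.0, 1.0]]

# Hypothetical weights: (W1, b1, W2, b2), with W2 set to the identity.
params = (
    [[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
)

a = pool_then_mlp(roi, params)
b = mlp_then_pool(roi, params)
print(a)  # [2.0, 0.0]
print(b)  # [0.0, 2.0]
```

Because the MLP is nonlinear, it does not commute with max pooling in general, so the two orderings can produce different object descriptors from the same RoI features.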

eladb3 commented 2 years ago

Our next revision will correct this. Thanks for pointing this out.