kevinzakka / recurrent-visual-attention

A PyTorch Implementation of "Recurrent Models of Visual Attention"
MIT License
469 stars 124 forks

Location embeddings #42

Closed: darkknight314 closed this issue 3 years ago

darkknight314 commented 3 years ago

The glimpse network generates its location embedding as a 128-dimensional vector by passing just two inputs, the x and y coordinates, through a neural network. Could someone kindly explain the rationale behind this decision?
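To make the setup in question concrete, here is a minimal sketch of a 2-to-128 location embedding. It is written in plain Python rather than PyTorch so it runs without dependencies, and it is not the repository's actual code: the names `make_linear` and `embed_location`, the ReLU activation, the initialization range, and the sample location are all assumptions for illustration.

```python
import random


def make_linear(in_dim, out_dim, seed=0):
    """Create weights and biases for one fully connected layer.

    Hypothetical helper: the uniform init range is an assumption,
    not taken from the repository.
    """
    rng = random.Random(seed)
    bound = 1.0 / in_dim ** 0.5
    weights = [
        [rng.uniform(-bound, bound) for _ in range(in_dim)]
        for _ in range(out_dim)
    ]
    biases = [rng.uniform(-bound, bound) for _ in range(out_dim)]
    return weights, biases


def embed_location(loc, weights, biases):
    """Map an (x, y) location to a ReLU-activated feature vector."""
    return [
        max(0.0, sum(w * v for w, v in zip(row, loc)) + b)
        for row, b in zip(weights, biases)
    ]


# Two inputs (x, y) in, 128 features out, as described in the question.
weights, biases = make_linear(in_dim=2, out_dim=128)
embedding = embed_location((0.25, -0.5), weights, biases)
print(len(embedding))
```

In PyTorch the same shape of computation would typically be a single `nn.Linear(2, 128)` followed by a ReLU, though whether the repository does exactly that is not confirmed here.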