kevinzakka / recurrent-visual-attention

A PyTorch Implementation of "Recurrent Models of Visual Attention"
MIT License
468 stars 123 forks

Is the M parameter necessary, given that it is not mentioned in the paper? (the number of Monte Carlo samples used during validation and test) #25

Closed LuxxxLucy closed 4 years ago

LuxxxLucy commented 4 years ago

In this implementation, there is an M parameter in validation and test mode that duplicates the input. The same input instance is processed by the RAM model M times and the predictions are averaged.

When I remove this part, so that there is no duplication of instances and no averaging (just as in train mode), performance drops sharply. This suggests that performance actually converges much more slowly than the averaged numbers indicate.

Also, the use of multiple duplicates of each input instance, referred to as Monte Carlo sampling in your code, does not seem to be a necessary part of the method. The paper doesn't mention it (correct me if I am wrong).

Is this a design choice intended to improve test-time performance? @kevinzakka

kevinzakka commented 4 years ago

Hey @LuxxxLucy, if I recall correctly, this is a trick that wasn't mentioned in this paper but in their follow-up: https://arxiv.org/pdf/1412.7755.pdf (top of page 5). And yeah, it's used to average the predictions and obtain better results.
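
For readers following along, here is a minimal NumPy sketch of the averaging trick discussed in this thread: tile each input M times, run the stochastic model once on the enlarged batch, then average the M predictions per input. It is not the repository's code. `toy_stochastic_model`, `mc_predict`, and the noise scale are hypothetical names and values chosen for illustration, and averaging log-probabilities (rather than probabilities) is an assumption of this sketch.

```python
import numpy as np


def log_softmax(z):
    # Numerically stable log-softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def toy_stochastic_model(x, rng):
    # Hypothetical stand-in for a stochastic attention model: the added
    # Gaussian noise mimics the randomness of sampled glimpse locations.
    # x has shape (N, C) and holds clean class scores.
    noisy = x + rng.normal(scale=2.0, size=x.shape)
    return log_softmax(noisy)


def mc_predict(model, x, M, rng):
    # Monte Carlo averaging over M stochastic forward passes.
    batch = x.shape[0]
    x_rep = np.tile(x, (M, 1))                    # (M * batch, C)
    log_probs = model(x_rep, rng)                 # one pass over all copies
    log_probs = log_probs.reshape(M, batch, -1)   # (M, batch, C)
    avg = log_probs.mean(axis=0)                  # average over the M samples
    return avg.argmax(axis=1)                     # (batch,) predicted classes


rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=500)
x = np.eye(3)[labels] * 2.0  # clean scores favour the true class

acc_1 = (mc_predict(toy_stochastic_model, x, 1, rng) == labels).mean()
acc_10 = (mc_predict(toy_stochastic_model, x, 10, rng) == labels).mean()
print(f"accuracy with M=1: {acc_1:.3f}, with M=10: {acc_10:.3f}")
```

In this toy setting, M=1 corresponds to the single-pass evaluation described in the question, and a larger M reduces the variance of the prediction. That is consistent with the accuracy gap reported above, though the size of the gap here depends entirely on the made-up noise level.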

LuxxxLucy commented 4 years ago

@kevinzakka Thank you very much. This repo is a really good, clean implementation.