seann999 / tensorflow_mnist_ram

An attempt to implement the recurrent attention model (RAM) from "Recurrent Models of Visual Attention" (Mnih+ 2014)
MIT License

Why is the cost < 0? #1

Open qingzew opened 8 years ago

seann999 commented 8 years ago

Sorry about the late response. The cost is basically the negative of a sum of two terms: the reward-weighted log-probabilities of the glimpse locations and the standard classification log-likelihood. So the model not only learns to make better classification predictions, but also learns to produce glimpse sequences that lead to better rewards. Usually that means looking at relevant locations; for MNIST, that typically means attending to the white digit rather than the black background. The cost should not be confused with accuracy, which is always >= 0.
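
To make that structure concrete, here is a minimal NumPy sketch of a combined cost of the kind described above. This is not the repository's actual TensorFlow code; the function name `ram_cost`, its arguments, and the toy numbers are illustrative assumptions.

```python
import numpy as np

def ram_cost(class_probs, true_label, loc_log_probs, reward):
    """Simplified illustration (not the repo's code) of the combined cost:
    negative (classification log-likelihood + reward-weighted
    log-probabilities of the sampled glimpse locations)."""
    # Classification term: negative log-likelihood of the true class.
    classification_nll = -np.log(class_probs[true_label])
    # REINFORCE-style term: the reward scales the summed log-probabilities
    # of the glimpse locations the policy actually sampled.
    reinforce_term = reward * np.sum(loc_log_probs)
    # Total cost = classification NLL minus the rewarded glimpse term.
    return classification_nll - reinforce_term

# Toy example: a confident correct classification plus a rewarded glimpse
# sequence can push the total cost below zero.
probs = np.array([0.01, 0.97, 0.02])  # softmax output over 3 classes
cost = ram_cost(probs, true_label=1,
                loc_log_probs=np.array([0.8, 1.1]),  # e.g. log Gaussian densities
                reward=1.0)
print(cost)  # negative in this toy case
```

In this toy case the reward-weighted location term outweighs the small classification loss; since a continuous location policy (a Gaussian, as in the RAM paper) can have log-densities greater than zero, this is one way the total cost can dip below zero.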