When calculating the log probability of the sample, the code currently doesn't take into account that a non-linearity has been applied.
Specifically, https://github.com/kevinzakka/recurrent-visual-attention/blob/master/model.py#L109
assumes an untransformed normal distribution, but the sampled variable, `l_t`, has already been transformed: https://github.com/kevinzakka/recurrent-visual-attention/blob/master/modules.py#L350
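In simplified form, the mismatch looks roughly like this (a sketch of the pattern, not the repo's exact code; the shapes and `std` value are made up for illustration):

```python
import torch
from torch.distributions import Normal

# mu would come from the location network; std is the policy's fixed stddev
mu = torch.zeros(4, 2)
std = 0.17

dist = Normal(mu, std)
l_t = torch.tanh(dist.sample())  # the sample is squashed into [-1, 1]

# the log-prob is taken under the untransformed Normal, yet it is
# evaluated at the already-squashed l_t, so the density is computed
# for the wrong variable
log_pi = dist.log_prob(l_t).sum(dim=1)
```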
The easy solution is to calculate the log probs prior to applying the non-linearity, and to make `location_network` return the log probs and `l_t` (`mu` is no longer needed). This probably hasn't had much of an effect, since in the linear region of `tanh` it is fine, but it is theoretically incorrect.
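A minimal sketch of that change (the class name, constructor arguments, and linear head are assumptions, not the repo's exact interface):

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class LocationNetwork(nn.Module):
    """Sketch of the proposed fix: return log_pi and l_t, no mu."""

    def __init__(self, input_size, output_size, std):
        super().__init__()
        self.std = std
        self.fc = nn.Linear(input_size, output_size)

    def forward(self, h_t):
        mu = self.fc(h_t)                      # policy mean (assumed linear head)
        dist = Normal(mu, self.std)

        # sample and score in the *untransformed* space, before any squashing,
        # so the density matches the variable it is evaluated at
        raw = dist.rsample()
        log_pi = dist.log_prob(raw).sum(dim=1)

        # only now apply the non-linearity to get the location in [-1, 1]
        l_t = torch.tanh(raw)
        return log_pi, l_t
```

Scoring the pre-tanh sample under the pre-tanh Normal keeps the REINFORCE log-prob consistent with what was actually sampled; the tanh is then just a deterministic squash applied afterwards.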