
A Theoretically Grounded Application of Dropout in RNN (2016) #1


esvhd commented 6 years ago

Paper by Gal & Ghahramani, 2016.

Lua code available here

Introduces the Variational LSTM: the same dropout mask is reused at every time step, for both the input and the recurrent connections, instead of sampling a fresh mask at each step as in naive dropout.

*[figure: dropout]*
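A minimal PyTorch sketch of this idea (class and variable names are my own, not from the paper's Lua code): the masks `zx` and `zh` are sampled once per sequence and then applied at every time step.

```python
import torch
import torch.nn as nn

class VariationalLSTM(nn.Module):
    """Variational dropout around an LSTM cell: one dropout mask per
    sequence, reused at every time step."""

    def __init__(self, input_size, hidden_size, dropout=0.25):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.hidden_size = hidden_size
        self.dropout = dropout

    def forward(self, x):
        # x: (seq_len, batch, input_size)
        seq_len, batch, _ = x.shape
        h = x.new_zeros(batch, self.hidden_size)
        c = x.new_zeros(batch, self.hidden_size)

        # Sample the masks ONCE per sequence, not once per time step.
        keep = 1.0 - self.dropout
        if self.training and self.dropout > 0:
            zx = torch.bernoulli(x.new_full((batch, x.size(2)), keep)) / keep
            zh = torch.bernoulli(x.new_full((batch, self.hidden_size), keep)) / keep
        else:
            zx = zh = 1.0  # no dropout at eval time (unless doing MC dropout)

        outs = []
        for t in range(seq_len):
            # The same zx / zh are applied at every step: this is what makes
            # the dropout "variational" rather than naive.
            h, c = self.cell(x[t] * zx, (h * zh, c))
            outs.append(h)
        return torch.stack(outs)  # (seq_len, batch, hidden_size)
```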

Formulation

With untied weights, a different dropout mask can be used for each gate (input, forget, output, and cell candidate); see the formulation sketched after the figure below.

*[figure: lstm]*
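For reference, my reconstruction of the tied-weights formulation from the paper, where the masks $z_x, z_h$ are sampled once and repeated at all time steps, and $\circ$ denotes element-wise multiplication:

$$
\begin{pmatrix} i \\ f \\ o \\ g \end{pmatrix}
=
\begin{pmatrix} \sigma \\ \sigma \\ \sigma \\ \tanh \end{pmatrix}
\left( \begin{pmatrix} x_t \circ z_x \\ h_{t-1} \circ z_h \end{pmatrix} W \right)
$$

$$
c_t = f \circ c_{t-1} + i \circ g, \qquad h_t = o \circ \tanh(c_t)
$$

In the untied-weights case, $W$ splits into per-gate matrices $W_i, W_f, W_o, W_g$, so each gate can be given its own pair of masks.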

Monte Carlo (MC) Dropout:

Obtained by keeping dropout active at test time, running 1000 stochastic forward passes, and averaging the model outputs, following equation (4) in the paper.

*[figure: predict]*
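A minimal PyTorch sketch of this prediction step (the function name is my own, and using `model.train()` to keep dropout stochastic is a common shortcut that also affects layers like batch norm, so treat it as illustrative only):

```python
import torch

def mc_dropout_predict(model, x, n_samples=1000):
    """MC dropout: average n_samples stochastic forward passes,
    leaving dropout active at test time."""
    was_training = model.training
    model.train()  # keep dropout layers in sampling mode
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    model.train(was_training)
    # Predictive mean (the averaging step), plus the sample spread
    # as a rough uncertainty proxy.
    return preds.mean(dim=0), preds.std(dim=0)
```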

Experiments / Results

*[figure: results]*

esvhd commented 6 years ago

Found some nice notes here on this topic, which compare TensorFlow and PyTorch implementations.