Hi guys and thanks for the tutorial/great API.
I have run into some "strange" behavior while playing with the code. Using my own (fairly small) dataset and different architectures/parameter sets ("classic LSTM", "LSTM + attention", etc., with num_layers=2, 4, ... and num_units=256, 512, ...), I noticed that my machine's RAM usage grows continuously whenever any attention mechanism is used.
However, this behavior does not show up when I set "pass_hidden_state" to False. Are the two somehow linked, and is this "expected"? I would be grateful to anyone with an explanation.
(Using tf-1.2)
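
For context, this is roughly the decoder setup I mean. It's a minimal sketch (the encoder tensors and shapes here are hypothetical placeholders, not my actual model), but it shows the two initial-state paths that pass_hidden_state switches between in the nmt code:

```python
import tensorflow as tf

batch_size = 32
num_units = 256
src_max_len = 20

# Hypothetical stand-ins for real encoder outputs/state.
encoder_outputs = tf.random_normal([batch_size, src_max_len, num_units])
encoder_state = tf.contrib.rnn.LSTMStateTuple(
    c=tf.random_normal([batch_size, num_units]),
    h=tf.random_normal([batch_size, num_units]))
source_sequence_length = tf.fill([batch_size], src_max_len)

# Decoder cell wrapped with Luong attention, as in the tutorial.
attention_mechanism = tf.contrib.seq2seq.LuongAttention(
    num_units, encoder_outputs,
    memory_sequence_length=source_sequence_length)
decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
    tf.contrib.rnn.BasicLSTMCell(num_units),
    attention_mechanism,
    attention_layer_size=num_units)

# pass_hidden_state=True: clone the encoder's final state into the
# decoder's initial AttentionWrapperState.
initial_state_pass = decoder_cell.zero_state(
    batch_size, tf.float32).clone(cell_state=encoder_state)

# pass_hidden_state=False: start decoding from a fresh zero state.
initial_state_zero = decoder_cell.zero_state(batch_size, tf.float32)
```

The memory growth only appears when I use the first variant (cloning the encoder state in); with the plain zero state it stays flat.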