Open ppotash opened 8 years ago
Hi,
Great code and thanks for sharing :). How do you run the model with dropout? It doesn't seem to actually be implemented in the training process.
-Peter
Hi Peter,
Unfortunately, I did not actually implement dropout (although there is a dropout_layer method I forgot to delete...).
If you'd like to add dropout layer(s), they could probably go between the input and encoder layers, in the _lstm function (line 271), and between the decoder and output layers, in _ptr_probs (line 287),
in a similar way to this code.
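For example, something like this (a minimal sketch in the style of the Theano LSTM tutorial; trng, use_noise, and the commented call sites are illustrative, not part of the current code):

import numpy
import theano
import theano.tensor as tensor
from theano.sandbox.rng_mrg import MRG_RandomStreams

trng = MRG_RandomStreams(seed=1234)
# 1.0 at training time, 0.0 at test time
use_noise = theano.shared(numpy.float32(0.))

def dropout_layer(state_before, use_noise, trng):
    # training: multiply by a random binary mask;
    # testing: scale by the keep probability instead
    return tensor.switch(use_noise,
                         state_before * trng.binomial(state_before.shape,
                                                      p=0.5, n=1,
                                                      dtype=state_before.dtype),
                         state_before * 0.5)

# hypothetical call sites:
# emb = dropout_layer(emb, use_noise, trng)  # before _lstm (line 271)
# dec = dropout_layer(dec, use_noise, trng)  # before _ptr_probs (line 287)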
cheers xiaoxi
Hi Xiaoxi,
Thanks for the response. If I put it in the lstm function, I need some switch to say whether I'm in training or testing mode, right? And this would also need to be an extra parameter to the function, I imagine.
-Peter
Ok, I see how you do it from the code you linked to. Very 'theanic' :).
Yes, very theanic ;)
and maybe here is a better solution: theano_toolkit
-- Edit -- Sorry, wrong link: neural-turing-machines
A smarter way, using closures to build layers.
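i.e. roughly this pattern (a sketch of the idea, not the actual code from that repo):

import theano.tensor as tensor
from theano.sandbox.rng_mrg import MRG_RandomStreams

def build_dropout(trng, drop_p=0.5):
    # trng is captured in the closure, so the returned layer is a purely
    # symbolic function and nothing non-symbolic is passed through scan
    def dropout(x, is_training):
        mask = trng.binomial(x.shape, p=1. - drop_p, n=1, dtype=x.dtype)
        return tensor.switch(is_training, x * mask, x * (1. - drop_p))
    return dropout

trng = MRG_RandomStreams(seed=1234)
dropout = build_dropout(trng)
# h = dropout(h, is_training)  # is_training: a float scalar, 1.0/0.0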
I'm still confused about using dropout in this implementation. Does trng need to be passed to the lstm function directly as an extra parameter? I tried that and got this error:
theano.tensor.var.AsTensorError: ('Cannot convert <theano.sandbox.rng_mrg.MRG_RandomStreams object at 0x7f1c87425890> to TensorType', <class 'theano.sandbox.rng_mrg.MRG_RandomStreams'>)
I also tried leaving it as a global variable in the ptr_network function, and that didn't work (and similarly tried initializing it in build_model and passing it as a parameter to the ptr_network function).
-Peter
I guess you used an integer as the flag for training/testing; it should be a float, e.g.
tensor.switch(is_training, ...)
The is_training flag should be 1.0/0.0 instead of 1/0.
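e.g. a tiny self-contained sketch (the x * 2. / x * 0.5 branches are just placeholders):

import numpy
import theano
import theano.tensor as tensor

x = tensor.matrix('x')
# a float shared variable, not a Python int
use_noise = theano.shared(numpy.float32(0.))
proj = tensor.switch(use_noise, x * 2., x * 0.5)  # placeholder branches
f = theano.function([x], proj)

use_noise.set_value(1.0)  # before training
use_noise.set_value(0.0)  # before testing/sampling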
I'm talking about the RandomStreams instance used for the binomial function.
-Peter
FYI, I posted about this on Stack Overflow: http://stackoverflow.com/questions/39606372/dropout-in-scan-theano
I've tried changing dropout_layer into:
def dropout_layer(state_before, use_noise, trng, shape):
    # note: an explicit, fixed shape is passed in instead of state_before.shape
    proj = tensor.switch(use_noise,
                         (state_before *
                          # trng.binomial(state_before.shape,
                          trng.binomial(shape,
                                        p=0.5, n=1,
                                        dtype=state_before.dtype)),
                         state_before * 0.5)
    return proj
and call it as:
h = dropout_layer(h, 1.0, trng, (options['batch_size'], options['dim_proj']))
and comment out these lines in the get_minibatches_idx function so that every minibatch matches that shape:
def get_minibatches_idx(n, minibatch_size, shuffle=False):
    """
    Used to shuffle the dataset at each iteration.
    """
    idx_list = numpy.arange(n, dtype="int32")

    if shuffle:
        numpy.random.shuffle(idx_list)

    minibatches = []
    minibatch_start = 0
    for i in range(n // minibatch_size):
        minibatches.append(idx_list[minibatch_start:minibatch_start + minibatch_size])
        minibatch_start += minibatch_size

    ''' remove these lines
    if minibatch_start != n:
        # Make a minibatch out of what is left
        minibatches.append(idx_list[minibatch_start:])
    '''

    return zip(range(len(minibatches)), minibatches)
It works in the training stage, but it fails in the evaluation stage because the shape will be (options['beam_width'], options['dim_proj']). So I guess a temporary solution would be to build another f_encode method for this shape for beam search...
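i.e. something like this (a hypothetical sketch of that workaround; f_encode_beam is an invented name):

# training graph: dropout on, minibatch-shaped mask
h_train = dropout_layer(h, 1.0, trng,
                        (options['batch_size'], options['dim_proj']))
# beam-search graph: dropout off, but the mask shape still has to match
h_beam = dropout_layer(h, 0.0, trng,
                       (options['beam_width'], options['dim_proj']))
# f_encode      = theano.function(...)  # compiled from the training graph
# f_encode_beam = theano.function(...)  # compiled from the beam-search graph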