lisa-groundhog / GroundHog

Library for implementing RNNs with Theano
BSD 3-Clause "New" or "Revised" License
598 stars, 229 forks

About LSTM Layer in GroundHog #23

Closed gaoyuankidult closed 9 years ago

gaoyuankidult commented 9 years ago

Hello

I have checked the LSTM layer in GroundHog / groundhog / layers / rec_layers.py. I wonder whether this is a complete, standard LSTM layer (e.g., as described in Alex Graves's paper, http://arxiv.org/pdf/1308.0850v5.pdf ) or a prototype for now.

I didn't see a bias term in the `# input/output gate update` sections. Did I miss it? By the way, do you have an example using this layer?

thanks.

By the way, I can write a wiki (tutorial) about it, if there is some example.

Formulas described in the paper:

(screenshot of the LSTM equations from Graves's paper)

Code of the LSTM layer:

(screenshot of the LSTM layer code from rec_layers.py)
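For comparison, one LSTM step in the Graves style, including the bias terms asked about above, can be sketched in plain NumPy. This is an illustrative sketch, not GroundHog's implementation; the peephole connections of the full Graves formulation are omitted for brevity, and all parameter names are made up here:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step (Graves-style, no peepholes), with bias terms b*."""
    Wi, Ui, bi, Wf, Uf, bf, Wc, Uc, bc, Wo, Uo, bo = params
    i = sigmoid(x @ Wi + h_prev @ Ui + bi)                    # input gate
    f = sigmoid(x @ Wf + h_prev @ Uf + bf)                    # forget gate
    c = f * c_prev + i * np.tanh(x @ Wc + h_prev @ Uc + bc)   # new cell state
    o = sigmoid(x @ Wo + h_prev @ Uo + bo)                    # output gate
    h = o * np.tanh(c)                                        # new hidden state
    return h, c

# tiny usage example: batch of 2, input dim 4, hidden dim 3
rng = np.random.RandomState(0)
params = [rng.randn(*s) for _ in range(4) for s in [(4, 3), (3, 3), (3,)]]
h, c = lstm_step(rng.randn(2, 4), np.zeros((2, 3)), np.zeros((2, 3)), params)
```

Dropping the `b*` terms from the gate lines gives exactly the bias-less variant discussed in this thread.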

rizar commented 9 years ago

@kyunghyuncho , you wrote the LSTM layer.

kyunghyuncho commented 9 years ago

Sorry about the late reply!

As @gaoyuankidult correctly noticed, the implementation in GroundHog lacks the bias terms for the gates, which I believe wouldn't make much difference. Also, note that there are a number of variants of LSTMs (see, e.g., http://arxiv.org/pdf/1409.2329.pdf).

You can train a neural machine translation model using LSTM by setting:

    state['enc_rec_layer'] = 'LSTMLayer'                                                          
    state['enc_rec_gating'] = False
    state['enc_rec_reseting'] = False
    state['dec_rec_layer'] = 'LSTMLayer'                                                          
    state['dec_rec_gating'] = False
    state['dec_rec_reseting'] = False                                                             
    state['dim_mult'] = 4
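If it helps to see why `dim_mult` is 4: an LSTM computes four affine transforms per step (input gate, forget gate, cell candidate, output gate), so its fused weight matrices are four times as wide as a plain RNN's. A hypothetical sketch of the fused layout (names are illustrative, not GroundHog's):

```python
import numpy as np

dim = 5                       # hidden size
rng = np.random.RandomState(0)
W = rng.randn(dim, 4 * dim)   # fused weights: hence dim_mult = 4
x = rng.randn(2, dim)         # batch of 2

pre = x @ W                   # one matrix product covers all four transforms
i, f, g, o = np.split(pre, 4, axis=1)  # slice out the four pre-activations
```

With `dim_mult = 1` (the default for a plain or gated RNN), the allocated matrices would be too small for an LSTM, which typically shows up as a gemm shape mismatch at graph execution time.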

Note, though, that this feature was tested only internally, quite some time ago. If you run into any issues with it, please leave a comment here with the details and I'll look into it.

infinitezxc commented 9 years ago

Hello @kyunghyuncho

When I am trying to train a model using LSTM, I run into this error:

ValueError: dimension mismatch in args to gemm (512,2000)x(1000,1000)->(512,1000)
Apply node that caused the error: GpuDot22(GpuReshape{2}.0, W_0_dec_repr_readout)
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(512, 2000), (1000, 1000)]
Inputs strides: [(2000, 1), (1000, 1)]
Inputs values: ['not shown', 'not shown']

Do you have any suggestions? Thanks!
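Reading the shapes: `W_0_dec_repr_readout` was built for a 1000-dimensional input, but the reshaped decoder representation carries 2000 features (2 × 1000), which suggests the state's dimension settings are inconsistent somewhere; that is a guess from the shapes alone. The failing product itself is trivial to reproduce outside Theano:

```python
import numpy as np

act = np.zeros((512, 2000))        # decoder representation: 2000 features
W_readout = np.zeros((1000, 1000)) # readout weight expecting 1000 features

try:
    act @ W_readout  # same (512,2000) x (1000,1000) mismatch as the gemm error
except ValueError as err:
    print("shape mismatch:", err)
```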

kyunghyuncho commented 9 years ago

Can you provide your state variables?


infinitezxc commented 9 years ago

Thanks for the quick reply :) @kyunghyuncho

I am using the default state from prototype_phrase_lstm_state(), with the source and target paths modified, and I added a prototype_phrase_lstm_state line at the end of `__init__.py`.

guxd commented 8 years ago

@infinitezxc Have you solved the LSTM error? I've run into the same one.