jihunchoi / recurrent-batch-normalization-pytorch

PyTorch implementation of recurrent batch normalization

Size mismatch when using bnlstm and layer>1 - solution #6

Closed: danarte closed this issue 6 years ago

danarte commented 6 years ago

Hello, I am using your implementation in my project and encountered a size mismatch error when using bnlstm with 2 layers.

The mismatch happens because the second layer is initialized with an input size equal to hidden_size (see https://github.com/jihunchoi/recurrent-batch-normalization-pytorch/blob/master/bnlstm.py#L237), but the forward function still passes the original input, which has input_size features, to every layer (see https://github.com/jihunchoi/recurrent-batch-normalization-pytorch/blob/master/bnlstm.py#L288). As a result, at https://github.com/jihunchoi/recurrent-batch-normalization-pytorch/blob/master/bnlstm.py#L211 the input_ tensor has input_size features, while the second layer's cell expects hidden_size features.
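For concreteness, here is a minimal sketch of the shape problem using plain torch.nn.LSTMCell and made-up sizes (not the repository's classes): the second layer's input-to-hidden weights expect hidden_size features, so it can only consume the previous layer's output.

    import torch
    import torch.nn as nn

    input_size, hidden_size, batch = 10, 20, 4

    # Mirrors the setup in bnlstm.py: layer 0 is built with input_size,
    # while every deeper layer is built with input_size = hidden_size (L237).
    cell0 = nn.LSTMCell(input_size, hidden_size)
    cell1 = nn.LSTMCell(hidden_size, hidden_size)

    x = torch.randn(batch, input_size)
    h0, c0 = cell0(x)    # fine: x has input_size features
    h1, c1 = cell1(h0)   # fine: layer 1 consumes layer 0's output
    # cell1(x) would raise a size mismatch, which is what happens when
    # forward() passes the raw input to every layer (L288).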

To solve this issue, I changed lines 288-289 to this:

            if layer == 0:
                layer_output, (layer_h_n, layer_c_n) = LSTM._forward_rnn(
                    cell=cell, input_=input_, length=length, hx=hx)
            else:
                layer_output, (layer_h_n, layer_c_n) = LSTM._forward_rnn(
                    cell=cell, input_=layer_output, length=length, hx=hx)
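For reference, a minimal smoke test along these lines should exercise the 2-layer path (a sketch only; the constructor and forward arguments are based on my reading of bnlstm.py and may need adjusting):

    import torch
    from bnlstm import LSTM, BNLSTMCell

    # Assumed argument names from bnlstm.py; max_length is the number of
    # time steps for which per-step batch-norm statistics are kept.
    model = LSTM(cell_class=BNLSTMCell, input_size=10, hidden_size=20,
                 num_layers=2, max_length=50)

    # Time-major input: (time, batch, features)
    x = torch.randn(50, 4, 10)
    output, (h_n, c_n) = model(x)
    print(output.shape)  # expected (50, 4, 20) once layer > 0 consumes layer_output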

Thanks for publishing the implementation. You can close this issue whenever you wish.

P.S.: I would also appreciate it if you could look into the other open issue about reset_parameters; alternatively, if reset_parameters does not affect the network much, we could simply keep the current default and leave it at that.

jihunchoi commented 6 years ago

Thanks for pointing out the bug! It would be even better if you could open a PR that fixes the issue. :)
