Closed: dan-zheng closed this pull request 5 years ago.
Reference PyTorch RNN program:
```python
import torch
import torch.nn as nn
import torch.nn.init as init

input_size = 10
hidden_size = 40
num_layers = 2
seq_length = 5
batch_size = 3
bidirectional = False
num_directions = 2 if bidirectional else 1

rnn = nn.RNN(input_size, hidden_size, num_layers, nonlinearity='relu',
             bias=True, dropout=0, bidirectional=bidirectional)
for p in rnn.parameters():
    init.constant_(p, 0.01)

input = torch.ones(seq_length, batch_size, input_size)
input.requires_grad = True
h0 = torch.ones(num_layers * num_directions, batch_size, hidden_size)
output, hn = rnn(input, h0)
# Lantern produces the same output value.

output.backward(torch.ones_like(output))
print(input.grad)
# Lantern produces the same input gradient value.

for p in rnn.parameters():
    print(p.grad)
# Lantern doesn't produce the same parameter gradient values yet.
```
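Because every weight and bias is initialized to 0.01 and the inputs and initial hidden state are all ones, every unit in a layer carries the same value, so the expected forward output can be checked by hand with a scalar recurrence. A minimal pure-Python sketch of that check (not part of the PR, just a sanity check of the values both frameworks should agree on):

```python
W = 0.01                       # every weight and bias element
input_size, hidden_size, seq_length = 10, 40, 5

# One scalar per layer: all units in a layer hold the same value.
h = [1.0, 1.0]                 # h0 is all ones for both layers
outputs = []
for t in range(seq_length):
    x = 1.0                    # every input element is 1
    # layer 1: relu(W_ih @ x + b_ih + W_hh @ h_prev + b_hh)
    h[0] = max(W * input_size * x + W + W * hidden_size * h[0] + W, 0.0)
    # layer 2 consumes layer 1's hidden state as its input
    h[1] = max(W * hidden_size * h[0] + W + W * hidden_size * h[1] + W, 0.0)
    outputs.append(h[1])       # output is the top layer's hidden state

print(outputs)                 # first element should be 0.628
```

Each `output[t]` in the PyTorch program above is a `(batch_size, hidden_size)` tensor filled with the corresponding scalar here.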
This PR is a WIP, created to show progress and receive feedback. Please don't merge yet.
- Added support for `cudnnRNNForwardInference` and `cudnnRNNForwardTraining`.
- The `Rnn` module is designed to match PyTorch's recurrent layers, e.g. `nn.RNN` and `nn.LSTM`. It facilitates PyTorch-style model building.
- `cudnnRNNBackwardData` seems to work. `cudnnRNNBackwardWeights` needs debugging: the values are incorrect.
- Updated `Module.registerParameters` to correctly register arrays of parameters (`ArrayBuffer[TensorR]`).
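One way to debug the incorrect `cudnnRNNBackwardWeights` values is a finite-difference check: perturb one weight, rerun the forward pass, and compare the resulting slope of the loss with the reported gradient. A minimal sketch of that technique on a toy one-unit ReLU RNN in pure Python (the functions here are illustrative stand-ins, not Lantern's or cuDNN's API):

```python
def forward(w, xs, h0):
    # toy one-unit ReLU RNN: h_t = relu(w * x_t + w * h_{t-1})
    h = h0
    loss = 0.0
    for x in xs:
        h = max(w * x + w * h, 0.0)
        loss += h                      # loss = sum of outputs
    return loss

def analytic_grad(w, xs, h0):
    # forward-mode derivative of the same recurrence w.r.t. w
    h, dh = h0, 0.0
    dloss = 0.0
    for x in xs:
        pre = w * x + w * h
        dpre = x + h + w * dh
        h = max(pre, 0.0)
        dh = dpre if pre > 0 else 0.0
        dloss += dh
    return dloss

w, xs, h0 = 0.5, [1.0, 2.0, 3.0], 1.0
eps = 1e-6
numeric = (forward(w + eps, xs, h0) - forward(w - eps, xs, h0)) / (2 * eps)
print(numeric, analytic_grad(w, xs, h0))  # the two should agree closely
```

The same idea applies to the real kernel: perturb a single entry of the packed cuDNN weight buffer and compare against the corresponding entry produced by `cudnnRNNBackwardWeights`.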
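For the `Module.registerParameters` point, the requirement is that parameters stored in containers (Lantern's `ArrayBuffer[TensorR]`) are collected alongside plain fields. A Python sketch of that registration logic (illustrative only; `Parameter`, `Module`, and `Rnn` here are stand-ins, not Lantern's Scala API):

```python
class Parameter:
    def __init__(self, value):
        self.value = value

class Module:
    def parameters(self):
        # Walk attributes, descending into submodules and containers,
        # so arrays of parameters are registered too.
        for v in vars(self).values():
            yield from Module._collect(v)

    @staticmethod
    def _collect(v):
        if isinstance(v, Parameter):
            yield v
        elif isinstance(v, Module):
            yield from v.parameters()
        elif isinstance(v, (list, tuple)):
            for item in v:
                yield from Module._collect(item)

class Rnn(Module):
    def __init__(self, num_layers):
        # one weight/bias pair per layer, stored in lists
        self.weights = [Parameter(0.01) for _ in range(num_layers)]
        self.biases = [Parameter(0.01) for _ in range(num_layers)]

rnn = Rnn(num_layers=2)
print(len(list(rnn.parameters())))  # 4: both lists are traversed
```

Without the container case, only top-level `Parameter` fields would be registered and the per-layer weights would silently receive no gradient updates.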