feiwang3311 / Lantern

BSD 3-Clause "New" or "Revised" License

[WIP] Implement cuDNN RNN ops and RNN module. #44

Closed dan-zheng closed 5 years ago

dan-zheng commented 5 years ago

This PR is a WIP, created to show progress and receive feedback. Please don't merge yet.


dan-zheng commented 5 years ago

Reference PyTorch RNN program used to cross-check Lantern's cuDNN RNN implementation:

import torch
import torch.nn as nn
import torch.nn.init as init

input_size = 10
hidden_size = 40
num_layers = 2
seq_length = 5
batch_size = 3
bidirectional = False
num_directions = 2 if bidirectional else 1

rnn = nn.RNN(input_size, hidden_size, num_layers, nonlinearity='relu', bias=True, dropout=0, bidirectional=bidirectional)
for p in rnn.parameters():
    init.constant_(p, 0.01)
input = torch.ones(seq_length, batch_size, input_size)
input.requires_grad = True
h0 = torch.ones(num_layers * num_directions, batch_size, hidden_size)
output, hn = rnn(input, h0)
# Lantern produces the same output value.

output.backward(torch.ones_like(output))
print(input.grad)
# Lantern produces the same input gradient value.
for p in rnn.parameters():
    print(p.grad)
    # Lantern doesn't produce the same parameter gradient values yet.
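Since the parameter gradients are the part that still disagrees, a finite-difference check against PyTorch's analytic gradients can help localize the discrepancy. The sketch below is a hypothetical debugging aid (not part of this PR): it runs the same constant-initialized RNN in double precision and compares the analytic gradient of a single weight entry against a central difference.

```python
import torch
import torch.nn as nn
import torch.nn.init as init

# Same configuration as the reference program, but in double precision
# so the finite-difference estimate is numerically trustworthy.
rnn = nn.RNN(10, 40, 2, nonlinearity='relu', bias=True).double()
for p in rnn.parameters():
    init.constant_(p, 0.01)

inp = torch.ones(5, 3, 10, dtype=torch.double)   # (seq, batch, input)
h0 = torch.ones(2, 3, 40, dtype=torch.double)    # (layers, batch, hidden)

def loss():
    out, _ = rnn(inp, h0)
    return out.sum()  # equivalent to backward(ones_like(output))

loss().backward()
w = next(rnn.parameters())         # weight_ih_l0
analytic = w.grad[0, 0].item()

# Central difference on the same weight entry.
eps = 1e-6
with torch.no_grad():
    w[0, 0] += eps
    plus = loss().item()
    w[0, 0] -= 2 * eps
    minus = loss().item()
    w[0, 0] += eps                 # restore original value
numeric = (plus - minus) / (2 * eps)
```

Repeating this per parameter tensor (e.g. `weight_hh_l0`, the biases, and the layer-1 weights) would show whether the mismatch is in a specific weight block, which is a common symptom of mis-mapping cuDNN's flattened parameter buffer back to individual tensors.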