Btw, the log-softmax values and gradients were verified via PyTorch:
import torch
import torch.nn.functional as F
a = torch.Tensor(range(6)).reshape(2, 3)
a.requires_grad = True
b = F.log_softmax(a, dim=1)
b.backward(torch.ones(2, 3))
print(b)
# tensor([[-2.4076, -1.4076, -0.4076],
#         [-2.4076, -1.4076, -0.4076]], grad_fn=<LogSoftmaxBackward>)
print(a.grad)
# tensor([[ 0.7299,  0.2658, -0.9957],
#         [ 0.7299,  0.2658, -0.9957]])
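For reference, the gradient values above can be reproduced by hand from the log-softmax VJP: with an upstream gradient `g`, the input gradient is `g - softmax(x) * sum(g)` per row. A minimal NumPy sketch (variable names here are illustrative, not from the original):

```python
import numpy as np

# Same input as the PyTorch snippet above: [[0, 1, 2], [3, 4, 5]].
x = np.arange(6, dtype=np.float64).reshape(2, 3)
upstream = np.ones_like(x)  # b.backward(torch.ones(2, 3)) uses an all-ones upstream gradient.

# Numerically stable softmax / log-softmax along the last axis.
shifted = x - x.max(axis=1, keepdims=True)
denom = np.exp(shifted).sum(axis=1, keepdims=True)
softmax = np.exp(shifted) / denom
log_softmax = shifted - np.log(denom)

# VJP of log_softmax: upstream - softmax * (row sum of upstream).
grad = upstream - softmax * upstream.sum(axis=1, keepdims=True)

print(log_softmax)  # ~[[-2.4076, -1.4076, -0.4076], [-2.4076, -1.4076, -0.4076]]
print(grad)         # ~[[ 0.7299,  0.2658, -0.9957], [ 0.7299,  0.2658, -0.9957]]
```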
Low-priority todos:

- Add `softmax(axis: Int)`, matching other reduction ops (a rough sketch of the axis semantics follows below).
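Regarding the `softmax(axis: Int)` todo, the intended behavior (normalize along a chosen dimension, like other reduction ops) could look like this rough NumPy sketch; the helper here is purely illustrative, not the library's actual API:

```python
import numpy as np

def softmax(x, axis=-1):
    # Hypothetical helper sketching axis-parameterized softmax semantics:
    # subtract the max along `axis` for numerical stability, then normalize.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

a = np.arange(6, dtype=np.float64).reshape(2, 3)
print(softmax(a, axis=1))  # each row sums to 1
print(softmax(a, axis=0))  # each column sums to 1
```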