We're hiring! If you like what we're building here, come join us at LMNT.
Haste is a CUDA implementation of fused RNN layers with built-in DropConnect and Zoneout regularization. These layers are exposed through C++ and Python APIs for easy integration into your own projects or machine learning frameworks.
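To make the two regularization schemes concrete, here is a toy pure-Python sketch (not Haste's fused CUDA implementation; the function names are illustrative only): Zoneout randomly keeps individual hidden units at their previous value, while DropConnect randomly zeroes individual weights.

```python
import random

def zoneout(h_prev, h_new, rate, training=True, rng=random):
    # Zoneout: each hidden unit keeps its previous value with
    # probability `rate` during training (a per-unit identity mask).
    if not training:
        # At inference, use the expected value instead of sampling.
        return [rate * p + (1.0 - rate) * n for p, n in zip(h_prev, h_new)]
    return [p if rng.random() < rate else n for p, n in zip(h_prev, h_new)]

def dropconnect(weights, rate, training=True, rng=random):
    # DropConnect: zero individual *weights* (not activations) with
    # probability `rate`; survivors are rescaled by 1/(1-rate) so the
    # expected value of the weight matrix is unchanged.
    if not training:
        return weights
    scale = 1.0 / (1.0 - rate)
    return [[0.0 if rng.random() < rate else w * scale for w in row]
            for row in weights]
```

Haste fuses these masks into the RNN kernels themselves, which is what avoids the overhead of applying them as separate ops.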
Which RNN types are supported?

- GRU
- IndRNN
- LSTM
- LayerNormGRU
- LayerNormLSTM

What's included in this project?

- a standalone C++ API (libhaste)
- a TensorFlow Python API (haste_tf)
- a PyTorch API (haste_pytorch)
For questions or feedback about Haste, please open an issue on GitHub or send us an email at haste@lmnt.com.
Here's what you'll need to get started:

- a CUDA-capable GPU and the CUDA Toolkit
- TensorFlow, if you want the haste_tf API
- PyTorch, if you want the haste_pytorch API
Once you have the prerequisites, you can install with pip or by building the source code.
```shell
pip install haste_pytorch
pip install haste_tf
```
```shell
make               # Build everything
make haste         # Build C++ API
make haste_tf      # Build TensorFlow API
make haste_pytorch # Build PyTorch API
make examples      # Build C++ examples
make benchmarks    # Build benchmark programs
```
If you built the TensorFlow or PyTorch API, install it with pip:

```shell
pip install haste_tf-*.whl
pip install haste_pytorch-*.whl
```
If the CUDA Toolkit that you're building against is not in /usr/local/cuda, you must specify the $CUDA_HOME environment variable before running make:

```shell
CUDA_HOME=/usr/local/cuda-10.2 make
```
Our LSTM and GRU benchmarks indicate that Haste has the fastest publicly available implementation for nearly all problem sizes. The following charts show our LSTM results, but the GRU results are qualitatively similar.
Here is our complete LSTM benchmark result grid:
(Benchmark charts omitted; the grid covers batch sizes N ∈ {1, 32, 64, 128} and hidden sizes C ∈ {64, 128, 256, 512}.)
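The benchmark programs themselves live in benchmarks/. As a general pattern (this is a minimal sketch, not Haste's actual harness), timing code like this should warm up before measuring and average over many iterations:

```python
import time

def benchmark(fn, *args, warmup=3, iters=10):
    # Warm up first so one-time costs (allocation, caching, lazy
    # initialization) don't skew the measurement.
    for _ in range(warmup):
        fn(*args)
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    # Mean wall-clock seconds per iteration.
    return (time.perf_counter() - start) / iters
```

For GPU kernels, note that launches are asynchronous: you would also need to synchronize the device (e.g. `torch.cuda.synchronize()`) before reading the clock, or the timing only captures launch overhead.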
```python
import tensorflow as tf
import haste_tf as haste

gru_layer = haste.GRU(num_units=256, direction='bidirectional', zoneout=0.1, dropout=0.05)
indrnn_layer = haste.IndRNN(num_units=256, direction='bidirectional', zoneout=0.1)
lstm_layer = haste.LSTM(num_units=256, direction='bidirectional', zoneout=0.1, dropout=0.05)
norm_gru_layer = haste.LayerNormGRU(num_units=256, direction='bidirectional', zoneout=0.1, dropout=0.05)
norm_lstm_layer = haste.LayerNormLSTM(num_units=256, direction='bidirectional', zoneout=0.1, dropout=0.05)

# `x` is a tensor with shape [N,T,C]
x = tf.random.normal([5, 25, 128])

y, state = gru_layer(x, training=True)
y, state = indrnn_layer(x, training=True)
y, state = lstm_layer(x, training=True)
y, state = norm_gru_layer(x, training=True)
y, state = norm_lstm_layer(x, training=True)
```
The TensorFlow Python API is documented in docs/tf/haste_tf.md.
```python
import torch
import haste_pytorch as haste

gru_layer = haste.GRU(input_size=128, hidden_size=256, zoneout=0.1, dropout=0.05)
indrnn_layer = haste.IndRNN(input_size=128, hidden_size=256, zoneout=0.1)
lstm_layer = haste.LSTM(input_size=128, hidden_size=256, zoneout=0.1, dropout=0.05)
norm_gru_layer = haste.LayerNormGRU(input_size=128, hidden_size=256, zoneout=0.1, dropout=0.05)
norm_lstm_layer = haste.LayerNormLSTM(input_size=128, hidden_size=256, zoneout=0.1, dropout=0.05)

gru_layer.cuda()
indrnn_layer.cuda()
lstm_layer.cuda()
norm_gru_layer.cuda()
norm_lstm_layer.cuda()

# `x` is a CUDA tensor with shape [T,N,C]
x = torch.rand([25, 5, 128]).cuda()

y, state = gru_layer(x)
y, state = indrnn_layer(x)
y, state = lstm_layer(x)
y, state = norm_gru_layer(x)
y, state = norm_lstm_layer(x)
```
The PyTorch API is documented in docs/pytorch/haste_pytorch.md.
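Note the layout difference between the two APIs: the TensorFlow example above takes input with shape [N,T,C] (batch-major), while the PyTorch example takes [T,N,C] (time-major). In practice you would convert with `tf.transpose(x, [1, 0, 2])` or `x.permute(1, 0, 2)`; the helper below (illustrative only, not part of Haste) shows the same transpose on nested lists:

```python
def batch_major_to_time_major(x):
    """Convert a nested list from [N][T][C] (batch-major, as in the
    TensorFlow API) to [T][N][C] (time-major, as the PyTorch API expects)."""
    N, T = len(x), len(x[0])
    return [[x[n][t] for n in range(N)] for t in range(T)]
```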
The C++ API is documented in lib/haste/*.h, and there are code samples in examples/.
- benchmarks/: programs to evaluate performance of RNN implementations
- docs/tf/: API reference documentation for haste_tf
- docs/pytorch/: API reference documentation for haste_pytorch
- examples/: examples for writing your own C++ inference / training code using libhaste
- frameworks/tf/: TensorFlow Python API and custom op code
- frameworks/pytorch/: PyTorch API and custom op code
- lib/: CUDA kernels and C++ API
- validation/: scripts to validate output and gradients of RNN layers

Note: the GRU implementation follows 1406.1078v1 (same as cuDNN) rather than 1406.1078v3.
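The practical difference between the two GRU formulations is where the reset gate is applied in the candidate activation. As I read them (a scalar toy sketch, illustrative only; the real layers operate on matrices), the v1/cuDNN form applies the reset gate after the recurrent transform and its bias, while the v3 form applies it to the hidden state beforehand:

```python
import math

def gru_candidate_v1(x, h, r, W, U, b):
    # v1 / cuDNN-style: reset gate scales the *transformed* recurrent
    # term, including its bias: tanh(W*x + r * (U*h + b))
    return math.tanh(W * x + r * (U * h + b))

def gru_candidate_v3(x, h, r, W, U, b):
    # v3-style: reset gate is applied to the hidden state *before*
    # the recurrent transform: tanh(W*x + U*(r*h) + b)
    return math.tanh(W * x + U * (r * h) + b)
```

The two agree when the reset gate is fully open (r = 1) but diverge otherwise, which is why trained weights are not interchangeable between the formulations.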
To cite this work, please use the following BibTeX entry:
```bibtex
@misc{haste2020,
  title        = {Haste: a fast, simple, and open RNN library},
  author       = {Sharvil Nanavati},
  year         = 2020,
  month        = "Jan",
  howpublished = {\url{https://github.com/lmnt-com/haste/}},
}
```