Separate initialize_containers and initialize_parameters to show correct usage. Without the split, the options framework does not work, because container construction sometimes depends on options that must be set first (for example, LSTM(2, 2).nlayers().make()). Keeping the two steps separate makes the intended usage more explicit.
Added GRUs, LSTMs, and RNNs with a tanh or relu nonlinearity
Tests LSTMs against the PyTorch implementation, and tests all of them against a simple counting example (the number of 1s in a string of 5 binary digits)
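
To illustrate why container construction has to wait until options are set, here is a minimal sketch of the two-phase pattern. It is not the code in this PR: `ToyLSTM`, its members, the `nlayers(n)` signature, and the weight layout (4 gates, no biases) are all hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an LSTM-like container (not the library's class).
class ToyLSTM {
 public:
  ToyLSTM(std::size_t input_size, std::size_t hidden_size)
      : input_size_(input_size), hidden_size_(hidden_size) {}

  // Option setter: returns *this so calls can be chained before make().
  ToyLSTM& nlayers(std::size_t n) {
    nlayers_ = n;
    return *this;
  }

  // make() runs both phases only after all options have been set.
  ToyLSTM make() {
    initialize_containers();
    initialize_parameters();
    return *this;
  }

  std::size_t num_layers() const { return weights_.size(); }

  std::size_t num_parameters() const {
    std::size_t total = 0;
    for (const auto& w : weights_) total += w.size();
    return total;
  }

 private:
  // Phase 1: build the structure. This depends on the nlayers option,
  // so it cannot run inside the constructor.
  void initialize_containers() {
    weights_.assign(nlayers_, std::vector<float>());
  }

  // Phase 2: allocate and fill values for the structure built in phase 1.
  void initialize_parameters() {
    assert(weights_.size() == nlayers_);
    for (std::size_t i = 0; i < weights_.size(); ++i) {
      // The first layer reads the input; later layers read the hidden state.
      const std::size_t in = (i == 0) ? input_size_ : hidden_size_;
      // Assumed layout: 4 gates, input and hidden weights, biases omitted.
      weights_[i].assign(4 * hidden_size_ * (in + hidden_size_), 0.0f);
    }
  }

  std::size_t input_size_;
  std::size_t hidden_size_;
  std::size_t nlayers_ = 1;  // default used when the option is not set
  std::vector<std::vector<float>> weights_;
};
```

In this sketch, `ToyLSTM(2, 2).nlayers(3).make()` builds three layers. If the containers were built in the constructor instead, the `nlayers` option would arrive too late to affect the structure.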
Unimplemented:
Bidirectional
Batch first
Fused non-CUDNN CUDA kernels (not bound in PyTorch ATen)