andersbll / cudarray

CUDA-based NumPy
MIT License

Cudnn 5 Support for GRU and LSTM cells #48

Open Henry-Chinner opened 8 years ago

Henry-Chinner commented 8 years ago

cuDNN 5 now supports GRU and LSTM cells natively, with speedups of up to 6 times compared to a cuBLAS implementation of GRUs and LSTMs. It would be great if CUDArray could support this.

https://devblogs.nvidia.com/parallelforall/optimizing-recurrent-neural-networks-cudnn-5/

andersbll commented 8 years ago

For the record, it is possible to build GRU/LSTM with vanilla NumPy/CUDArray.
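As a rough illustration of that point, here is a minimal sketch of a single GRU forward step written with plain NumPy array operations. The names `gru_step` and `init_params`, the parameter layout, and the initialization scale are hypothetical choices made for this example, not part of CUDArray. Swapping the import for `import cudarray as np` is an assumption based on CUDArray mirroring the NumPy API and has not been verified here.

```python
import numpy as np  # assumption: CUDArray could be substituted where it mirrors NumPy


def sigmoid(x):
    # Logistic function built from elementwise array ops only.
    return 1.0 / (1.0 + np.exp(-x))


def init_params(n_in, n_hidden, rng, scale=0.1):
    # Hypothetical helper: one (W, U, b) triple per gate, in the order z, r, h.
    params = []
    for _ in range(3):
        params.append(scale * rng.randn(n_in, n_hidden))      # input weights W
        params.append(scale * rng.randn(n_hidden, n_hidden))  # recurrent weights U
        params.append(np.zeros(n_hidden))                     # bias b
    return params


def gru_step(x, h_prev, params):
    # x: (batch, n_in), h_prev: (batch, n_hidden)
    W_z, U_z, b_z, W_r, U_r, b_r, W_h, U_h, b_h = params
    z = sigmoid(np.dot(x, W_z) + np.dot(h_prev, U_z) + b_z)  # update gate
    r = sigmoid(np.dot(x, W_r) + np.dot(h_prev, U_r) + b_r)  # reset gate
    # Candidate state: the reset gate scales the previous hidden state.
    h_tilde = np.tanh(np.dot(x, W_h) + np.dot(r * h_prev, U_h) + b_h)
    # Interpolate between the previous state and the candidate.
    return (1.0 - z) * h_prev + z * h_tilde


# Usage: unroll over a short random sequence of shape (time, batch, n_in).
rng = np.random.RandomState(0)
params = init_params(n_in=3, n_hidden=4, rng=rng)
sequence = rng.randn(5, 2, 3)
h = np.zeros((2, 4))
for x_t in sequence:
    h = gru_step(x_t, h, params)
```

Each time step is a handful of matrix products and elementwise operations, which is why it maps directly onto a NumPy-like array library. A fused cuDNN kernel would replace the body of `gru_step` rather than the surrounding loop.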

You are right, it would be nice to have optimized versions of these functions. However, I think it is equally important to implement parallel array operations, as you discuss in issue #43. This should allow us to take advantage of model parallelism in recurrent networks.

Henry-Chinner commented 8 years ago

Oh yes, I have implemented GRUs/LSTMs with CUDArray and it works quite satisfactorily.

I agree with you, parallelization of array ops will be a massive help for LSTMs/GRUs, especially when they are stacked.

Maybe optimized implementations of LSTM/GRU blocks are at too high a level of abstraction for CUDArray.

andersbll commented 8 years ago

> Maybe optimized implementations of LSTM/GRU blocks are at too high a level of abstraction for CUDArray.

There is always plenty of room in the cudarray.nnet submodule :), but I get your point. Ideally, CUDArray should be easier to extend with external modules. One could then imagine a cuDNN module that exposes the operations with CUDArray compatibility.