apache/mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

[Feature Request] Add Simple Recurrent Unit (SRU) #13925

Open Ishitori opened 5 years ago

Ishitori commented 5 years ago

Description

Simple Recurrent Unit (SRU) is a new recurrent network architecture that parallelizes better than regular LSTM/GRU cells and, in some cases, performs better than CNNs used for sequential data. The original code, written in PyTorch + CUDA, is open source and available at https://github.com/taolei87/sru

It would be great to port this layer into MXNet as it seems to be a basic building block for other models.
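For context, here is a minimal, untested sketch of the SRU recurrence in Gluon, following the equations in the paper behind the repo linked above. The `SRUCell` name, the fused `proj` layer, and the assumption that input size equals hidden size (needed for the highway connection) are mine, not from the reference code; the real speedup there comes from a fused CUDA kernel, which this Python time loop does not attempt to reproduce.

```python
from mxnet import nd
from mxnet.gluon import nn

class SRUCell(nn.Block):
    """Single-layer SRU sketch (Lei et al., https://github.com/taolei87/sru).

    All matrix multiplications depend only on the inputs, so they run once
    over the whole sequence; only cheap elementwise ops recur over time.
    """
    def __init__(self, hidden_size, **kwargs):
        super(SRUCell, self).__init__(**kwargs)
        self.hidden_size = hidden_size
        with self.name_scope():
            # One fused projection yields x~_t plus the pre-activations of
            # the forget and reset gates in a single matmul per sequence.
            self.proj = nn.Dense(3 * hidden_size, flatten=False)

    def forward(self, inputs, c0=None):
        # inputs: (seq_len, batch, hidden_size); the highway connection
        # below assumes input size == hidden size.
        u = self.proj(inputs)  # (seq_len, batch, 3 * hidden_size)
        x_tilde, f_pre, r_pre = nd.split(u, num_outputs=3, axis=2)
        f = nd.sigmoid(f_pre)  # forget gate
        r = nd.sigmoid(r_pre)  # reset gate
        c = c0 if c0 is not None else nd.zeros_like(x_tilde[0])
        outputs = []
        for t in range(inputs.shape[0]):
            # c_t = f_t * c_{t-1} + (1 - f_t) * x~_t   (elementwise only)
            c = f[t] * c + (1 - f[t]) * x_tilde[t]
            # h_t = r_t * tanh(c_t) + (1 - r_t) * x_t  (highway connection)
            h = r[t] * nd.tanh(c) + (1 - r[t]) * inputs[t]
            outputs.append(h)
        return nd.stack(*outputs), c

# Quick smoke test: 10 steps, batch of 4, 128 features.
cell = SRUCell(hidden_size=128)
cell.initialize()
out, state = cell(nd.random.normal(shape=(10, 4, 128)))
print(out.shape, state.shape)  # (10, 4, 128) (4, 128)
```

Because the gate computations are precomputed for the whole sequence, the per-step work is purely elementwise, which is what makes SRU amenable to the kind of kernel fusion the reference CUDA implementation uses.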

@szha

piyushghai commented 5 years ago

@mxnet-label-bot Add [Feature Request]

pengzhao-intel commented 5 years ago

@Ishitori do you have an application using SRU now?

We have a CPU implementation in house. If your use case works on CPU, we can consider upstreaming the initial version. @ciyongch @TaoLv

eric-haibin-lin commented 5 years ago

+1 on this.

chinakook commented 5 years ago

Those bar charts look nice, but SRU needs more layers to match the accuracy of LSTM. Overall, the speeds would end up nearly equal, because LSTM is being rapidly optimized in cuDNN.