
Add Minimal Gated Unit (MGU) #55030

Open carlosgmartin opened 3 years ago

carlosgmartin commented 3 years ago

🚀 Feature

Add Minimal Gated Unit (MGU).

Motivation

From "Minimal gated unit for recurrent neural networks" (171 citations):

We propose a gated unit for RNN, named as Minimal Gated Unit (MGU), since it only contains one gate, which is a minimal design among all gated hidden units. The design of MGU benefits from evaluation results on LSTM and GRU in the literature. Experiments on various sequence data show that MGU has comparable accuracy with GRU, but has a simpler structure, fewer parameters, and faster training. Hence, MGU is suitable in RNN's applications. Its simple architecture also means that it is easier to evaluate and tune, and in principle it is easier to study MGU's properties theoretically and empirically.

Pitch

Add an MGU implementation to torch.nn, e.g. MGU and MGUCell modules analogous to the existing GRU and GRUCell (see the sketch below).
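For concreteness, here is a minimal sketch of what such a cell could look like. The class name MGUCell and the concatenated-input parameterization are illustrative only, not an existing PyTorch API; the update equations follow the paper (a single forget gate, a candidate state, and a convex combination of old and candidate states).

```python
import torch
from torch import nn

class MGUCell(nn.Module):
    """Minimal Gated Unit cell (Zhou et al., 2016). Illustrative sketch.

    f_t  = sigmoid(W_f [x_t, h_{t-1}] + b_f)      # single forget gate
    h~_t = tanh(W_c [x_t, f_t * h_{t-1}] + b_c)   # candidate state
    h_t  = (1 - f_t) * h_{t-1} + f_t * h~_t
    """

    def __init__(self, input_size: int, hidden_size: int) -> None:
        super().__init__()
        self.hidden_size = hidden_size
        # Each projection acts on the concatenation of input and hidden state.
        self.forget = nn.Linear(input_size + hidden_size, hidden_size)
        self.candidate = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        f = torch.sigmoid(self.forget(torch.cat([x, h], dim=-1)))
        h_tilde = torch.tanh(self.candidate(torch.cat([x, f * h], dim=-1)))
        return (1 - f) * h + f * h_tilde

# Example: one step on a batch of 32 inputs.
cell = MGUCell(input_size=10, hidden_size=20)
x = torch.randn(32, 10)
h = torch.zeros(32, 20)
h = cell(x, h)  # (32, 20)
```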

cc @albanD @mruberry @jbschlosser

jbschlosser commented 3 years ago

Hey @carlosgmartin, thanks for the suggestion! Note that we try to maximize the value provided by the modules we adopt into PyTorch core. As this is a relatively old paper that never reached the popularity of other RNN variants, it's unclear to me at this point whether it belongs in core. However, if this request becomes popular enough, we can certainly reconsider later on.

If you need a performant MGU implementation for your work, see this blog post discussing how to implement performant custom RNN cells / layers within PyTorch, which links to example code; a rough sketch of that approach follows.
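In broad strokes, that approach amounts to writing the per-timestep recurrence as a plain Python loop and compiling it with TorchScript so the elementwise gate math can be fused. The sketch below applies it to MGU under stated assumptions: the function name mgu_layer and the weight layout are mine, not from the referenced post or PyTorch itself.

```python
from typing import List

import torch
from torch import Tensor
import torch.nn.functional as F

@torch.jit.script
def mgu_layer(x: Tensor, h: Tensor,
              w_f: Tensor, b_f: Tensor,
              w_c: Tensor, b_c: Tensor) -> Tensor:
    """Run an MGU over x of shape (seq_len, batch, input_size).

    w_f, w_c: (hidden_size, input_size + hidden_size); b_f, b_c: (hidden_size,).
    TorchScript compiles the loop, allowing the pointwise ops to be fused.
    """
    outputs: List[Tensor] = []
    for t in range(x.size(0)):
        f = torch.sigmoid(F.linear(torch.cat([x[t], h], dim=1), w_f, b_f))
        c = torch.tanh(F.linear(torch.cat([x[t], f * h], dim=1), w_c, b_c))
        h = (1 - f) * h + f * c
        outputs.append(h)
    return torch.stack(outputs)  # (seq_len, batch, hidden_size)
```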