chainer / onnx-chainer

Add-on package for ONNX format support in Chainer
MIT License

RNN support for ONNX-Chainer #13

Open sw005320 opened 6 years ago

sw005320 commented 6 years ago

I just raised this issue because we use RNNs (LSTMs) intensively for attention-based end-to-end ASR (https://github.com/espnet/espnet) with Chainer and PyTorch as backends, and we want to unify these two backends to some extent through the ONNX framework. I would really appreciate it if you could tell us when and how this will be supported.

mitmul commented 6 years ago

We are still at an exploratory stage, trying to find a good way to identify an LSTM block inside a dynamically generated computational graph. In current Chainer, an LSTM consists of several Linear functions and activation functions, and it appears as a series of those primitive functions in the resulting computational graph. We would therefore need to annotate the LSTM part during the forward pass, but there is no way to do that at the moment.

On the other hand, the current ONNX specification requires both the batch size and the sequence length to be fixed in order to describe a single LSTM operator, while both may change in actual inference programs, so I think current ONNX is not expressive enough to represent RNNs. We do want to support exporting RNNs to ONNX, though, so we will keep looking for a good approach. If you have any ideas for solving these problems, your suggestions are very welcome.
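For context, here is a minimal sketch of the first problem, assuming Chainer v3+ and its `creator_node` / `FunctionNode` graph API (the layer sizes are hypothetical). Walking backwards from the output of `chainer.links.LSTM` yields only primitive function nodes, with nothing marking them as one LSTM block:

```python
import numpy as np
import chainer
import chainer.links as L

# Hypothetical toy sizes, just to build a graph.
lstm = L.LSTM(in_size=4, out_size=3)
x = chainer.Variable(np.zeros((2, 4), dtype=np.float32))
y = lstm(x)

# Walk the graph backwards from the output. Every node printed is a
# primitive (LinearFunction, the element-wise gate computation, ...);
# there is no single node spanning the whole LSTM that an exporter
# could map directly onto ONNX's LSTM operator.
seen, stack = set(), [y.creator_node]
while stack:
    node = stack.pop()
    if node is None or id(node) in seen:
        continue
    seen.add(id(node))
    print(type(node).__name__)
    stack.extend(inp.creator_node for inp in node.inputs)
```

And a sketch of the second problem on the ONNX side, using the official `onnx.helper` API (again with hypothetical sizes): a single `LSTM` node consumes the whole sequence as one input `X` of shape `[seq_length, batch_size, input_size]`, so both of those dimensions get written into the serialized graph:

```python
import onnx
from onnx import helper, TensorProto

# Hypothetical fixed sizes; they get baked into the model proto.
seq_len, batch, input_size, hidden = 10, 2, 4, 3

# X: [seq_length, batch_size, input_size]
X = helper.make_tensor_value_info('X', TensorProto.FLOAT,
                                  [seq_len, batch, input_size])
# W: [num_directions, 4 * hidden_size, input_size]
W = helper.make_tensor_value_info('W', TensorProto.FLOAT,
                                  [1, 4 * hidden, input_size])
# R: [num_directions, 4 * hidden_size, hidden_size]
R = helper.make_tensor_value_info('R', TensorProto.FLOAT,
                                  [1, 4 * hidden, hidden])
# Y: [seq_length, num_directions, batch_size, hidden_size]
Y = helper.make_tensor_value_info('Y', TensorProto.FLOAT,
                                  [seq_len, 1, batch, hidden])

node = helper.make_node('LSTM', inputs=['X', 'W', 'R'], outputs=['Y'],
                        hidden_size=hidden)
graph = helper.make_graph([node], 'lstm_example', [X, W, R], [Y])
model = helper.make_model(graph)
onnx.checker.check_model(model)
```

Running inference with a different batch size or sequence length would then contradict the shapes recorded in the model.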

sw005320 commented 6 years ago

I see your point. I'm sorry I don't have a good solution for this right now, but I will let you know once I come up with something. Thanks for your answer.