Open pcuenca opened 1 year ago
Hi @zhiyuanzhai!
So far we have focused on improving the performance of autoregressive models. We hope the lessons learned can be transferred to encoder-decoder models down the line, although there are challenges in making attention caching work effectively for them.
Any updates?