dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0
2.56k stars 538 forks source link

Implementation of Incremental Decoding #1582

Open barry-jin opened 2 years ago

barry-jin commented 2 years ago

Description

The implementation of incremental decoding in gluon-nlp is somewhat different from fairseq. In fairseq, the keys/values both before and after linear projection are memorialized, but in gluon-nlp, only the keys/values before the linear projection is memorialized. This difference leads to different execution number of FC operators (In fairseq, keys/values are directly pulled from prev_keys/prev_values; In gluon-nlp, two more linear projections are needed to get the projectioned keys/values). We may need to correct the gluon-nlp's implementation of incremental decoding.