THUNLP-MT / THUMT

An open-source neural machine translation toolkit developed by Tsinghua Natural Language Processing Group
BSD 3-Clause "New" or "Revised" License

Add encdec_attention cache to transformer.py to speed up inference. #116

Open liushaokong opened 2 years ago

liushaokong commented 2 years ago
  1. Add an encdec_attention cache to model/transformer.py; caching the encoder-side projections helps speed up inference.
  2. When converting the PyTorch model (model.pt) to ONNX models (as in fastt5), it is necessary to show how the encdec attention cache is used.
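The idea behind the request can be sketched as follows: in encoder-decoder attention, the key/value projections depend only on the encoder output, so they can be computed once and reused at every decoding step instead of being recomputed. This is a minimal, hypothetical illustration (class and parameter names are my own, not THUMT's actual API):

```python
import torch
import torch.nn as nn


class CachedEncDecAttention(nn.Module):
    """Sketch of encoder-decoder attention with a key/value cache.

    The encoder memory never changes during decoding, so its K/V
    projections are computed on the first step and cached.
    """

    def __init__(self, hidden_size, num_heads):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.q_proj = nn.Linear(hidden_size, hidden_size)
        self.k_proj = nn.Linear(hidden_size, hidden_size)
        self.v_proj = nn.Linear(hidden_size, hidden_size)
        self.o_proj = nn.Linear(hidden_size, hidden_size)

    def _split_heads(self, x):
        # (batch, time, hidden) -> (batch, heads, time, head_dim)
        b, t, _ = x.shape
        return x.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)

    def forward(self, query, memory, cache=None):
        # The query projection must be recomputed every step.
        q = self._split_heads(self.q_proj(query))

        if cache is None or "k" not in cache:
            # First decoding step: project the encoder output once.
            k = self._split_heads(self.k_proj(memory))
            v = self._split_heads(self.v_proj(memory))
            if cache is not None:
                cache["k"], cache["v"] = k, v
        else:
            # Subsequent steps: reuse the cached projections.
            k, v = cache["k"], cache["v"]

        scores = torch.matmul(q, k.transpose(-2, -1)) / self.head_dim ** 0.5
        attn = torch.softmax(scores, dim=-1)
        out = torch.matmul(attn, v).transpose(1, 2).reshape(query.shape)
        return self.o_proj(out)
```

With this structure, the cache dict can also be exposed as an explicit input/output when exporting the decoder to ONNX (the approach fastt5 takes for T5), since ONNX graphs cannot hold mutable Python state across calls.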