awslabs / sockeye

Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch
https://awslabs.github.io/sockeye/
Apache License 2.0

Add blocking cross-attention between decoder and encoded prepended tokens #1085

Closed: xingniu closed this 1 year ago

xingniu commented 1 year ago

Add blocking of cross-attention between the decoder and encoded prepended tokens. Prepended tokens are the source tokens up to and including a specified tag.
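To illustrate the idea, here is a minimal PyTorch sketch (not the actual Sockeye implementation; `prepend_block_mask` and `tag_id` are hypothetical names) that builds a boolean mask marking the prepended source positions, which a cross-attention layer could then block:

```python
import torch


def prepend_block_mask(source: torch.Tensor, tag_id: int) -> torch.Tensor:
    """Mark source positions up to and including the first `tag_id` token.

    source: (batch, src_len) tensor of source token ids.
    Returns a bool mask of shape (batch, src_len); True where the decoder's
    cross-attention should be blocked. Sequences without the tag get an
    all-False row (no prepended tokens).
    """
    batch, src_len = source.shape
    is_tag = source == tag_id  # (batch, src_len)
    # Index of the first tag per sequence; -1 if the tag is absent.
    first_tag = torch.where(
        is_tag.any(dim=1),
        is_tag.long().argmax(dim=1),
        torch.full((batch,), -1, dtype=torch.long, device=source.device),
    )
    positions = torch.arange(src_len, device=source.device).unsqueeze(0)
    # Block every position before the tag, and the tag itself.
    return positions <= first_tag.unsqueeze(1)


if __name__ == "__main__":
    src = torch.tensor([[11, 12, 99, 13, 14],
                        [13, 14, 15, 16, 17]])  # second row has no tag
    print(prepend_block_mask(src, tag_id=99))
    # tensor([[ True,  True,  True, False, False],
    #         [False, False, False, False, False]])
```

In a scaled dot-product cross-attention with scores of shape (batch, heads, tgt_len, src_len), such a mask would broadcast over the query dimension, e.g. `scores.masked_fill(mask[:, None, None, :], float("-inf"))`.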

Add a new dictionary-based prepared data format (version 7) that supports storing the lengths of prepended source tokens. The previous format (version 6) remains supported.
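A rough sketch of what such a dictionary-based shard with a per-sample prepended-length field might look like, and how a loader could stay backward compatible; the field names and file layout here are illustrative assumptions, not Sockeye's actual on-disk schema:

```python
import torch

# Illustrative tensors only.
source_tokens = torch.randint(4, 100, (8, 20))
target_tokens = torch.randint(4, 100, (8, 22))
prepend_len = torch.randint(0, 5, (8,))

shard = {
    "version": 7,
    "source": source_tokens,
    "target": target_tokens,
    "prepended_length": prepend_len,  # new field in version 7
}
torch.save(shard, "shard.0000.pt")

# A loader can fall back gracefully for version-6 shards,
# which lack the prepended-length field:
loaded = torch.load("shard.0000.pt")
lengths = loaded.get("prepended_length")
if lengths is None:
    # Version 6: treat every sample as having no prepended tokens.
    lengths = torch.zeros(loaded["source"].shape[0], dtype=torch.long)
```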

Pull Request Checklist

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.