auspicious3000 closed this issue 4 years ago
The decoder is supported. As an illustration, it is used in a translation application; see this repo for details: https://github.com/TurboNLP/Translate-Demo. For decoder inference, the speedup of TurboTransformers over PyTorch ranges from 1.85x to 2.51x; see the detailed graph in Fig. 10 of https://arxiv.org/pdf/2010.05680.pdf
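For context, a minimal sketch of what the swap looks like in practice, following the `from_onmt` conversion pattern that TurboTransformers' decoder modules expose for OpenNMT-py. The exact converter name and call signature below are assumptions on my part, so verify them against the Translate-Demo repo and docs/decoder.md before relying on this:

```python
import torch
import onmt.modules  # OpenNMT-py
import turbo_transformers

# Build a plain OpenNMT-py multi-headed attention block, as used inside
# an "Attention Is All You Need"-style transformer decoder.
onmt_attn = onmt.modules.MultiHeadedAttention(head_count=8, model_dim=512)
onmt_attn.eval()

# Assumption: TurboTransformers converts OpenNMT-py decoder sub-modules via
# a from_onmt() helper (mirroring its from_torch() pattern for BERT). Check
# https://github.com/TurboNLP/Translate-Demo for the authoritative usage.
turbo_attn = turbo_transformers.MultiHeadedAttention.from_onmt(onmt_attn)

# Run the converted module on the same inputs the OpenNMT module would take;
# the (key, value, query) calling convention follows OpenNMT-py.
q = torch.rand(2, 10, 512)  # (batch, seq_len, model_dim)
with torch.no_grad():
    out, _ = turbo_attn(q, q, q, attn_type="self")
```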
The documentation indicates that the transformer decoder from OpenNMT is supported. However, the decoder benchmark page https://github.com/Tencent/TurboTransformers/blob/master/docs/decoder.md says "We are still working on decoder model optimization." Is acceleration of a plain transformer decoder supported at this time, and if so, how large is the performance gain? I am not using a standard model such as BERT or GPT, but a plain decoder similar to the one in "Attention Is All You Need". Thanks!