Closed · dave-rtzr closed this issue 2 years ago
Hi,
It seems like the lightseq Triton backend does not support decoder-only transformer models.
I want to feed Triton with tensors generated by encoder layers from other backends (such as ONNX) and decode them with the lightseq decoder.
I am doing this because I want to use other kinds of encoder layers that lightseq does not support.
Is there a way to run a pure decoder (with encoder-decoder cross-attention available, unlike GPT-2) on Triton servers?
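For reference, the setup I have in mind would look roughly like a Triton ensemble that pipes the ONNX encoder's hidden states into a separate decoder model. This is only a sketch of the intended wiring; the model names (`onnx_encoder`, `lightseq_decoder`) and tensor names are hypothetical, since lightseq does not currently expose a standalone decoder model:

```
# Hypothetical ensemble config (config.pbtxt) chaining an ONNX encoder
# with a standalone decoder. All model/tensor names are illustrative,
# not real lightseq identifiers.
name: "encoder_decoder_ensemble"
platform: "ensemble"
max_batch_size: 8
input [
  { name: "SRC_TOKENS", data_type: TYPE_INT32, dims: [ -1 ] }
]
output [
  { name: "TGT_TOKENS", data_type: TYPE_INT32, dims: [ -1 ] }
]
ensemble_scheduling {
  step [
    {
      # Step 1: encoder runs on the ONNX Runtime backend.
      model_name: "onnx_encoder"
      model_version: -1
      input_map  { key: "input_ids"   value: "SRC_TOKENS" }
      output_map { key: "encoder_out" value: "ENC_HIDDEN" }
    },
    {
      # Step 2: decoder consumes the encoder hidden states
      # via cross-attention and emits output token ids.
      model_name: "lightseq_decoder"
      model_version: -1
      input_map  { key: "encoder_states" value: "ENC_HIDDEN" }
      output_map { key: "output_ids"     value: "TGT_TOKENS" }
    }
  ]
}
```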
Sorry, but we have no plans to support these models in the near future. The situation you describe would require deep code modifications.