Serve DeBERTa using FasterTransformer in Triton

NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT

Apache License 2.0

Open sfc-gh-zhwang opened 1 year ago

sfc-gh-zhwang commented 1 year ago

Hi, is there a tutorial we can refer to for serving a DeBERTa model using FasterTransformer in Triton? I think the steps would be:

However, I only see step 1 covered, and only with a TensorFlow example.

sfc-gh-zhwang commented 1 year ago