triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

how to deploy BERT #7496

Open chenchunhui97 opened 3 months ago

chenchunhui97 commented 3 months ago

I want to deploy MacBERT, but I cannot find any helpful blog posts. Do you have instructions for deploying this model?

rmccorm4 commented 3 months ago

Hi @chenchunhui97, this example is pretty old, but may work: https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/LanguageModeling/BERT/triton/README.md
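
For reference, MacBERT uses the standard BERT architecture, so the usual Triton workflow should apply: export the checkpoint to ONNX (e.g. with `torch.onnx.export`), place it in a model repository, and add a `config.pbtxt` for the ONNX Runtime backend. Below is a minimal sketch, not a tested deployment — the model name `macbert`, the tensor names, and the hidden size 768 are assumptions that must match the names and shapes in your exported graph:

```
model_repository/
└── macbert/
    ├── 1/
    │   └── model.onnx
    └── config.pbtxt
```

```
# config.pbtxt — sketch for a BERT-style encoder on the ONNX Runtime backend.
# Tensor names/dims below are assumptions; verify them against the exported model.
name: "macbert"
platform: "onnxruntime_onnx"
max_batch_size: 8

# With max_batch_size > 0, the batch dimension is implicit and omitted from dims.
input [
  {
    name: "input_ids"
    data_type: TYPE_INT64
    dims: [ -1 ]        # variable sequence length
  },
  {
    name: "attention_mask"
    data_type: TYPE_INT64
    dims: [ -1 ]
  },
  {
    name: "token_type_ids"
    data_type: TYPE_INT64
    dims: [ -1 ]
  }
]
output [
  {
    name: "last_hidden_state"
    data_type: TYPE_FP32
    dims: [ -1, 768 ]   # 768 assumes a base-size encoder
  }
]
```

Note that tokenization is not part of the served model: the client must tokenize text and send the integer tensors, or you can wrap tokenization in a Python-backend model and chain the two with an ensemble. Start the server with `tritonserver --model-repository=<path>` and check that the model reports READY.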