triton-inference-server / client

Triton Python, C++ and Java client libraries, and gRPC-generated client examples for Go, Java and Scala.
BSD 3-Clause "New" or "Revised" License

Document how to profile embeddings models #717

Closed: dyastremsky closed this issue 1 week ago

dyastremsky commented 1 week ago

Document a quick example of how to run an embedding model on an OpenAI Embedding API-compatible server.
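
For illustration only, here is a minimal Python sketch of what such an example could look like. It is not the documentation this issue produced. It assumes the server exposes the OpenAI-style `POST /v1/embeddings` route and returns a JSON body with a `data` list holding one entry per input. The base URL, the model name, and the helper names (`build_embeddings_request`, `send`, `profile`) are all hypothetical placeholders.

```python
import json
import time
import urllib.request

# Assumed values: replace with the address and model actually deployed.
BASE_URL = "http://localhost:8000"
MODEL = "my-embedding-model"


def build_embeddings_request(texts, model=MODEL, base_url=BASE_URL):
    """Build a POST request for the OpenAI-style /v1/embeddings route."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def profile(texts, n_requests=10, sender=send):
    """Send the same batch n_requests times and summarize latency in seconds.

    `sender` is injectable so the timing loop can be exercised without a server.
    """
    latencies = []
    for _ in range(n_requests):
        request = build_embeddings_request(texts)
        start = time.perf_counter()
        reply = sender(request)
        latencies.append(time.perf_counter() - start)
        # Assumes the reply carries one embedding per input text.
        if len(reply["data"]) != len(texts):
            raise ValueError("unexpected number of embeddings in reply")
    return {
        "requests": n_requests,
        "min_s": min(latencies),
        "mean_s": sum(latencies) / len(latencies),
        "max_s": max(latencies),
    }
```

With a server running, `profile(["hello world", "triton client"])` would return a small dictionary of latency statistics. This sequential loop measures only per-request latency from a single client, so it is a rough sanity check rather than a substitute for a dedicated load-generation tool.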