visitsb opened 5 months ago
@visitsb It's fine to add /v1/models. But the full OpenAI API surface is long, e.g. /v1/audio, /v1/embeddings. What is the minimal subset needed?
@npuichigo Thanks for the quick reply!
Are you able to add the routes below? Looking at open-webui's implementation, at minimum:
/v1/models
/v1/chat/completions
/v1/embeddings
/v1/audio/speech
/v1/audio/transcriptions
I wish there were an easier way to provide full compatibility, but perhaps sometime in the future.
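For anyone wiring up a frontend before fuller support lands, the /v1/models route can be stubbed with just the Python standard library. This is only a sketch of the OpenAI list-models response shape; the model id "ensemble" and the port are illustrative assumptions, not anything trtllm or Triton actually configures:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Routes open-webui probes at minimum (per the list above).
MINIMAL_ROUTES = [
    "/v1/models",
    "/v1/chat/completions",
    "/v1/embeddings",
    "/v1/audio/speech",
    "/v1/audio/transcriptions",
]

class StubHandler(BaseHTTPRequestHandler):
    """Answers /v1/models in the OpenAI list format; 404 elsewhere."""

    def do_GET(self):
        if self.path == "/v1/models":
            # "ensemble" is a placeholder model id, not a real deployment.
            body = json.dumps({
                "object": "list",
                "data": [{"id": "ensemble", "object": "model",
                          "owned_by": "triton"}],
            }).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        # Silence per-request logging for cleaner output.
        pass

def serve_once(port=8901):
    """Start the stub server on localhost in a background thread."""
    server = HTTPServer(("127.0.0.1", port), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

if __name__ == "__main__":
    srv = serve_once()
    with urlopen("http://127.0.0.1:8901/v1/models") as resp:
        print(json.loads(resp.read())["data"][0]["id"])
    srv.shutdown()
```

A frontend like open-webui only needs the listed routes to respond in these shapes; the ones backed by models the server does not host could return 404 or an empty list.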
The exposed API depends on the actual model hosted in the Triton backend. Since there is no embedding model available in trtllm, /v1/embeddings is not possible. For embedding models, you could refer to https://github.com/huggingface/text-embeddings-inference instead. The same reasoning applies to /v1/audio/* since no ASR or TTS models are available in trtllm yet.
@npuichigo I am trying to use Triton Inference Server with the TensorRT-LLM backend and open-webui as the frontend, but not all routes are provided, e.g. /v1/models. Is there any plan to support all OpenAI API v1 routes?
Full OpenAI API support would be really great, since KServe support is still in the works.