-
**Use OpenAI-compatible servers**
A lot of recent frameworks (llama.cpp, vLLM, and others) make their models available through an OpenAI-compatible API.
I think it would be awesome if we could us…
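For context, a minimal sketch of what this would enable, assuming a local server exposing an OpenAI-compatible endpoint; the base URL, API key, and model name below are placeholders, not values from this issue:

```python
from openai import OpenAI

# Point the standard openai client at a local OpenAI-compatible server
# (e.g. llama.cpp's llama-server or vLLM). URL, key, and model name
# are placeholders.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```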
-
Hi!
It's great to be able to work with an API like OpenAI's, but some people, like students, don't necessarily have money to spend on these services. Mistral is one of the rare firms that offers, …
-
### Feature request
Multi-LoRA support in TGI has been available since 2.0.6, but it is not compatible with the Messages API when using the openai package.
### Motivation
The openai chat completion approa…
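As a hedged sketch of what the request implies: some OpenAI-compatible servers (vLLM, for instance) let a served LoRA adapter be selected simply by naming it in the `model` field. The URL and adapter name below are hypothetical, and this is exactly the call TGI's Messages API does not currently accept:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/v1", api_key="-")

# Hypothetical: select a served LoRA adapter by name via the `model`
# field, the way vLLM exposes adapters. Per this issue, TGI's Messages
# API does not support this yet.
response = client.chat.completions.create(
    model="my-lora-adapter",
    messages=[{"role": "user", "content": "Hi"}],
)
```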
-
### Which API Provider are you using?
OpenAI Compatible
### Which Model are you using?
o1-mini
### What happened?
o1 series model available
### Steps to reproduce
![image](https://github.com/us…
-
**Issue Title:**
Sample code fails with `APIRemovedInV1` error due to OpenAI API changes in `openai` package version >=1.0.0
**Issue Body:**
**Description:**
While running the sample code provided i…
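For reference, the migration the error points to looks roughly like this; the model name is a placeholder:

```python
from openai import OpenAI

# Pre-1.0 style, removed in openai>=1.0.0 and now raising APIRemovedInV1:
#   import openai
#   openai.ChatCompletion.create(model=..., messages=...)

# Post-1.0 style:
client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```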
-
At the moment, the inference APIs (`chatComplete` and `output`) don't provide any way to cancel a running request/call.
Technically, the genAI stack connectors all support passing …
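To illustrate the kind of cancellation hook being asked for, here is a minimal sketch using the async openai client, where the in-flight call is wrapped in a task the caller can cancel. The endpoint and model are placeholders, and this is not the connectors' actual API:

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="-")

async def main() -> None:
    # Wrap the in-flight completion in a task so the caller can cancel it.
    task = asyncio.create_task(
        client.chat.completions.create(
            model="placeholder-model",
            messages=[{"role": "user", "content": "Hello"}],
        )
    )
    await asyncio.sleep(0.1)
    task.cancel()  # abort the running request
    try:
        await task
    except asyncio.CancelledError:
        print("request cancelled")

asyncio.run(main())
```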
-
It would be extremely convenient to have such an API, so that a lot of existing tools could be used.
-
### Which API Provider are you using?
OpenAI Compatible
### Which Model are you using?
Claude 3.5 Sonnet
### What happened?
![PixPin_2024-11-08_11-38-36](https://github.com/user-attachments/asset…
-
## What would you like to be added:
## Why is this needed:
## Anything else we need to know?
-
### What is the issue?
The streamed chat-completion response from ollama's OpenAI-compatible API repeats `"role": "assistant"` in every returned chunk. This is different from OpenAI's API, which just has…
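A quick way to observe the difference, as a sketch against ollama's OpenAI-compatible endpoint (the model name is a placeholder): print the `delta.role` of every streamed chunk. On OpenAI's API, the role appears only in the first chunk's delta; per this issue, ollama repeats it in all of them.

```python
from openai import OpenAI

# ollama serves its OpenAI-compatible API under /v1; any api_key works.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

stream = client.chat.completions.create(
    model="llama3",  # placeholder model name
    messages=[{"role": "user", "content": "Hi"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    # OpenAI sets delta.role only in the first chunk; per this issue,
    # ollama sets it in every chunk.
    print(delta.role, repr(delta.content))
```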