-
**Is your feature request related to a problem? Please describe.**
We extend OpenAIChatGenerator to implement MistralChatGenerator. This works for chat completion but not for function calling. Mistral's funct…
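For reference, a minimal sketch of the wrapping pattern described above, assuming Haystack 2.x's `OpenAIChatGenerator` and Mistral's OpenAI-compatible endpoint; the class name, model name, and parameters here are illustrative, not the shipped integration:

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret


class MistralChatGenerator(OpenAIChatGenerator):
    """Thin wrapper that points the OpenAI-compatible client at Mistral's API."""

    def __init__(self, model: str = "mistral-small-latest", **kwargs):
        super().__init__(
            api_key=Secret.from_env_var("MISTRAL_API_KEY"),
            model=model,
            api_base_url="https://api.mistral.ai/v1",
            **kwargs,
        )


# Plain chat completion works through the inherited run() method;
# function calling is where the behaviour diverges, as noted above.
generator = MistralChatGenerator()
result = generator.run(messages=[ChatMessage.from_user("Say hello in one word.")])
print(result["replies"][0])
```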
-
Recently, a new version of Mistral was announced with a much larger 32K context:
https://www.reddit.com/r/LocalLLaMA/comments/1blzrfp/new_mistral_model_announced_7b_with_32k_context/
This mode…
-
### System Info
xinference, version 0.12.2.post1
CUDA Version: 12.4
### Running Xinference with Docker?
- [ ] docker
- [X] pip install
-
Hi Team,
I have just installed Wren with Ollama using this config:
```
LLM_PROVIDER=ollama_llm
GENERATION_MODEL=mistral-nemo:latest
EMBEDDER_PROVIDER=ollama_embedder
EMBEDDING_MODEL=mxbai-embed-…
```
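Since the rest of the report is cut off, a hedged sanity check that can be run first is to confirm Ollama actually serves the configured models, assuming Ollama's default HTTP API on localhost:11434:

```python
import requests

# Ollama's default HTTP endpoint; adjust if Ollama runs elsewhere (e.g. inside Docker).
OLLAMA_URL = "http://localhost:11434"

# /api/tags lists the locally pulled models; the names configured above
# (e.g. mistral-nemo:latest) should appear here, otherwise Wren's calls will fail.
tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10).json()
print([m["name"] for m in tags.get("models", [])])
```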
-
Hi, thanks again for the amazing work here! When I try to fine-tune the model with our sample data, I was able to initialize some parts of the training, but I got the following issue related to "cpu "i…
-
Hi,
My config: A770 + Ollama + Open WebUI + the intelanalytics/ipex-llm-inference-cpp-xpu:latest Docker image.
After 2-3 chat messages I get this error:
```
ollama_llama_server: /home/runner/_work/llm.cpp/llm.…
```
-
Hello,
I was seeing a warning while fine-tuning Mistral and tracked it to this line:
https://github.com/huggingface/alignment-handbook/blob/main/src/alignment/model_utils.py#L71
Because Mistral's…
-
### Your current environment
[My Environment](https://github.com/vllm-project/vllm/files/14937936/env.txt)
The OpenAI API server was launched using this command:
```
VLLM_WORKER_MULTIPROC_METHOD=spawn VLLM_NCC…
```
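(The launch command is cut off above.) For reference, a hedged sketch of how such a vLLM OpenAI-compatible server is typically queried, assuming the default port 8000 and a placeholder model name:

```python
from openai import OpenAI

# vLLM's OpenAI-compatible server listens on port 8000 by default; the API key
# is unused unless --api-key was passed at launch.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # placeholder; use the actually served model
    messages=[{"role": "user", "content": "Hello"}],
)
print(completion.choices[0].message.content)
```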
-
Thank you for your excellent work! I'd like to ask about the differences among the various LLM templates. Specifically, how should I code a customized template for different LLMs, such as Mixtral-8x7B…
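For what it's worth, a minimal sketch of the Mixtral-Instruct prompt layout (the `[INST] ... [/INST]` format Mistral documents); this is only an illustration of what a customized template has to reproduce, not this project's template API, and spacing details can differ between model revisions:

```python
# Hedged sketch of the Mixtral/Mistral-Instruct layout: BOS once at the start,
# each user turn wrapped in [INST] ... [/INST], each assistant turn closed with </s>.
def build_mixtral_prompt(messages):
    """messages: list of {"role": "user"|"assistant", "content": str}, alternating."""
    prompt = "<s>"
    for msg in messages:
        if msg["role"] == "user":
            prompt += f"[INST] {msg['content']} [/INST]"
        else:
            prompt += f" {msg['content']}</s>"
    return prompt


demo = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi, how can I help?"},
    {"role": "user", "content": "Summarize this issue."},
]
print(build_mixtral_prompt(demo))
```

A quick way to cross-check the exact spacing is to render the model's own template with `tokenizer.apply_chat_template(messages, tokenize=False)` and compare the output.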
-
### Describe the issue
This issue is a place to discuss the impact of not being able to rely on the `name` field on messages, and existing or proposed solutions to cater for this.
---
The `n…
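Since the rest of the description is cut off above, here is a hedged sketch of the usual workaround when a backend ignores or rejects the OpenAI-style `name` field: fold the name into the message content before sending (`strip_name_field` below is an illustrative helper, not an existing API):

```python
# OpenAI-style chat messages can carry a "name" field, but some OpenAI-compatible
# backends reject or silently drop it. One common fallback is to preserve the
# speaker identity inside the content itself.
def strip_name_field(messages):
    safe = []
    for msg in messages:
        msg = dict(msg)
        name = msg.pop("name", None)
        if name:
            msg["content"] = f"{name}: {msg['content']}"
        safe.append(msg)
    return safe


messages = [
    {"role": "user", "name": "alice", "content": "What's the plan?"},
    {"role": "user", "name": "bob", "content": "I suggest we start with tests."},
]
print(strip_name_field(messages))
```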