-
### Your current environment
vLLM version: 0.5.4
GPU: 24 GB memory
### 🐛 Describe the bug
```bash
CUDA_VISIBLE_DEVICES=0 vllm serve mistralai/Mistral-7B-Instruct-v0.3 --api-key yyy --port 1…
```
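For context, a typical client call against an OpenAI-compatible server started this way; the port in the base URL is a placeholder (the actual `--port` value is truncated above), and `yyy` is the `--api-key` from the command:

```python
from openai import OpenAI

# Placeholder port; replace with the port actually passed to `vllm serve`.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="yyy")

resp = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.3",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=32,
)
print(resp.choices[0].message.content)
```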
-
# Model Parameter Support
Hello everyone, the PaddleNLP team has compiled detailed information on the parameters of each model here for your convenience.
## Model Parameters
### Base Models
| Model | 0.5B | 1~2B | 3~4B | 6~8B | 13~14B | 30~32B | 50~60B | 65~72B | 110B | >110B |
|:---------:|:--…
-
Getting the following error when running `PGPT_PROFILES=ollama make run` after a fresh install (no cache).
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://hugg…
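One hedged way to check whether the Hugging Face token available locally can actually see a gated repo; the token and repo id below are placeholders, since the repo named in the error is truncated:

```python
from huggingface_hub import login, model_info

# Placeholder token and repo id; substitute the real values from the error message.
login(token="hf_xxx")

# If access was granted, this returns repo metadata; otherwise it raises a GatedRepoError.
info = model_info("mistralai/Mistral-7B-Instruct-v0.2")
print("access OK:", info.id)
```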
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
I'm using a Mistral model and want to train only on responses. `train_on_responses_only` is supposed to mask only the user prompt; however, the following code masks **both** the user and assistant mes…
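The code in question is truncated above; for reference, a minimal sketch of how `train_on_responses_only` is normally wired into an SFT run. The model name, toy dataset, and trainer arguments are illustrative, and the argument layout follows the trl versions the unsloth notebooks used at the time (newer trl moves `dataset_text_field`/`max_seq_length` into `SFTConfig`):

```python
from datasets import Dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel
from unsloth.chat_templates import train_on_responses_only

# Illustrative model and a toy dataset already formatted with the Mistral chat template.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-instruct-v0.2",
    max_seq_length = 2048,
    load_in_4bit = True,
)
dataset = Dataset.from_dict({"text": ["<s>[INST] What is 2 + 2? [/INST] 4</s>"]})

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(output_dir = "outputs", per_device_train_batch_size = 1, max_steps = 1),
)

# Everything up to and including the instruction marker should end up masked (-100)
# so that only the assistant answer contributes to the loss.
trainer = train_on_responses_only(
    trainer,
    instruction_part = "[INST]",
    response_part = "[/INST]",
)
```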
-
When I tried to load `unsloth/mistral-7b-instruct-v0.2` with the code below, I got an error:
```
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-instruct-v0.2", …
```
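For comparison, a minimal sketch of how this call is usually written in full; the argument values are typical defaults, not taken from the truncated report above:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "unsloth/mistral-7b-instruct-v0.2",
    max_seq_length = 2048,
    dtype          = None,   # let unsloth pick float16/bfloat16
    load_in_4bit   = True,   # 4-bit weights to fit a 7B model in limited VRAM
)

# Quick sanity check that the model loads and generates.
FastLanguageModel.for_inference(model)
inputs = tokenizer("[INST] Hello [/INST]", return_tensors = "pt").to("cuda")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens = 16)[0]))
```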
-
### Your current environment
The output of `python collect_env.py`
```text
Your output of `python collect_env.py` here
```
### 🐛 Describe the bug
Some models like mistralai/Mistr…
-
### Describe the bug
When the max_tokens parameter is None, the agent sends a /v1/chat/completions request with `max_tokens: null`.
In this case the LLM doesn't understand it and stops after the second tok…
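For illustration, a minimal sketch of the payload difference (model name and message are placeholders): the field serialized as `null`, which triggers the behaviour described above, versus simply omitting it when no limit is set:

```python
import json

# Payload the agent sends today: max_tokens is serialized as null.
payload = {
    "model": "mistral",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": None,
}
print(json.dumps(payload))  # ... "max_tokens": null

# Expected shape when no limit is set: drop the key instead of sending null.
print(json.dumps({k: v for k, v in payload.items() if v is not None}))
```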
-
**Title:** Evaluation Code Produces Identical Results with Different Caching Methods
**Description:**
It seems the evaluation code produces the same results regardless of the caching method. I used…
-
**Bug description**
When I create a RAG Application with SimilaritySearch, the search returns similar documents when using Azure OpenAI, but always returns zero documents with Ollama. The issue occur…
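The exact stack is not named in the truncated report; as a minimal sketch of the kind of check that narrows this down, here is a LangChain-style similarity search against an in-memory FAISS index using Ollama embeddings (embedding model and texts are placeholders):

```python
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

# Placeholder embedding model; requires a local Ollama server with it pulled.
embeddings = OllamaEmbeddings(model="nomic-embed-text")

store = FAISS.from_texts(
    ["vLLM serves LLMs over an OpenAI-compatible API.", "Ollama runs models locally."],
    embeddings,
)

# If this also returns nothing, the embeddings (not the retriever wiring) are the first suspect.
print(store.similarity_search("local model runner", k=2))
```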