-
Hi all, thanks to the community effort, LLamaSharp now has much richer features than it did at the beginning. Meanwhile, the distribution of the backend packages may change soon. Therefore I think it's time to …
-
## cli command
```bash
CUDA_VISIBLE_DEVICES=5 xtuner train internlm_chat_7b_qlora_colorist_e5.py
```
## content of `internlm_chat_7b_qlora_colorist_e5.py`
```python
# Copyright (c) OpenMMLab. …
-
I am trying to use Ollama with deepseek-coder-v2:16b. I get a status 400 error when I make an API request against the model. I also tested codegemma:7b-instruct-v1.1-q8_0, and am getting the same erro…
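For comparison, a minimal well-formed request against Ollama's `/api/chat` endpoint can be sketched as below; a 400 typically means a malformed request body or a model tag the server doesn't have pulled. The host/port are Ollama's defaults, and the prompt text is a placeholder:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default Ollama address (assumed)

def build_chat_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": model,  # must exactly match a pulled tag, e.g. "deepseek-coder-v2:16b"
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # request a single JSON response instead of a stream
    }

def send_chat(model: str, prompt: str) -> dict:
    body = json.dumps(build_chat_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    # urlopen raises HTTPError on a 400, which surfaces the server's error body
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    print(send_chat("deepseek-coder-v2:16b", "Write a hello-world in Go."))
```

If this shape also returns a 400, the server logs usually contain the specific validation error.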
-
Qwen model FP32-INT4 precision inference, input token size 2500.
Two issues were found:
1) Output tokens repeat
2) A "Native API failed" error is reported when running the same command a second time
Platform…
-
Here is my launch command:
python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-VL-7B-Instruct --model /home/wangll/llm/model_download_demo/models/Qwen/Qwen2-VL-7B-Instruct
Below is the error message:
INFO 09-03 1…
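Once a server launched this way is up, it exposes an OpenAI-compatible API. A minimal client sketch follows, assuming vLLM's default port 8000 and the standard `/v1/chat/completions` route; the `model` field must match the `--served-model-name` passed above:

```python
import json
import urllib.request

VLLM_URL = "http://localhost:8000/v1/chat/completions"  # vLLM's default endpoint (assumed port)

def build_request(served_model_name: str, question: str) -> dict:
    # "model" must match the --served-model-name given to api_server,
    # not the filesystem path of the weights
    return {
        "model": served_model_name,
        "messages": [{"role": "user", "content": question}],
    }

def ask(question: str) -> str:
    body = json.dumps(build_request("Qwen2-VL-7B-Instruct", question)).encode("utf-8")
    req = urllib.request.Request(
        VLLM_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Describe yourself in one sentence."))
```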
-
### Model Series
Qwen2.5
### What are the models used?
qwen2.5-32b-instruct-gptq-int4 qwen2.5-32b-instruct-gptq-int8
### What is the scenario where the problem happened?
deployment with…
-
**Routine checks**
[//]: # (Delete the space inside the brackets and fill in an x)
+ [x] I have confirmed that there is no similar existing issue
+ [x] I have confirmed that I have upgraded to the latest version
+ [x] I have fully read the project README, especially the FAQ section
+ [x] I understand and am willing to follow up on this issue, helping with testing and providing feedback
+ [x] I understand and accept the above, and I understand that the maintainers' time is limited; **issues that do not follow the rules may…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
```
python -m pip list
Package Version Editable project location
--------…
-
Following `python3.9 -m vllm.entrypoints.openai.api_server --dtype auto --api-key $api_key --model $model_name_or_path --tensor-parallel-size 8`, I deployed a reward model trained from qwen2, but requests to the API return generated reply text. What parameters should I set so that I only get the reward score?
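The chat-completions endpoint is generation-oriented, so it will always return text rather than a scalar. One way to obtain just the score is to load the checkpoint directly with `transformers`. This is only a sketch under the assumption that the reward model was saved with a sequence-classification (value) head; the model path and input formatting are placeholders, not the reporter's actual setup:

```python
def reward_score(model_path: str, prompt: str, response: str) -> float:
    """Score a (prompt, response) pair with a sequence-classification reward model.

    Assumes the checkpoint at model_path carries a single-logit classification
    head; imports are deferred so the sketch itself has no hard dependencies.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_path, num_labels=1, torch_dtype=torch.bfloat16
    )
    # Placeholder formatting: a real reward model expects its training-time
    # chat template here, not plain concatenation.
    inputs = tokenizer(prompt + response, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape (1, 1): the scalar reward
    return logits.item()
```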
-
### Your current environment
vllm==0.6.1
### Model Input Dumps
When I train Medusa, the medusa0, medusa1, and medusa2 heads reach 0.95 accuracy, so the training result is fine,
but when I try to deploy Medusa with vLLM, the deployment is…