-
**Describe the bug**
Regarding the `do_sample` argument: whether I pass `--do_sample True` or `--do_sample true`, the log still reports: `do_sample` is set to `False`. However, `temperature` is set to `0.0`
swift infer \
--model_type qwen2-vl-2…
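For what it's worth, this symptom often comes from a boolean CLI flag that never converts its string argument: `bool("False")` is `True` in Python, and some parsers leave the default untouched. Below is a minimal argparse sketch of a converter that would accept both spellings (`str2bool` is an illustrative helper, not part of swift):
```python
import argparse

def str2bool(value: str) -> bool:
    # Accept common spellings so both "--do_sample True" and "--do_sample true" work.
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")

parser = argparse.ArgumentParser()
# type=bool would be a bug here: any non-empty string is truthy, so bool("False") == True.
parser.add_argument("--do_sample", type=str2bool, default=False)

print(parser.parse_args(["--do_sample", "True"]).do_sample)   # True
print(parser.parse_args(["--do_sample", "true"]).do_sample)   # True
```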
-
### Your current environment
The output of `python collect_env.py`
```text
(base) root@DESKTOP-PEPA2G9:~# python collect_env.py
Collecting environment information...
/root/miniconda3/lib/py…
-
## Description
Hello, I am performing int8 quantization on a BERT-like embedding model. I noticed that after quantization, the inference speed is much slower than FP16, and the output of the t…
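To make the FP16/INT8 comparison concrete, here is a minimal timing-harness sketch, assuming a BERT-like encoder loadable through transformers (the model name and batch contents are placeholders; the int8 model, however it was produced, can be passed to the same `bench` helper as long as it keeps the usual forward signature):
```python
import time
import torch
from transformers import AutoModel, AutoTokenizer

name = "bert-base-uncased"  # placeholder for the actual embedding model
tok = AutoTokenizer.from_pretrained(name)
batch = tok(["a short benchmark sentence"] * 32, return_tensors="pt", padding=True).to("cuda")

def bench(model, iters=50):
    model.eval()
    with torch.inference_mode():
        for _ in range(5):               # warmup
            model(**batch)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(**batch)
        torch.cuda.synchronize()
        return (time.perf_counter() - start) / iters * 1e3  # ms per batch

fp16 = AutoModel.from_pretrained(name, torch_dtype=torch.float16).to("cuda")
print(f"fp16: {bench(fp16):.2f} ms/batch")
# time the int8 engine with the same helper for a like-for-like number
```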
-
After opening an image directory and enabling AI auto-labeling, switching to a different model leaves the previous model resident in memory; each switch leaks more memory until it is exhausted and an error is raised.
![2](https://github.com/user-attachments/assets/1f508867-c967-47e6-9190-fe212e16aaef)
![1](https://github.com/user-attachments/assets/09f97ddb-2006-468d…
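The usual workaround in PyTorch-based tools is to drop every reference to the outgoing model before loading its replacement; a minimal sketch (names are illustrative, assuming `state` holds the only live reference):
```python
import gc
import torch

def switch_model(state: dict, load_new):
    # state["model"] is assumed to be the only live reference to the running model.
    # Release it first; otherwise each switch leaks the previous weights until
    # RAM/VRAM is exhausted.
    state["model"] = None
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    state["model"] = load_new()
```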
-
### 🐛 Describe the bug
The following script results in an error when run with vllm [0.6.1-post2](https://github.com/vllm-project/vllm/releases/tag/v0.6.1.post2) and PyTorch 2.4.
The model is usi…
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing answer for this in the FAQ? …
-
model: qwen1.5 14b chat
auto_gptq: 0.8.0dev+cu118, 0.7.0dev+cu118
quantization code:
```python
quantize_config = BaseQuantizeConfig(
bits=4, # quantize model to 4-bit
group_size=128, # it is rec…
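For reference, a minimal end-to-end sketch of how such a config is typically used with AutoGPTQ (the calibration text, checkpoint name, and output directory are placeholders):
```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained = "Qwen/Qwen1.5-14B-Chat"          # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(pretrained)
quantize_config = BaseQuantizeConfig(bits=4, group_size=128)

# Calibration samples: a real run needs a few hundred representative texts.
examples = [tokenizer("auto-gptq calibration sample text.", return_tensors="pt")]

model = AutoGPTQForCausalLM.from_pretrained(pretrained, quantize_config)
model.quantize(examples)
model.save_quantized("qwen1.5-14b-chat-gptq-4bit")
```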
-
Hello,
I downloaded the model (NCSOFT/Llama-3-OffsetBias-RM-8B) from Hugging Face
and then ran the command below:
```bash
pip install -r requirements.txt
```
and then:
```python
from module import VllmModule
…
-
### Search before asking
- [X] I had searched in the [issues](https://github.com/eosphoros-ai/DB-GPT/issues?q=is%3Aissue) and found no similar issues.
### Operating system information
Linux
### P…
-
Hello,
First of all, thank you for this amazing tool! I was wondering if there is any chance of integrating open-source LMMs such as https://huggingface.co/Qwen/Qwen2-VL-7B-…
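For reference, recent transformers releases (>= 4.45) can load Qwen2-VL directly; a minimal loading sketch, assuming the Qwen2-VL-7B-Instruct checkpoint:
```python
import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"  # assumed checkpoint, based on the linked model family
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)
```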