-
### Describe the bug
我们在压测xinference时候发现,V100 2卡,调用/v1/chat/completions接口,stream参数是True,模型用qwen-14b-chat,用jmeter10并发进行压测,压测1分钟xinference就挂了,如果stream是False,是可以的.
### 报错日志
```
2024-07-08 11:34:3…
-
if yes then how ? :-))
-
### System Info / 系統信息
Ubuntu 22.04.4 LTS
python 3.10
transformer 4.43.0
cuda 12.0
torch 2.3.0
vllm 0.4.3
### Running Xinference with Docker? / 是否使用 Docker 运行 Xinfernece?
- [ ] docker / docke…
-
**问题描述 / Problem Description**
用简洁明了的语言描述这个问题 / Describe the problem in a clear and concise manner.
**复现问题的步骤 / Steps to Reproduce**
1. 执行 '...' / Run '...'
2. 点击 '...' / Click '...'
3. 滚动到 '..…
-
### System Info / 系統信息
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:36:15_Pacific_Daylight_Time_2023
Cuda compilation tools, release 12.1…
fpy10 updated
2 weeks ago
-
### System Info / 系統信息
chrome==123.0.6312.59
### Running Xinference with Docker? / 是否使用 Docker 运行 Xinfernece?
- [X] docker / docker
- [ ] pip install / 通过 pip install 安装
- [ ] installation from sou…
-
### Describe the bug
ChatTTS部署成功,界面可以看到。
![ChatTTS部署成功界面图](https://github.com/xorbitsai/inference/assets/17266912/b4bc826a-7c08-44ee-aec0-59c7e8a643c6)
日志:
2024-07-04 00:42:25,572 xinferen…
-
**问题描述 / Problem Description**
使用RAG对话没有历史功能。
![微信截图_20241009212711](https://github.com/user-attachments/assets/9bc21dff-dfb7-4e98-a3f5-14a051f60295)
**复现问题的步骤 / Steps to Reproduce**
1. 执行 '...'…
-
### Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue y…
-
### System Info / 系統信息
Ubuntu18.04
python==3.10
### Running Xinference with Docker? / 是否使用 Docker 运行 Xinfernece?
- [ ] docker / docker
- [X] pip install / 通过 pip install 安装
- [ ] installatio…