xusenlinzy / api-for-open-llm
An OpenAI-style API for open large language models: use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. A unified backend interface for open-source large language models.
Apache License 2.0 · 2.36k stars · 270 forks
Issues
#315 · Deploying glm4-9b on 4×4090 GPUs: API calls from dify fail with an error · he498 · opened 1 month ago · 1 comment
#314 · Output is truncated under this framework's vllm backend, but not when the model is launched with official vllm or run with transformers · TLL1213 · opened 2 months ago · 2 comments
#313 · support cogvlm2 model · white2018 · closed 2 months ago · 0 comments
#312 · Support MiniMonkey model · santlchogva · closed 2 months ago · 0 comments
#311 · Running glm4v: requests fail with an error · 760485464 · opened 2 months ago · 2 comments
#310 · Error when running streamlit_app.py · louan1998 · closed 2 months ago · 0 comments
#309 · sglang backend not supported · colinsongf · opened 3 months ago · 0 comments
#308 · With TASKS=llm,rag, a threading error is raised: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method · syusama · opened 3 months ago · 3 comments (see the spawn-method sketch after this list)
#307 · Deploying gte-qwen2-1.5b-instruct: requests to the rerank endpoint fail · cowcomic · opened 3 months ago · 0 comments
#306 · vllm backend: support vision models (minicpm-v) · baisong666 · opened 3 months ago · 0 comments
#305 · Error when running in Docker: multiproc_worker_utils.py:226] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method · syusama · closed 3 months ago · 3 comments
#304 · python: can't open file '/workspace/api/server.py': [Errno 2] No such file or directory when deploying Qwen2-72B-Instruct-GPTQ-Int4 with docker-compose on Ubuntu · syusama · closed 3 months ago · 0 comments
#303 · Problem with the Qwen2-7B-Instruct model when using vllm · Empress7211 · closed 3 months ago · 3 comments
#302 · RuntimeError: CUDA error: device-side assert triggered · ChaoPeng13 · closed 3 months ago · 0 comments
#301 · 💡 [REQUEST] - Could you support China Telecom's TeleChat model? The pipeline runs end to end, but the response content gets truncated · Song345381185 · opened 4 months ago · 9 comments
#300 · 💡 [REQUEST] - Could you support China Telecom's TeleChat model? The pipeline runs end to end, but the response content gets truncated · Song345381185 · closed 4 months ago · 0 comments
#299 · doc chat raises FileNotFoundError: Table does not exist. Please first call db.create_table(, data) · Weiqiang-Li · closed 4 months ago · 1 comment
#298 · Update server.py Linux fix · Zeknes · opened 4 months ago · 0 comments
#297 · [embedding] Are the latest SOTA models unsupported? KeyError: 'Could not automatically map text2vec-base-multilingual to a tokeniser.' · ForgetThatNight · closed 4 months ago · 2 comments
#296 · llama3-8B talks to itself after answering and does not stop · yd9038074 · opened 4 months ago · 1 comment
#295 · Calling the chatglm4-chat endpoint from dify returns a 500 error (TypeError: object of type 'int' has no len()) · besthong999 · closed 4 months ago · 5 comments
#294 · fix glm4 chat forward error · AinzLimuru · closed 4 months ago · 1 comment
#293 · qwen2 inference error · wj1017090777 · closed 4 months ago · 10 comments
#292 · minicpm starts fine, but inference requests fail · 760485464 · opened 5 months ago · 2 comments
#291 · glm-4v starts normally, but inference requests fail · 760485464 · opened 5 months ago · 10 comments
#290 · Multi-GPU deployment of Qwen2-7B with api-for-open-llm & vllm errors out with GPU memory exhausted · Woiea · closed 4 months ago · 5 comments
#289 · change the best_of parameter of vllm chat_completion · Tendo33 · closed 5 months ago · 1 comment
#288 · glm4 cannot trigger tool use after being connected to dify · he498 · opened 5 months ago · 1 comment
#287 · Using streamer_v2 produces garbled output · Tendo33 · opened 5 months ago · 2 comments
"POST /v1/files HTTP/1.1" 404 Not Found
#286
KEAI404
closed
5 months ago
1
#285 · Inference with qwen2-72B-AWQ on the latest vllm image fails · Tendo33 · closed 5 months ago · 4 comments
#284 · Docker cannot download the image · xqinshan · closed 5 months ago · 1 comment
#283 · Docker deployment of the embedding endpoint fails: "POST /v1/embeddings HTTP/1.1" 404 Not Found · syusama · closed 5 months ago · 2 comments
#282 · API request fails: TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]] · syusama · closed 5 months ago · 9 comments
#281 · glm4 deployed with either the default or the vllm backend fails to stop generating; both streaming and non-streaming requests show the same problem. Has anyone else run into this? · LiuDQ-wm · closed 5 months ago · 14 comments
#280 · Cannot run instruction.py · NCCurry30 · opened 5 months ago · 0 comments
#279 · Inference error in vllm mode · yeehua-cn · closed 5 months ago · 2 comments
#278 · I have deployed many models; is there a web UI for calling all the deployed models from one place? · Tendo33 · closed 5 months ago · 3 comments
#277 · ProgrammingError when running SQL chat · songyao199681 · closed 5 months ago · 2 comments
#276 · support vllm 0.4.2 · FreeRotate · closed 6 months ago · 0 comments
#275 · support vllm==0.4.2 · FreeRotate · closed 6 months ago · 0 comments
#274 · vllm engine fails to start during local vllm deployment · Ruibn · closed 6 months ago · 4 comments
#273 · When will Qwen 1.5 function calling be fixed? · skyliwq · opened 6 months ago · 0 comments
#272 · How to enable streaming output when calling the model via http://127.0.0.1:8080/v1/chat/completions · 469981325 · closed 5 months ago · 1 comment (see the streaming sketch after this list)
#271 · Docker deployment of vllm returns 404 Not Found · skyliwq · closed 5 months ago · 12 comments
#270 · EMBEDDING_API_BASE is not picked up: str expected, not NoneType · chukangkang · closed 7 months ago · 5 comments
#269 · 💡 vllm now supports pipeline parallelism, which can greatly increase throughput; could the author add vllm pipeline parallel support? · CaptainLeezz · closed 7 months ago · 0 comments
#268 · vllm container dependency error · Tendo33 · closed 7 months ago · 1 comment
#267 · Update template.py · claudegpt · closed 7 months ago · 0 comments
#266 · llama3 does not stop answering after a question · gptcod · closed 7 months ago · 2 comments
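Several issues above (#308, #305) hit the same CUDA fork error. The fix the traceback itself names is to start GPU worker processes with the 'spawn' method. Below is a minimal, generic Python sketch of that pattern; it assumes PyTorch is installed and is not the project's own worker code.

```python
# Minimal sketch of the 'spawn' workaround named in the #308/#305 tracebacks.
# CUDA cannot be re-initialized in a fork()ed child process, so any worker
# that touches the GPU must be started with the "spawn" method instead.
import multiprocessing as mp

def gpu_worker() -> None:
    # Import torch inside the worker so CUDA initializes after the spawn.
    import torch
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)  # must run before any worker starts
    p = mp.Process(target=gpu_worker)
    p.start()
    p.join()
```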
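For questions like #272: since the project exposes an OpenAI-style API, streaming is normally requested by setting stream=True on the chat completion call. A minimal sketch with the openai Python client follows; the base_url, api_key, and model name are assumptions for a local deployment, not verified project defaults.

```python
# Hedged sketch for #272: stream tokens from an OpenAI-compatible endpoint.
from openai import OpenAI

# base_url/api_key are assumptions for a local deployment of this project.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="none")

stream = client.chat.completions.create(
    model="qwen2-7b-instruct",  # hypothetical model name; use your deployed one
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,  # ask the server for incremental chunks
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```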