-
I'm getting the following error when using the vLLM template:
An error occurred: The checkpoint you are trying to load has model type qwen2_vl but Transformers does not recognize this architecture. T…
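This error usually means the installed Transformers release predates support for the model type. As a sketch (assuming qwen2_vl support landed in Transformers v4.45.0, which is worth verifying against the release notes), you can check the installed version before loading the checkpoint:

```python
# Sketch: check whether the installed Transformers release is new enough to
# register the qwen2_vl architecture. Assumption (verify against the
# Transformers release notes): Qwen2-VL support landed in v4.45.0; older
# releases raise the "does not recognize this architecture" error.
from importlib.metadata import PackageNotFoundError, version

MIN_QWEN2_VL = (4, 45, 0)  # assumed first release registering qwen2_vl


def parse(v: str) -> tuple:
    """Turn a version string like '4.44.2' into a comparable tuple."""
    return tuple(int(p) for p in v.split(".")[:3])


def supports_qwen2_vl() -> bool:
    """True if the installed transformers should recognize qwen2_vl."""
    try:
        return parse(version("transformers")) >= MIN_QWEN2_VL
    except PackageNotFoundError:
        return False
```

If the check fails, upgrading Transformers (e.g. `pip install -U transformers`) typically resolves the error.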
-
## Why do you need it?
contextCache can dramatically improve response latency for long contexts.
A RAG-capable service can be provided quickly by configuring the Higress AI proxy plugin, and costs can be reduced further with Higress's AI Cache plugin.
## How could it be?
Support configuring a cacheId; the qwen fileId implementation can serve as a reference.
…
-
As the title says, I installed conda following the official manual and set up two environments: one dedicated to running the embedding model bge-large-zh-v1.5 with xinference, and one for the chatchat environment. I'm installing on CentOS 7 and haven't gotten it running yet. I'm not sure how to configure the LLM in model_settings.yml; the official docs mention qwen1.5-chat. Do I need to register on the Tongyi Qianwen website to get the API URL and app key? This…
-
After I started the KAG service in Docker, I uploaded a knowledge base file in TXT format and started a conversation in the new query dialog. But no matter what question I asked, the answer returned w…
-
### Self Checks
- [X] This is only for bug report, if you would like to ask a question, please head to [Discussions](https://github.com/langgenius/dify/discussions/categories/general).
- [X] I have s…
-
### Anything you want to discuss about vllm.
When speculative decoding is turned on, the OpenAI API server fails on concurrent requests, raising `exceptiongroup.ExceptionGroup: unhandled errors …
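A minimal concurrency reproduction can be sketched as follows (assumptions: an OpenAI-compatible vLLM server is listening locally and was started with speculative decoding enabled; the URL and model name below are placeholders, not values from the report):

```python
# Hypothetical reproduction sketch: fire several completion requests at an
# OpenAI-compatible vLLM server concurrently. URL and model name are
# placeholders for illustration only.
import json
import threading
import urllib.request


def build_request(prompt: str, model: str = "my-model") -> bytes:
    """Serialize a minimal /v1/completions payload."""
    return json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": 32}
    ).encode()


def fire(url: str, body: bytes) -> None:
    """POST one completion request and drain the response."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()


def reproduce(url: str = "http://localhost:8000/v1/completions", n: int = 8) -> None:
    """Send n requests at once; with speculative decoding on, this is the
    kind of load that reportedly triggers the ExceptionGroup server-side."""
    threads = [
        threading.Thread(target=fire, args=(url, build_request(f"prompt {i}")))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Running `reproduce()` against such a server would exercise the concurrent path; with a single in-flight request the error reportedly does not occur.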
-
First, I successfully installed the required packages and started the 4 servers.
Then I successfully registered the Qwen model with
`sllm-cli deploy --model Qwen2-1.5B-Instruct`
Finally, I sent a request:
`
…
-
### 是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
- [X] 我已经搜索过已有的issues和讨论 | I have searched the existing issues / discussions
### 该问题是否在FAQ中有解答? | Is there an existing ans…
-
![image](https://github.com/lm-sys/FastChat/assets/40717349/e56498e8-8fb0-49f7-ac14-0b6bc4842c05)
![image](https://github.com/lm-sys/FastChat/assets/40717349/c51c328e-a7a8-471d-b821-65d0992cbc7c)
-
### Your current environment
```text
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A
OS: Ubuntu …