-
Deploying Qwen2-VL-7B-Instruct with vLLM: with prefix caching enabled, inference on image inputs fails with a shape-mismatch error. Prefix caching with text-only inputs does not error, and image inputs with prefix caching disabled do not error either.
Error:
File "/opt/miniforge3/envs/vllm-qwen2-vl/lib/python3.10/site-…
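For triage, the error reportedly appears only when an image is part of the request while prefix caching is on. The shape of such a request can be sketched with the stdlib alone; the port, endpoint, and image URL below are illustrative assumptions, not values from the report:

```python
import json
import urllib.request

def build_image_chat_payload(model: str, text: str, image_url: str) -> dict:
    """Build an OpenAI-style chat payload containing one image part.

    Requests of this shape (text + image_url content parts) are the
    ones reported to fail when prefix caching is enabled; text-only
    payloads reportedly work.
    """
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": text},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }

payload = build_image_chat_payload(
    "Qwen/Qwen2-VL-7B-Instruct",
    "Describe this image.",
    "https://example.com/cat.png",  # placeholder image URL (assumption)
)

# Hypothetical endpoint; adjust host/port to your deployment.
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # not executed here: needs a running server
```

Sending the same payload with the image part removed, or restarting the server without prefix caching, isolates which of the two conditions triggers the mismatch.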
-
When I replace the model name with an OpenAI assistant ID, the chat breaks.
I note that the code in index.d.ts > render contains a comment saying it supports OpenAI assistants.
`declare f…
-
### Your current environment
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: openSUSE Tumbleweed (x86_64)
GCC v…
```
-
### System Info
Code based on a forked version of trl.
Package Version
--------------------------------- -------------
accelerate 0.33.0
aiohapp…
-
Subscribe to this issue and stay notified about new [daily trending repos in Rust](https://github.com/trending/rust?since=daily)!
-
### Describe the issue as clearly as possible:
When using `models.llamacpp` and creating JSON using a Pydantic model I get an error when generating the first result (see code to reproduce below). I h…
-
**Describe the bug**
When using `streamText` with `ollama` and Vercel AI SDK, there seems to be an issue where previous messages are logged, instead of only the generated response.
I've tested thi…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch…
```
-
Hey, I'm trying to use my LLM on a vLLM server, which is exposed as an API.
Usually I create an OpenAI LLM instance with LangChain as below, and it works fine.
```
import openai
from langchain.llm…
-
Running: basic_demo# python trans_web_vision_demo.py
The error is as follows:
![image](https://github.com/user-attachments/assets/7717c601-bc13-4a24-b4b7-411128570628)
pydantic.errors.PydanticSchemaGenerationError: Unabl…