-
Hi all, thanks to the community's efforts, LLamaSharp now has much richer features than it did at the beginning. Meanwhile, the distribution of the backend packages may change soon. Therefore I think it's time to …
-
### Reproduction
```ts
// app/routes/api.chat.ts
import type {ActionFunctionArgs} from '@vercel/remix';
import {createOpenAI} from '@ai-sdk/openai';
import {streamText} from 'ai';
export con…
```
-
### 🐛 Describe the bug
When I load the model with `AutoLigerKernelForCausalLM`, I get `ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?)`.
When I load the model with Apply Model…
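This Triton error usually means the model weights or the input tensors are still on the CPU while a Triton kernel expects GPU pointers. A minimal sketch of the usual fix, aligning inputs to the model's device (the helper name `align_inputs_to` is hypothetical, not part of liger-kernel):

```python
import torch

# Hedged sketch: Triton kernels require GPU tensors, so the usual fix is to
# make sure the model and its inputs live on the same device before generate().
# `align_inputs_to` is a hypothetical helper, not a liger-kernel API.
def align_inputs_to(device: torch.device, inputs: dict) -> dict:
    """Move every tensor in `inputs` onto `device`."""
    return {k: v.to(device) for k, v in inputs.items()}

# Illustration with CPU tensors; on a real setup use torch.device("cuda")
# and the device the model was loaded onto.
inputs = {"input_ids": torch.tensor([[1, 2, 3]])}
aligned = align_inputs_to(torch.device("cpu"), inputs)
```

On a CUDA machine the same call with `model.device` ensures the pointers Triton receives are GPU pointers.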
-
I haven't found a relevant method or documentation for this; could you provide one? Thanks.
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
### Describe the bug
When using the llava-v1.6-34b model to deploy an OpenAI-compatible service…
-
This is my first time using it; can it work with Tongyi Qianwen (Qwen)? I didn't see an API example in the docs.
https://docs.deepwisdom.ai/main/zh/guide/get_started/configuration/llm_api_configuration.html#%E7%99%BE%E5%BA%A6-%E5%8D%83%E5%B8%86-api
-
### Motivation.
Currently, models like `llava-hf/llava-next-video*` recognize image and video inputs via different tokens and perform different computations. Therefore vLLM should provide new APIs and …
-
### What is the issue?
When calling the HTTP `api/generate` endpoint with `stream=False`, the request returns an HTTP 500 error after a fixed one-minute period.
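For reference, a non-streaming `/api/generate` request body looks like the sketch below (the model name and prompt are assumptions for illustration); a failure after exactly one minute usually points to a timeout on the server or an intermediate proxy rather than the payload itself:

```python
import json

# Hedged sketch of the non-streaming request body for Ollama's /api/generate.
# The model name and prompt below are assumptions for illustration.
def build_generate_request(model: str, prompt: str) -> str:
    """Serialize a /api/generate payload with streaming disabled."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False})

body = build_generate_request("llama3", "Why is the sky blue?")
# On a real host this body would be POSTed to
# http://localhost:11434/api/generate with any HTTP client.
```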
### OS
Linux
### GPU
Other
### CPU
AMD
### Ollama version
0.…
-
This is the command I ran:
python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-VL-7B-Instruct --model /home/wangll/llm/model_download_demo/models/Qwen/Qwen2-VL-7B-Instruct
Here is the error output:
INFO 09-03 1…
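For context, the command above starts an OpenAI-compatible server, which is normally queried with a multimodal chat payload like the sketch below (the port, prompt, and image URL are assumptions, not from the report):

```python
import json

# Hedged sketch of a multimodal chat/completions request for the
# OpenAI-compatible server started above. The prompt and image URL
# are placeholders; the default vLLM port is 8000.
def build_chat_request(model: str, text: str, image_url: str) -> str:
    """Serialize a chat payload pairing text with an image part."""
    return json.dumps({
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": text},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    })

body = build_chat_request(
    "Qwen2-VL-7B-Instruct",
    "Describe this image.",
    "https://example.com/cat.png",  # placeholder URL
)
# POST to http://localhost:8000/v1/chat/completions with `body`.
```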
-
## 🐛 Bug
## To Reproduce
Steps to reproduce the behavior:
1. Install latest mlc-llm and mlc-ai in conda with python 3.12, running on an Apple Silicon (M1 Pro) MacBook Pro with 32 GiB of R…