-
After successfully deploying some models with this Dockerfile:
```
FROM python:3.11
# It's good practice to update pip to ensure we can handle recent package specifications
RUN pip inst…
-
### System Info
CUDA==12.1
transformers==4.44.2
llama_cpp_python==0.2.90
vllm==0.6.1.post2
vllm-flash-attn==2.6.1
Python==3.10.14
Ubuntu==24.04
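As a minimal sketch (standard library only), a version report like the list above can be regenerated with `importlib.metadata`; the package names below are taken from that list, and missing packages are flagged rather than raising:

```python
# Print installed versions for the packages listed above, in the same
# "name==version" style; a missing package is reported instead of crashing.
import sys
from importlib.metadata import version, PackageNotFoundError

packages = ["transformers", "llama_cpp_python", "vllm", "vllm-flash-attn"]

print(f"Python=={sys.version.split()[0]}")
for name in packages:
    try:
        print(f"{name}=={version(name)}")
    except PackageNotFoundError:
        print(f"{name}: not installed")
```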
### Running Xinference wit…
-
When I enable flashinfer, it reports the following error:
```
Exception in ModelRpcClient:
Traceback (most recent call last):
File "/home/wst4sgh/playground/sine/.venv/lib/python3.10/site-packa…
-
### Feature Description
Right now the `render` method has the following function signature:
```
/**
* The model name to use. Must be OpenAI SDK compatible. Tools and Functions are only suppor…
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.5 LTS (x86_64)
GCC ve…
-
### System Info
- H100
### Who can help?
@kaiy
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported task in the `examples` fo…
-
### System Info
CUDA and system:
![image](https://github.com/user-attachments/assets/b36b052b-6c80-4d15-8b83-b078762466e8)
Python environment:
```
Package Version
------------------…
-
Using flashinfer 0.0.3 requires a one-line change (#282), but there is a compatibility issue: the same model runs fine on 0.0.2, while under 0.0.3 sglang throws the following error in an infinite loop:
```
Excepti…
-
@qeternity In PR #286 the Marlin kernel was merged, but when is it actually used?
I have tested a Marlin llama2 model (it works on vLLM), but it does not work on the latest sglang tip.
```
Traceback (most recent call l…
-
Which version of vLLM do you use? And does your vLLM setup use CUDA graphs?
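A small sketch of how both questions could be checked locally. The `enforce_eager=True` flag (which disables CUDA graph capture in vLLM's `LLM` constructor) is my assumption about the relevant knob; the guarded import keeps the snippet runnable even without vLLM installed:

```python
# Report the installed vLLM version, if any; degrades gracefully otherwise.
def describe_vllm() -> str:
    try:
        import vllm
        return f"vllm {vllm.__version__}"
    except ImportError:
        return "vllm not installed"

print(describe_vllm())

# Hypothetical usage (not executed here): run vLLM without CUDA graphs
# to rule out graph-capture issues.
# from vllm import LLM
# llm = LLM(model="<your-model>", enforce_eager=True)  # eager mode, no CUDA graphs
```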