Running vLLM according to the instructions. Docker segfaults at startup, so I'm running directly on the machine.
I start the server with the following shell script. As you can see, I've tried to turn max…
-
Environment:
Deployed a vLLM OpenAI API server based on Qwen2 72B.
Command:
llmuses perf --url 'http://127.0.0.1:8000/v1/chat/completions' --parallel 4 --model '/share/modelscope/hub/qwen/Qwen2-72B-Instruct-FP8' --log-eve…
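For context, a server launch matching the benchmark target above might look like the following. This is only a sketch: the tensor-parallel size and port are assumptions, and the model path is taken from the command line above.

```shell
# Hypothetical launch of the vLLM OpenAI-compatible server being benchmarked.
# --tensor-parallel-size 8 is an assumption for a 72B model; adjust to your GPUs.
python -m vllm.entrypoints.openai.api_server \
    --model /share/modelscope/hub/qwen/Qwen2-72B-Instruct-FP8 \
    --tensor-parallel-size 8 \
    --port 8000
```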
-
### Your current environment
Why does this line need the -1?
https://github.com/vllm-project/vllm/blob/main/vllm/core/block_manager_v1.py#L667
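I can't speak for that exact line, but a common reason for a `-1` in KV-cache block accounting is converting a token count into the index of the block that holds the last token. A minimal sketch (the function name here is hypothetical, not vLLM's):

```python
BLOCK_SIZE = 16  # tokens per KV-cache block (vLLM's default block size)

def last_block_index(num_tokens: int) -> int:
    # A sequence of exactly BLOCK_SIZE tokens fills one block, whose index is 0.
    # Without the -1, 16 // 16 == 1 would point past the last allocated block.
    return (num_tokens - 1) // BLOCK_SIZE
```

With 16 tokens this returns 0; with 17 tokens it returns 1, i.e. a second block is in play.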
### How would you like to use vllm
_No response_
-
### Your current environment
vllm==0.4.3
numpy==1.26.4
nvidia-nccl-cu12==2.20.5
torch==2.3.0
transformers==4.41.2
triton==2.3.0
### 🐛 Describe the bug
I don't know if this is a bug or …
-
Please add one or more params to control logging from the RESTful API server, namely in the `mii.serve()` function.
As a reference, see the `-log-` config params in vLLM: https://docs.vllm.ai/en/latest/servin…
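For comparison, the vLLM OpenAI-compatible server accepts flags like the following to quiet its request and stats logging (a sketch; the exact flag set varies by vLLM version, and the model name is just a placeholder):

```shell
# Quieting the vLLM OpenAI-compatible server's per-request and stats logs.
python -m vllm.entrypoints.openai.api_server \
    --model facebook/opt-125m \
    --disable-log-requests \
    --disable-log-stats
```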
-
### Anything you want to discuss about vllm.
For gptq_marlin, `min_thread_n=64 min_thread_k=64` is required in [https://github.com/vllm-project/vllm/blob/70c232f85a9e83421a4d9ca95e6384364271f2bc/csrc…
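If I read the constraint correctly, it means the GEMM dimensions must tile evenly into 64x64 thread chunks. A hypothetical shape check under that assumption (the function and parameter names are mine, not vLLM's):

```python
def marlin_shape_ok(n: int, k: int,
                    min_thread_n: int = 64, min_thread_k: int = 64) -> bool:
    # Assumption: the gptq_marlin kernel partitions the weight matrix into
    # min_thread_n x min_thread_k tiles, so both dims must divide evenly.
    return n % min_thread_n == 0 and k % min_thread_k == 0
```

Under this reading, a 4096x4096 layer is fine, while a dimension like 100 would be rejected.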
-
### Model/Pipeline/Scheduler description
Lumina-T2X is a text-to-any generation model. Our model is capable of generating multiple modalities, most notably image generation. Currently, our image ge…
-
Hi there,
I found that `OpenAI()` takes `base_url` as a mandatory argument for initialization, as mentioned in this vLLM documentation:
[https://docs.vllm.ai/en/latest/getting_started/quickstart.html#usi…
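For reference, the server speaks the OpenAI chat-completions wire format, so you can also hit it with the standard library alone. A sketch, assuming the server listens on localhost:8000 and the model name matches whatever the server was launched with:

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000/v1"  # assumed vLLM server address

payload = {
    "model": "my-served-model",  # placeholder; must match the served model
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer EMPTY",  # vLLM accepts any key by default
    },
)
# request.urlopen(req) would send it once the server is actually up.
```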
-
Hi, I have tried to load the Phi-3 Medium (128k) model, but it fails to work with the current version of vLLM. Is this a version-update issue? When I try the Phi-3 Mini 128k, it at least tries …
rkyla updated 2 weeks ago