-
Currently, vLLM's `vllm.worker.worker.Worker` is replaced on the fly with `openrlhf.trainer.ray.vllm_worker_wrap.WorkerWrap` as a monkey patch.
The monkey patch is avoidable by making `init_process_gro…
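For reference, a minimal sketch of how this kind of on-the-fly class swap looks; the actual OpenRLHF patch differs in details, this only illustrates the mechanism:

```python
# Sketch of the on-the-fly replacement (not the exact OpenRLHF code):
# the worker class is swapped at import time so that vLLM instantiates
# WorkerWrap instead of its own Worker.
import vllm.worker.worker
from openrlhf.trainer.ray.vllm_worker_wrap import WorkerWrap

# After this assignment, any code path that resolves vllm.worker.worker.Worker
# (e.g. vLLM's worker spawning) gets the wrapped class instead.
vllm.worker.worker.Worker = WorkerWrap
```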
-
Hi, I have tried to load the Phi-3 Medium (128k) model, but it fails with the current version of vLLM here. Is this a version update issue? When I try the Phi-3 Mini 128k, it at least tries …
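For context, a rough sketch of the kind of load I am attempting; the Hugging Face model id and flags below are my assumptions, not taken from a specific script:

```python
# Rough reproduction sketch (model id and flags are assumptions, not from the report).
from vllm import LLM, SamplingParams

llm = LLM(
    model="microsoft/Phi-3-medium-128k-instruct",  # assumed HF model id
    trust_remote_code=True,                        # Phi-3 ships custom code on the Hub
)
outputs = llm.generate(["Hello, Phi-3!"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```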
-
### Motivation
Speculative decoding can speed up generation more than 2x. This degree of speedup is an important feature for a production-grade LM deployment library, and it seems the methods are s…
-
Environment:
Deployed a vLLM OpenAI API server based on Qwen2 72B.
Command:
llmuses perf --url 'http://127.0.0.1:8000/v1/chat/completions' --parallel 4 --model '/share/modelscope/hub/qwen/Qwen2-72B-Instruct-FP8' --log-eve…
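For reference, a single request against the same endpoint the benchmark hits looks roughly like this; the payload fields follow the OpenAI chat/completions schema and the values are illustrative:

```python
# Sketch of one request against the endpoint used by the benchmark.
import requests

resp = requests.post(
    "http://127.0.0.1:8000/v1/chat/completions",
    json={
        "model": "/share/modelscope/hub/qwen/Qwen2-72B-Instruct-FP8",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 64,
    },
    timeout=60,
)
print(resp.status_code, resp.json())
```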
-
Please add one or more params to control logs from the RESTful API server, namely in the `mii.serve()` function.
For reference, see the `-log-` config params in vLLM: https://docs.vllm.ai/en/latest/servin…
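Something along these lines is what I have in mind; the keyword arguments below are purely hypothetical and do not exist in MII today, they only illustrate the kind of control being requested:

```python
# Hypothetical API sketch -- these log-control kwargs do not exist in MII,
# they only illustrate the requested feature.
import mii

mii.serve(
    "mistralai/Mistral-7B-v0.1",   # example model name
    disable_log_requests=True,     # hypothetical: don't log every request body
    log_level="WARNING",           # hypothetical: set REST server verbosity
)
```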
-
### Model/Pipeline/Scheduler description
Lumina-T2X is a text-to-any generation model. Our model can generate content in multiple modalities, most notably images. Currently, our image ge…
-
## User Story: Implement Backend Prometheus Metrics
**As a** backend operator
**I want** to have Prometheus metrics for observability of the vLLM backend
**So that** I can monitor the performance, h…
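A minimal sketch of the kind of instrumentation this story implies, using the `prometheus_client` library; the metric names and labels are illustrative, not an agreed naming scheme:

```python
# Illustrative sketch with prometheus_client; metric names/labels are examples only.
import time
from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter(
    "vllm_backend_requests_total", "Total requests handled by the vLLM backend", ["status"]
)
LATENCY = Histogram(
    "vllm_backend_request_latency_seconds", "End-to-end request latency in seconds"
)

def handle_request():
    start = time.monotonic()
    try:
        ...  # call into the vLLM backend here
        REQUESTS.labels(status="ok").inc()
    except Exception:
        REQUESTS.labels(status="error").inc()
        raise
    finally:
        LATENCY.observe(time.monotonic() - start)

if __name__ == "__main__":
    start_http_server(9090)  # expose /metrics for Prometheus to scrape
```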
-
Hi There,
I found that OpenAI() takes base_url as a mandatory argument for initialization, as mentioned in this vLLM documentation.
[https://docs.vllm.ai/en/latest/getting_started/quickstart.html#usi…
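The initialization pattern I am referring to is roughly the following; the model name is an example and the api_key value is a placeholder, only checked if the server was started with an API key:

```python
# Pointing the OpenAI client at a vLLM OpenAI-compatible server.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM OpenAI-compatible server
    api_key="EMPTY",                      # placeholder; only enforced if the server requires a key
)
completion = client.chat.completions.create(
    model="meta-llama/Llama-2-7b-chat-hf",  # example; use whatever model the server loaded
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```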
-
When running the vLLM server for Functionary v2.5 Small, vLLM throws an error because it does not support the Functionary tokenizer. I've reverted back to v2.4 for now, but thought I should bring this i…
-
Hi,
I remember that vLLM support was on your TODO list. Have you achieved it now? Was the main challenge in this direction that tree verification with batch size > 1 is hard to make efficient? Thanks…