-
### Anything you want to discuss about vllm.
This document includes the features in vLLM's roadmap for Q3 2024. Please feel free to discuss and contribute, as this roadmap is shaped by the vLLM com…
-
### Describe the bug
Custom model name is not picked up.
WARNING:langfuse:Langfuse was not able to parse the LLM model. The LLM call will be recorded without model name. Please create an issue so we…
-
ModuleNotFoundError: No module named 'vllm.engine.ray_utils'
Please tell me which vLLM version this requires, thanks.
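A quick way to check which vLLM version is installed and whether the old module path is still importable (a sketch; `vllm.engine.ray_utils` existed in older releases and appears to have been removed or relocated in newer ones):

```python
import importlib.util


def module_exists(name: str) -> bool:
    """Return True if `name` can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. vllm itself) is missing.
        return False


if __name__ == "__main__":
    if module_exists("vllm"):
        import vllm
        # vLLM exposes its version string as vllm.__version__
        print("vllm version:", vllm.__version__)
    print("vllm.engine.ray_utils present:",
          module_exists("vllm.engine.ray_utils"))
```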
-
## Description
I would like to inquire if there are any plans to support more configuration settings for vLLM, specifically related to RoPE scaling and theta adjustments.
## Background
vLLM curre…
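For context on the theta knob: RoPE derives its rotary frequencies from a base `theta`, so raising theta lowers the rotation frequencies and stretches the usable context. A minimal sketch of the standard inverse-frequency computation (plain Python; the function name and parameters are illustrative, not vLLM's API):

```python
def rope_inv_freq(head_dim: int, theta: float = 10000.0) -> list[float]:
    """Standard RoPE inverse frequencies: theta^(-2i/head_dim) for i < head_dim // 2.

    Increasing `theta` (e.g. to 1e6, as some long-context variants do)
    lowers every frequency, which is the usual theta-adjustment trick.
    """
    return [theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
```

For example, `rope_inv_freq(head_dim, theta=1_000_000.0)` yields strictly lower frequencies than the default base, which is what a theta adjustment would expose as a configuration setting.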
-
### System Info
- nvidia: 535.129.03
- cuda_version: 12.4
- GPU: L40S
- OS: Ubuntu 22.04.4 LTS (docker)
- tensorrt-llm: 0.11.0.dev2024060400
### Who can help?
_No response_
### Information
…
-
As stated in the title.
-
### Add the newest GPU cards:
- H100
- H200?
- A100
- L40S
### Modify Huggingface configuration handling:
- Instead of storing the Huggingface configs locally, fetch them via an API call.
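One way the API-based approach could look (a sketch, not this project's code): fetch `config.json` from the Hugging Face Hub's raw-file endpoint instead of keeping a local copy. The URL pattern below is the Hub's standard `resolve` endpoint for public repos; the repo id in the usage note is just an example.

```python
import json
import urllib.request


def hf_config_url(repo_id: str, revision: str = "main") -> str:
    """Raw config.json URL on the Hugging Face Hub (public repos)."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/config.json"


def fetch_hf_config(repo_id: str) -> dict:
    """Download and parse a model's config.json (network access required)."""
    with urllib.request.urlopen(hf_config_url(repo_id)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `fetch_hf_config("meta-llama/Meta-Llama-3-8B")` would return the parsed config dict, with the caveat that gated repos additionally need an auth token.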
###…
-
Thank you for your impressive work on this project. I'm eager to try this model, but I've noticed that the `vllm` deployment [pull request](https://github.com/vllm-project/vllm/pull/4650) has conflict…
-
When I use Qwen 2.0 72B Chat AWQ with the latest vLLM, after the client sends an OpenAI-compatible request, there is a probability that the model will get stuck in an infinite loop, continuously consum…
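Until the root cause is found, a common workaround is to bound generation on the client side so a repetition loop cannot run away. A hedged sketch of an OpenAI-compatible request payload (`max_tokens` and `frequency_penalty` are standard OpenAI API parameters; the stop sequence and values are placeholders to adjust per model):

```python
def build_bounded_request(prompt: str, model: str) -> dict:
    """Chat-completions payload with a hard token cap and a repetition penalty."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,         # hard upper bound on generated tokens
        "frequency_penalty": 0.5,  # discourage the model from looping
        "stop": ["<|im_end|>"],    # example stop sequence; adjust per model
    }
```

This does not fix the underlying bug, but it caps the damage: even if the model enters a loop, generation halts at `max_tokens` instead of consuming the GPU indefinitely.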
-
### Area(s)
area:gen-ai
### What happened?
## Description
There is a PR trying to enable metrics support in vLLM, and it adopts this semantic convention as well: https://github.com…