vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai · Apache License 2.0 · 41.53k stars · 6.27k forks
Issues (newest first)
#14874 [Misc][Doc] Minor benchmark README update — ywang96, opened 13 minutes ago · 1 comment
#14873 [V1] Remove V0 fallback for mistral-tokenizer — ywang96, opened 1 hour ago · 1 comment
#14872 Add an example script to test --load-format sharded_state — wwl2755, opened 1 hour ago · 1 comment
#14871 [Feature] Add support for sequence parallel — cascade812, opened 5 hours ago · 2 comments
#14870 [BugFix] Fix torch distributed stateless PG backend init — njhill, opened 5 hours ago · 1 comment
#14869 [Build/CI] Add SSM and Hybrid Models Test [Do Not Merge] — tlrmchlsmth, opened 5 hours ago · 2 comments
#14868 [Fix][V1][Structured Output] Move back to use vocab_size from config — aarnphm, opened 5 hours ago · 5 comments
#14867 [Bugfix][V1] Fix compiled graph hash — DefTruth, opened 6 hours ago · 2 comments
#14866 [New Model]: Command A with tool support — Hexoplon, opened 6 hours ago · 0 comments
#14865 [DOC] Add Kubernetes deployment guide with CPUs — terrytangyuan, opened 8 hours ago · 1 comment
#14864 [V1] Remove input cache client — DarkLight1337, opened 9 hours ago · 1 comment
#14863 [Usage]: What should I do if I want to skip the prefill of a new request? — chenhongyu2048, opened 9 hours ago · 0 comments
#14862 [Usage]: How to use DP MLA + EP/TP MoE for online serving? I can't find any docs. — DefTruth, closed 6 hours ago · 2 comments
#14861 [Model] Enable adjusting num_hidden_layers in Llama model via hf-overrides — geekadalovelace, opened 10 hours ago · 2 comments
#14860 [CI][Intel GPU] Refine Intel GPU CI docker build — jikunshang, closed 11 hours ago · 2 comments
#14858 [CI/Build] Update defaults for test reproducibility and bfloat16 models — DarkLight1337, opened 15 hours ago · 1 comment
#14857 [Model] RE: Mamba2 Prefill Performance Tweaks: Fixing Flurry of Unnecessary Memory Copies — cyang49, opened 15 hours ago · 2 comments
#14856 [Installation]: How to complete the installation of the latest vllm offline through code — Wandermay, opened 16 hours ago · 0 comments
#14855 [Feature]: Specify model only in config.yaml — wayzeng, opened 16 hours ago · 1 comment
#14854 [Bug]: vLLM ModelConfig doesn't pass hf_overrides to get_hf_image_processor_config, which could contain auth token for Hugging Face (not in ENV) — void-mckenzie, opened 17 hours ago · 7 comments
#14853 [CI/Build] Move dockerfile — jeejeelee, opened 17 hours ago · 1 comment
#14852 [Docs] Add new East Coast vLLM Meetup slides to README and meetups.md — simon-mo, closed 18 hours ago · 1 comment
#14851 [V1][Structured Output] Calculate vocab_size eagerly — aarnphm, closed 18 hours ago · 1 comment
#14850 [Misc] Remove misleading message in gemma2 and gemma3 — Isotr0py, closed 18 hours ago · 1 comment
#14849 [CI/Build] Delete LoRA bias test — jeejeelee, closed 18 hours ago · 1 comment
#14848 Revert "[Model] Mamba2 Prefill Performance Tweaks: Fixing Flurry of U… — tlrmchlsmth, closed 19 hours ago · 1 comment
#14847 [Misc] Catching Ray Compiled Graph PP test failures for V1 — ruisearch42, opened 22 hours ago · 1 comment
#14846 [V1][TPU] Apply the ragged paged attention kernel fix and remove the padding — vanbasten23, opened 1 day ago · 1 comment
#14845 [Bug]: TTFT Performance Regression in vLLM v0.7.0 Compared to v0.6.1.post2 — asleepykitty, opened 1 day ago · 1 comment
#14844 [Bugfix] Fix torch_xla in V0 which can't handle None seed introduced … — yarongmu-google, closed 22 hours ago · 1 comment
#14843 [CI/Build] Add tests for the V1 tpu_model_runner — yarongmu-google, opened 1 day ago · 4 comments
#14842 [Attention] Get rid of mla cache alignment — LucasWilkinson, closed 18 hours ago · 4 comments
#14841 Fix reasoning_content for chat_template include <think> tag as input — sthemeow, closed 1 day ago · 1 comment
#14840 [Build/CI] Upgrade aiohttp to include CVE fix — russellb, closed 1 day ago · 1 comment
#14839 [Build/CI] Upgrade jinja2 to get 3 moderate CVE fixes — russellb, closed 17 hours ago · 1 comment
#14838 [release] Remove log cleanup commands from TPU job — khluu, closed 1 day ago · 1 comment
#14837 Disable outlines cache by default — russellb, closed 19 hours ago · 1 comment
#14836 [ROCm] Integrate AITER PagedAttention — SageMoore, opened 1 day ago · 4 comments
#14835 [Build/CI] Move ninja to common deps — russellb, closed 1 day ago · 1 comment
#14834 [CI] Add TPU v1 test — richardsliu, closed 1 day ago · 1 comment
#14833 [V1] Fix model parameterization for structured output tests — russellb, closed 1 day ago · 1 comment
#14832 [V1] Entrypoints Test - Enable — robertgshaw2-redhat, opened 1 day ago · 5 comments
#14831 [CI] Add TPU V1 Test — richardsliu, closed 1 day ago · 1 comment
#14830 Add TPU V1 Test — richardsliu, closed 1 day ago · 2 comments
#14829 [Neuron][CI] Update docker run command — liangfu, closed 21 hours ago · 2 comments
#14828 [Bug]: Gemma3 failing to load some weights — llajan, closed 18 hours ago · 5 comments
#14827 [Usage]: No float32 support for LoRA??? — SpaceHunterInf, closed 20 hours ago · 5 comments
#14826 [V1] Fix vocab size calculation for structured output — russellb, closed 1 day ago · 8 comments
#14825 [CI/Build] Upgrade bitsandbytes — ProExpertProg, closed 1 day ago · 1 comment
#14824 [Frontend][Bugfix] Support prefill decode disaggregation on deepseek — billishyahao, opened 1 day ago · 3 comments