sgl-project/sglang
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
Apache License 2.0
2.74k stars
176 forks
issues
Add Support to Florence-2
#572
KaifAhmad1
opened
4 hours ago
0
Update benchmark script
#571
Ying1123
closed
1 day ago
0
Why using 16 bit dtype in memory pool state?
#570
yileld
opened
1 day ago
0
Expose dtype argument
#569
merrymercy
closed
1 day ago
0
Update readme
#568
merrymercy
closed
1 day ago
0
Increase the number of thread limitation for tp worker managers.
#567
merrymercy
closed
2 days ago
0
Warmup cublas
#566
merrymercy
closed
3 days ago
0
missing 1 required positional argument: 'page_size' when using --enable-flashinfer
#565
keepitsane
closed
2 days ago
4
Add sglang.bench_latency for offline benchmark
#564
merrymercy
closed
4 days ago
0
Add a new arguments log_level_http to control the HTTP logging
#563
merrymercy
closed
4 days ago
0
[Model] Adding support for MiniCPM-Llama3-V-2_5
#562
ssuncheol
closed
4 days ago
0
Allow running with vllm==0.4.3
#561
merrymercy
closed
4 days ago
0
Update test_flashinfer
#560
hnyls2002
closed
5 days ago
0
Add LlamaForClassification
#559
merrymercy
closed
1 week ago
0
Clean up logits processor
#558
merrymercy
closed
1 week ago
0
Fix latency benchmark
#557
hnyls2002
closed
1 week ago
0
Follow-up fixes for flashinfer 0.0.5
#556
merrymercy
closed
1 week ago
0
Will speculative decoding be supported?
#555
arunpatala
closed
1 week ago
3
Update flashinfer to 0.0.5
#554
merrymercy
closed
1 week ago
0
Update fused_moe
#553
merrymercy
closed
1 week ago
0
peer access is not supported between these two devices
#552
gmonair
opened
1 week ago
1
Fix the Jump-Forward with Chinese
#551
hnyls2002
closed
1 week ago
0
Multi-node Tensor Parallelism
#550
Ying1123
closed
1 week ago
0
Chinese Regex BUG in req.jump_forward_map.jump_forward_byte
#549
wellhowtosay
closed
1 week ago
1
Trouble Shooting
#548
Ying1123
opened
2 weeks ago
0
MoE model (BDRX/Mixtral) NaN when using flashinfer
#547
Ying1123
opened
2 weeks ago
1
Fix tp worker only checking req[0] for stream
#546
Qubitium
closed
2 weeks ago
0
Update test cases
#545
ZackZeng999
opened
2 weeks ago
2
111
#544
ZackZeng999
closed
2 weeks ago
0
Llava CUDA error: device-side assert triggered
#543
dmilcevski
opened
2 weeks ago
3
Add disk cache for loading ShareGPT dataset.
#542
hnyls2002
closed
2 weeks ago
0
Clarification for wait_for_new_request_delay changes
#541
Qubitium
closed
1 week ago
1
Higher priority for user input of max_prefill_tokens & format
#540
Ying1123
closed
2 weeks ago
0
Fix dependency & crash issues
#539
Ying1123
closed
2 weeks ago
0
Fix dependency
#538
merrymercy
closed
2 weeks ago
0
M2 Mac Attempted Installation: LLVM ERROR: Option 'pbqp' already exists!
#537
velocity33
opened
2 weeks ago
0
Seems only GPU 0 is being used even when in tensor parallel across 2 GPUs
#536
aflah02
closed
2 weeks ago
4
[Bug]: Random model output using sglang backend server
#535
PanJason
opened
2 weeks ago
1
peft not found
#534
charul15
opened
2 weeks ago
0
AttributeError: module 'flashinfer' has no attribute 'batch_prefill_with_paged_kv_cache'
#533
ZackZeng999
closed
2 days ago
1
ImportError: cannot import name 'pin_program'
#532
ZackZeng999
opened
2 weeks ago
0
Fix Regression: Disable p2p for 4090
#531
ZX-ModelCloud
closed
2 weeks ago
4
Update chat template for qwen and yi-1.5.
#530
for-just-we
opened
2 weeks ago
0
Access values created within a fork
#529
brunorigal
closed
2 weeks ago
2
CUDA error: device-side assert triggered in self.forward_extend_multi_modal(batch)
#528
LetheRiver0
opened
2 weeks ago
1
SG-Lang Runtime Stuck Launching in Docker Container
#527
schopra8
opened
2 weeks ago
1
[Minor] Correct Optional type hints in api
#526
fpreiss
closed
2 weeks ago
1
Fix RAG nb, parea setup (parea -> parea-ai)
#525
fpreiss
closed
2 weeks ago
0
Fix missing numpy dependency in pyproject.toml
#524
fpreiss
closed
2 weeks ago
0
The `choices` normalised logprobs calculation returns poor results due to bias for longer-token options
#523
AidanCooper
opened
2 weeks ago
0