sglang Search Results - Githubissues

776 results for sglang

flashinfer-ai/flashinfer #530

Support vLLM-style rope

As part of [SGLang Issue #1487](https://github.com/sgl-project/sglang/issues/1487), SGLang plans to move vLLM to optional dependencies and use flashinfer as the main dependency. I am working on mo…

ByronHsu updated 2 weeks ago
4
vllm-project/vllm #6929

why our performance so low when compare with sglang(https://…

### Proposal to improve performance _No response_ ### Report of performance regression _No response_ ### Misc discussion on performance _No response_ ### Your current environment (if you think i…

lw921014 updated 4 days ago
2
mneedham/LearnDataWithMark #41

[Content] - sglang

https://github.com/sgl-project/sglang

mneedham updated 8 months ago
1
ModelCloud/GPTQModel #329

[FEATURE] Add `dynamic` support for AutoRound quantization

@wenhuach21 GPTQModel has merged `dynamic` per layer/module control of quantization but I don't think auto-round currently supports such per layer/module control during quantization. I know this is s…

Qubitium updated 1 week ago
3
sgl-project/sglang #1874

Benchmark torchao and torch.compile (need torch 2.5)

vllm updated to use pytorch 2.5 recently, so we can benchmark torchao with torch.compile now (previously blocked by 2.5 update) 1. install most recent vllm: `pip install https://vllm-wheels.s3.us…

jerryzh168 updated 2 days ago
2
AnjieCheng/SpatialRGPT #3

Code for LLM-based Complex Reasoning Question-Answer generat…

Hi, Thank you for sharing your great work. Is there any plan to release the code for generating LLM-based Complex Reasoning Question-Answer? It seems there is no code for it. I really apprecia…

ikodoh updated 2 weeks ago
1
vllm-project/vllm #7494

[Bug]: DeepSeek-Coder-V2-Instruct-AWQ assert self.quant_m…

### Your current environment The output of `python collect_env.py`: Collecting environment information... PyTorch version: 2.3.1+cu121 Is debug build: False CUDA used to build…

fengyang95 updated 6 days ago
12
stikkireddy/mlflow-extensions #19

[RFC] Batch inference using ez_deploy_config

* the input should be a delta table with specific schema (for idempotency and recomputing inference if you change model, etc) * the output will be a column called "predictions" (user definable) and a…

stikkireddy updated 1 month ago
2
pytorch/ao #992

Create a quant_utils file to reduce code duplication in eval…

some duplication in https://github.com/pytorch/ao/blob/378e6a8d6854d77efba45fcb1a4091724e9cfaa9/torchao/_models/llama/generate.py#L215-L267 and https://github.com/pytorch/ao/blob/378e6a8d6854d77efba45…

jerryzh168 updated 1 month ago
1
sgl-project/sglang #1729

[Bug][minimal reproducible demo] High variability across bat…

### Checklist - [X] 1. I have searched related issues but cannot get the expected help. - [X] 2. The bug has not been fixed in the latest version. - [X] 3. Please note that if the bug-related iss…

FredericOdermatt updated 3 days ago
5
