-
### Proposal to improve performance
Hi~ I find that the inference time of Qwen2-VL-7B AWQ is not much improved compared to Qwen2-VL-7B. Do you have any suggestions for improving performance? Thank y…
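A first step for questions like this is to measure the two deployments under identical requests; AWQ mostly helps memory-bound, small-batch decoding, and its gains can shrink at larger batch sizes where dequantization overhead dominates. Below is a minimal, hedged timing sketch against OpenAI-compatible endpoints; the ports, endpoints, and model names are placeholders, not from the issue.

```python
# Hypothetical latency probe: send the same prompt to an FP16 server and an
# AWQ server and compare wall-clock time. Endpoints/models are placeholders.
import time
import requests

def time_request(base_url: str, model: str, prompt: str) -> float:
    start = time.perf_counter()
    resp = requests.post(
        f"{base_url}/v1/chat/completions",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 128,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return time.perf_counter() - start

prompt = "Describe this image in one sentence."
fp16 = time_request("http://localhost:30000", "Qwen/Qwen2-VL-7B-Instruct", prompt)
awq = time_request("http://localhost:30001", "Qwen/Qwen2-VL-7B-Instruct-AWQ", prompt)
print(f"fp16: {fp16:.2f}s  awq: {awq:.2f}s")
```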
-
### Checklist
- [X] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose. Otherwise, it will be closed.…
-
-
git clone https://www.modelscope.cn/models/linglingdan/MiniCPM-V_2_6_awq_int4
Running inference with this quantized INT4 model uses roughly 20 GB of GPU memory, which is basically the same as the FP model. Could you advise whether something is wrong with the quantization?
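One common explanation, assuming vLLM is the serving engine here (as in the next report), is that vLLM preallocates a fixed fraction of total VRAM for weights plus KV cache regardless of weight size, so an INT4 model can appear to use as much memory as FP16. A minimal sketch, with the model path taken from the clone above:

```python
# Sketch: vLLM reserves gpu_memory_utilization * total VRAM up front, so a
# smaller quantized checkpoint does not by itself shrink reported usage.
# Lowering gpu_memory_utilization makes the INT4 weight footprint visible.
from vllm import LLM

llm = LLM(
    model="MiniCPM-V_2_6_awq_int4",   # local directory from the git clone
    quantization="awq",
    trust_remote_code=True,
    gpu_memory_utilization=0.5,        # reserve less than the 0.9 default
)
```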
-
### Your current environment
vllm==0.6.3.post1
### Model Input Dumps
```bash
ValueError: Weight input_size_per_partition = 10944 is not divisible by min_thread_k = 128. Consider reducing tensor_pa…
```
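For context, the failing check is plain divisibility: the AWQ Marlin kernel requires each tensor-parallel shard of the weight's input dimension to be a multiple of 128, and 10944 = 128 × 85 + 64. A quick illustration of the arithmetic:

```python
# The Marlin AWQ kernel needs the per-partition input size to be a clean
# multiple of min_thread_k = 128; here the shard size fails that check.
input_size_per_partition = 10944
min_thread_k = 128
print(divmod(input_size_per_partition, min_thread_k))  # (85, 64): remainder 64

# Since per-partition size = full_input_size / tensor_parallel_size, choosing
# a tensor_parallel_size whose shard is a multiple of 128 avoids the error.
```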
-
[AWQ](https://arxiv.org/pdf/2306.00978) seems popular: about 3,000 appearances among Hugging Face models (https://huggingface.co/models?sort=trending&search=AWQ), similar to GPTQ. Maybe we can add this to torch…
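For readers unfamiliar with the method: per the linked paper, AWQ scales salient input channels up before weight-only quantization and folds the scales back afterwards, which leaves the layer output mathematically unchanged while protecting the channels that matter most. A toy sketch of the idea (not the paper's reference implementation; the quantizer and alpha value here are simplified for illustration):

```python
# Toy illustration of AWQ's activation-aware scaling, not the reference code.
import torch

def quantize_int4(w: torch.Tensor) -> torch.Tensor:
    # Naive symmetric round-to-nearest 4-bit round trip, per output row.
    scale = w.abs().amax(dim=1, keepdim=True) / 7
    return (w / scale).round().clamp(-8, 7) * scale

torch.manual_seed(0)
W = torch.randn(64, 64)    # (out_features, in_features)
x = torch.randn(256, 64)   # calibration activations
x[:, :4] *= 10             # a few outlier channels, the case AWQ targets

# Scale salient input channels by activation magnitude (alpha = 0.5 is one
# point on the grid AWQ searches), then fold the scales back into the weight:
# y = (W * s) @ (x / s) is identical to y = W @ x before quantization.
s = x.abs().mean(dim=0).pow(0.5)
W_awq = quantize_int4(W * s) / s
W_rtn = quantize_int4(W)

ref = x @ W.T
print("RTN MSE:", (x @ W_rtn.T - ref).pow(2).mean().item())
print("AWQ MSE:", (x @ W_awq.T - ref).pow(2).mean().item())
```

On inputs with outlier channels like these, the scaled variant typically shows lower output error than plain round-to-nearest, which is the effect the paper measures at scale.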
-
### Model Series
Qwen2.5
### What are the models used?
Qwen2.5-72B-Instruct-AWQ and Qwen2.5-32B-Instruct-AWQ
### What is the scenario where the problem happened?
inference with transformers
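Since the scenario is transformers inference, here is a minimal loading sketch for reference (standard transformers API for prequantized AWQ checkpoints, which requires autoawq to be installed; the prompt and generation settings are placeholders):

```python
# Minimal transformers inference sketch for a prequantized AWQ checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-32B-Instruct-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # AWQ kernels run in fp16
    device_map="auto",
)

messages = [{"role": "user", "content": "Briefly introduce large language models."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```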
### Is …
-
Hi, first of all, congrats on the great work!
I wanted to ask why there isn't a more thorough comparison between AWQ and SmoothQuant in the paper. To my understanding, they both work using a simila…
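The similarity the question gestures at is real: both methods rest on the same per-channel scaling identity and differ in what gets quantized afterwards (notation below is mine, not either paper's exact symbols):

```latex
y = Wx = \big(W\,\mathrm{diag}(s)\big)\,\big(\mathrm{diag}(s)^{-1}\,x\big)
```

AWQ chooses $s_j = \mathbb{E}[|x_j|]^{\alpha}$ from activation magnitudes and quantizes only $W\,\mathrm{diag}(s)$ to 4-bit weights, while SmoothQuant chooses $s_j = \max|x_j|^{\alpha} / \max|W_{:,j}|^{1-\alpha}$ and quantizes both factors to 8 bits, smoothing activation outliers into the weights; so the scaling trick is shared, but the quantization targets (W4 weight-only vs. W8A8) differ.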
-
What was the quantisation algorithm used in the unsloth/Llama-3.2-1B-bnb-4bit model (https://huggingface.co/docs/transformers/main/en/quantization/overview)? Is it int4_awq or int4_weightonly?
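One direct way to answer this kind of question is to read the checkpoint's quantization_config from its config.json, which records the method used (bitsandbytes, awq, gptq, ...). A small sketch:

```python
# The quantization method is recorded in the checkpoint's config.json under
# "quantization_config"; reading it answers "which algorithm?" directly.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("unsloth/Llama-3.2-1B-bnb-4bit")
qc = getattr(config, "quantization_config", None)
print(qc)  # a "bnb-4bit" checkpoint will typically report bitsandbytes
           # 4-bit (NF4) settings rather than AWQ or GPTQ
```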
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue y…