mit-han-lab llm-awq issues

mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

MIT License

2.38k stars 184 forks

issues

Can AWQ support 3-bit, 2-bit, 8-bit quantization?

#172 ArlanCooper opened 5 months ago
2
Grok-1 AWQ

#171 jjovalle99 opened 6 months ago
0
illegal memory access when input tokens < 8

#170 casper-hansen opened 6 months ago
0
Weight Packing Format

#169 jeromeku opened 6 months ago
0
[Minor] Update README.

#168 ys-2020 closed 6 months ago
0
When q-group-size = -1, the code will not run

#167 ZhengHSI opened 6 months ago
0
Inquiry about Minimum GPU Requirements

#166 loiqy opened 6 months ago
1
Out of memory in Jetson Orin NX 8GB

#165 pprp opened 6 months ago
0
AWQ for non-transformer layers?

#164 satabios opened 6 months ago
0
Possible Bug in "_search_module_scale" Function

#163 satabios opened 6 months ago
0
Weight int4 quantization, but actually it is int16

#162 dongxuemin666 opened 6 months ago
4
SM_75 (Turing) Support

#161 28Smiles opened 6 months ago
0
Error while generating real quantized weights for VILA

#160 ocg2347 opened 6 months ago
0
KeyError: 'llava_llama'

#159 huzicong opened 6 months ago
1
run_awq.<locals>.Catcher.forward() error

#158 chunniunai220ml opened 6 months ago
0
Use awq to quantize Deepseek-coder-33B-instruct model

#157 CarolXh opened 6 months ago
0
Update pyproject.toml

#156 Louym closed 6 months ago
2
RuntimeError: Unknown Layout in CUDA Kernel Execution

#155 fantasysee opened 6 months ago
0
reproduce Llama2 7b failure : RuntimeError: The expanded size of the tensor (4608) must match the existing size (4096) at non-singleton dimension 3. Target sizes: [65, 32, 512, 4608]. Tensor sizes: [65, 1, 512, 4096]

#154 tuanhe opened 6 months ago
3
awq_inference_engine has no attribute 'gemm_forward_cuda_new'

#153 pribadihcr opened 6 months ago
4
Llava weight

#152 zhoujian-z opened 7 months ago
0
support video frames.

#151 Lyken17 opened 7 months ago
0
Update TinyChat Speed for VILA

#150 ys-2020 closed 6 months ago
0
[Minor] Update news.

#149 ys-2020 closed 7 months ago
0
Update news

#148 kentang-mit closed 7 months ago
0
`RuntimeError: probability tensor contains either `inf`, `nan` or element < 0` when running LLaVA demo

#147 isaac-vidas opened 7 months ago
2
Add more quantization example scripts & update readme support matrix.

#146 ys-2020 closed 7 months ago
0
Fix memory issue when running `run_awq`

#145 isaac-vidas closed 7 months ago
2
CUDA out of memory when trying to run AWQ search on A100-80GB

#144 isaac-vidas closed 7 months ago
10
Gradio demo update

#143 ys-2020 closed 7 months ago
0
Add new backends & support multi-modal LM

#142 ys-2020 closed 7 months ago
1
Reformat the codebase with black.

#141 ys-2020 closed 7 months ago
0
Dead repo?

#140 guspuffygit closed 6 months ago
1
AWQ is shutting down the server

#139 Chirobocea closed 3 months ago
0
Support for adept/fuyu-8b?

#138 SinanAkkoyun opened 8 months ago
0
Latency Improve

#137 kyrie2to11 opened 8 months ago
1
Some question about example

#136 YIHUASHAO opened 8 months ago
0
Integrate & optimize LLaVA in TinyChat

#135 ys-2020 closed 7 months ago
1
Integrate & optimize LLaVA in TinyChat

#134 ys-2020 closed 8 months ago
0
2 bit AWQ results?

#133 tsengalb99 opened 9 months ago
0
why can only protect one salient channel per group?

#132 StudyingShao closed 8 months ago
1
where is awq_inference_engine of "llm-awq/awq/quantize/qmodule.py"

#131 CXiaorong opened 9 months ago
1
AWQ and SmoothQuant

#130 DavidePaglieri opened 9 months ago
2
4bit awq backwarding

#129 huangyuxiang03 opened 10 months ago
0
performance for prefill on long sequence

#128 frankxyy opened 10 months ago
0
difference from gptq when inferring

#127 frankxyy opened 10 months ago
6
inference speed

#126 frankxyy opened 10 months ago
1
Update README

#125 Sakits closed 10 months ago
0
can not install awq CUDA kernels

#124 ycyaoxdu opened 10 months ago
4
OpenCL support

#123 leviathanch opened 10 months ago
0
