mit-han-lab / llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License · 2.38k stars · 184 forks
Issues (sorted by newest)
#172 · can awq support 3-bit,2-bit, 8-bit quantization? · opened by ArlanCooper 5 months ago · 2 comments
#171 · Grok-1 AWQ · opened by jjovalle99 6 months ago · 0 comments
#170 · illegal memory access when input tokens < 8 · opened by casper-hansen 6 months ago · 0 comments
#169 · Weight Packing Format · opened by jeromeku 6 months ago · 0 comments
#168 · [Minor] Update README. · closed by ys-2020 6 months ago · 0 comments
#167 · when q-group-size = -1,the code will not run · opened by ZhengHSI 6 months ago · 0 comments
#166 · Inquiry about Minimum GPU Requirements · opened by loiqy 6 months ago · 1 comment
#165 · Out of memory in Jetson Orin NX 8GB · opened by pprp 6 months ago · 0 comments
#164 · AWQ for non-transformer layers? · opened by satabios 6 months ago · 0 comments
#163 · Possible Bug in "_search_module_scale" Function · opened by satabios 6 months ago · 0 comments
#162 · Weight int4 quantization, but actually it is int16 · opened by dongxuemin666 6 months ago · 4 comments
#161 · SM_75 (Turing) Support · opened by 28Smiles 6 months ago · 0 comments
#160 · Error while generating real quantized weights for VILA · opened by ocg2347 6 months ago · 0 comments
#159 · KeyError: 'llava_llama' · opened by huzicong 6 months ago · 1 comment
#158 · run_awq.<locals>.Catcher.forward() error · opened by chunniunai220ml 6 months ago · 0 comments
#157 · Use awq to quantize Deepseek-coder-33B-instruct model · opened by CarolXh 6 months ago · 0 comments
#156 · Update pyproject.toml · closed by Louym 6 months ago · 2 comments
#155 · RuntimeError: Unknown Layout in CUDA Kernel Execution · opened by fantasysee 6 months ago · 0 comments
#154 · reproduce Llama2 7b failure: RuntimeError: The expanded size of the tensor (4608) must match the existing size (4096) at non-singleton dimension 3. Target sizes: [65, 32, 512, 4608]. Tensor sizes: [65, 1, 512, 4096] · opened by tuanhe 6 months ago · 3 comments
#153 · awq_inference_engine has no attribute 'gemm_forward_cuda_new' · opened by pribadihcr 6 months ago · 4 comments
#152 · Llava weight · opened by zhoujian-z 7 months ago · 0 comments
#151 · support video frames. · opened by Lyken17 7 months ago · 0 comments
#150 · Update TinyChat Speed for VILA · closed by ys-2020 6 months ago · 0 comments
#149 · [Minor] Update news. · closed by ys-2020 7 months ago · 0 comments
#148 · Update news · closed by kentang-mit 7 months ago · 0 comments
#147 · `RuntimeError: probability tensor contains either `inf`, `nan` or element < 0` when running LLaVA demo · opened by isaac-vidas 7 months ago · 2 comments
#146 · Add more quantization example scripts & update readme support matrix. · closed by ys-2020 7 months ago · 0 comments
#145 · Fix memory issue when running `run_awq` · closed by isaac-vidas 7 months ago · 2 comments
#144 · CUDA out of memory when trying to run AWQ search on A100-80GB · closed by isaac-vidas 7 months ago · 10 comments
#143 · Gradio demo update · closed by ys-2020 7 months ago · 0 comments
#142 · Add new backends & support multi-modal LM · closed by ys-2020 7 months ago · 1 comment
#141 · Reformat the codebase with black. · closed by ys-2020 7 months ago · 0 comments
#140 · Dead repo? · closed by guspuffygit 6 months ago · 1 comment
#139 · AWQ is shutting down the server · closed by Chirobocea 3 months ago · 0 comments
#138 · Support for adept/fuyu-8b? · opened by SinanAkkoyun 8 months ago · 0 comments
#137 · Latency Improve · opened by kyrie2to11 8 months ago · 1 comment
#136 · Some question about example · opened by YIHUASHAO 8 months ago · 0 comments
#135 · Integrate & optimize LLaVA in TinyChat · closed by ys-2020 7 months ago · 1 comment
#134 · Integrate & optimize LLaVA in TinyChat · closed by ys-2020 8 months ago · 0 comments
#133 · 2 bit AWQ results? · opened by tsengalb99 9 months ago · 0 comments
#132 · why can only protect one salient channel per group? · closed by StudyingShao 8 months ago · 1 comment
#131 · where is awq_inference_engine of "llm-awq/awq/quantize/qmodule.py" · opened by CXiaorong 9 months ago · 1 comment
#130 · AWQ and SmoothQuant · opened by DavidePaglieri 9 months ago · 2 comments
#129 · 4bit awq backwarding · opened by huangyuxiang03 10 months ago · 0 comments
#128 · performance for prefill on long sequence · opened by frankxyy 10 months ago · 0 comments
#127 · difference from gptq when inferring · opened by frankxyy 10 months ago · 6 comments
#126 · inference speed · opened by frankxyy 10 months ago · 1 comment
#125 · Update README · closed by Sakits 10 months ago · 0 comments
#124 · can not install awq CUDA kernels · opened by ycyaoxdu 10 months ago · 4 comments
#123 · OpenCL support · opened by leviathanch 10 months ago · 0 comments