intel/auto-round
Advanced Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs" (https://arxiv.org/abs/2309.05516).
Apache License 2.0 · 131 stars · 17 forks
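Many of the issues below refer to the library's quantization flow, so a minimal usage sketch is included here for orientation. It assumes the AutoRound Python interface shown in the project README (AutoRound(...), quantize(), save_quantized()); the model name and parameter values are illustrative only, and defaults have shifted across versions (see #174).

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from auto_round import AutoRound

    # Illustrative model choice only.
    model_name = "facebook/opt-125m"
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # 4-bit weights, group size 128, asymmetric quantization, following the
    # README example; these values are assumptions, not the only supported setup.
    autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=False)
    autoround.quantize()

    # Export the tuned model (see #150 and #172 on the auto_round export format).
    autoround.save_quantized("./tmp_autoround")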
Issues
#178 Reminder to install auto-gptq/itrex before quantization in code/readme (wenhuach21, opened 18 hours ago, 0 comments)
#177 AutoRound/Examples dependency is injected to AutoRound pkg dependency (Qubitium, opened 1 day ago, 1 comment)
#176 add initial support for activation quantization (wenhuach21, closed 3 days ago, 0 comments)
#175 speedup the tuning a little (wenhuach21, closed 4 days ago, 0 comments)
#174 [Large impact] set the default nsamples to 128 and low_gpu_mem_usage to False (wenhuach21, closed 4 days ago, 0 comments)
#173 Add unit test (XuehaoSun, opened 4 days ago, 0 comments)
#172 support marlin in auto_round format (wenhuach21, closed 4 days ago, 0 comments)
#171 add chat template in calib tokenization (yintong-lu, closed 4 days ago, 0 comments)
#170 support enable_fast_quant (wenhuach21, opened 6 days ago, 0 comments)
#169 Marlin (wenhuach21, closed 6 days ago, 0 comments)
#168 revert the gptq format code to fix the regression (wenhuach21, closed 6 days ago, 0 comments)
#167 [pre-commit.ci] pre-commit autoupdate (pre-commit-ci[bot], opened 6 days ago, 0 comments)
#166 fix typos, update overview img (WeiweiZhang1, closed 1 week ago, 0 comments)
#165 enable llava & Qwen-VL multimodal model quantization (WeiweiZhang1, opened 1 week ago, 0 comments)
#164 for test (chensuyue, closed 1 week ago, 0 comments)
#163 1. fix a bug in autoround format with the latest transformers; 2. rename n_samples/n_blocks to nsamples/nblocks (wenhuach21, closed 1 week ago, 0 comments)
#162 Combine TEQ and Autoround (yiliu30, opened 1 week ago, 0 comments)
#161 RuntimeError: Expected is_sm80 || is_sm90 to be true, but got false. (RakshitAralimatti, opened 2 weeks ago, 4 comments)
#160 example_bugfix (WeiweiZhang1, closed 2 weeks ago, 1 comment)
#159 fix bug and limit numpy version (yintong-lu, closed 2 weeks ago, 0 comments)
#158 remove gptq ppl eval from lm-0.4.2 (wenhuach21, closed 3 weeks ago, 0 comments)
#157 AutoRound currently does not support numpy 2.x (Kaihui-intel, opened 3 weeks ago, 0 comments)
#156 fix bug when the whole block is excluded from quantization (wenhuach21, closed 3 weeks ago, 0 comments)
#155 auto round quantizer supports gptq kernel (wenhuach21, closed 3 weeks ago, 0 comments)
#154 fix (zhewang1-intc, closed 3 weeks ago, 0 comments)
#153 fix qbits issue (wenhuach21, closed 3 weeks ago, 0 comments)
#152 Qbits lm-eval incorrect behaviour (wenhuach21, closed 3 weeks ago, 3 comments)
#151 Qbits related log (zhewang1-intc, closed 1 month ago, 1 comment)
#150 [large impact] set auto_round format as default (wenhuach21, opened 1 month ago, 0 comments)
#149 fix incorrect setting for lm-head (wenhuach21, closed 1 month ago, 0 comments)
#148 fix triton issue (wenhuach21, closed 1 month ago, 0 comments)
#147 support calibration dataset concat (yintong-lu, closed 2 weeks ago, 11 comments)
#146 Add trainable equivalent transformation (yiliu30, opened 1 month ago, 0 comments)
#145 autoround_support_qbits_backend (zhewang1-intc, closed 1 month ago, 4 comments)
#144 Fix exllamav2 backend issue (wenhuach21, closed 1 month ago, 0 comments)
#143 refine the code (wenhuach21, closed 1 month ago, 0 comments)
#142 fix exporting typo (yintong-lu, closed 1 month ago, 1 comment)
#141 if the whole block is excluded from quantization, a bug will occur (wenhuach21, closed 3 weeks ago, 1 comment)
#140 fix typos (WeiweiZhang1, closed 1 month ago, 0 comments)
#139 bump version to v0.2 (chensuyue, closed 1 month ago, 0 comments)
#138 question about calib data (mxjmtxrm, opened 1 month ago, 14 comments)
#137 Fix asym kernel issue by following AutoGPTQ's PR (wenhuach21, closed 1 month ago, 1 comment)
#136 Add layer-wise mode to save memory (n1ck-guo, opened 1 month ago, 0 comments)
#135 support low cpu memory usage (wenhuach21, opened 1 month ago, 0 comments)
#134 support trainable equivalent transformation (wenhuach21, opened 1 month ago, 0 comments)
#133 support simulated MXPF4 (wenhuach21, opened 1 month ago, 0 comments)
#132 support activation quantization (wenhuach21, opened 1 month ago, 0 comments)
#131 support multimodal models (wenhuach21, opened 1 month ago, 0 comments)
#130 handling transformers version compatibility in lmhead export, bugfix (WeiweiZhang1, closed 1 month ago, 2 comments)
#129 fix export issue with torch 2.0 (wenhuach21, closed 1 month ago, 0 comments)