intel/auto-round
Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs" (https://arxiv.org/abs/2309.05516).
Apache License 2.0 · 172 stars · 20 forks
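The paper linked above casts quantization as learning a small per-weight rounding offset, updated by signed gradient descent on the layer-output reconstruction error. A minimal self-contained sketch of that idea follows; the function name, learning rate, step count, and single-linear-layer setup are illustrative assumptions, not the repository's actual API:

```python
import numpy as np

def autoround_sketch(W, X, scale, steps=200, lr=5e-3):
    """Hypothetical minimal sketch of signed-gradient-descent rounding:
    learn a per-weight offset V in [-0.5, 0.5] that decides whether each
    weight rounds up or down, so the quantized layer output stays close
    to the full-precision output."""
    V = np.zeros_like(W)
    ref = X @ W.T                       # full-precision layer output
    for _ in range(steps):
        Wq = np.round(W / scale + V) * scale
        err = X @ Wq.T - ref            # output reconstruction error
        # straight-through gradient of ||err||^2 w.r.t. V
        # (round() is treated as identity for the backward pass)
        grad = 2.0 * (X.T @ err).T * scale
        V = np.clip(V - lr * np.sign(grad), -0.5, 0.5)  # signed update
    return np.round(W / scale + V) * scale
```

The real implementation tunes these offsets (along with scales and zero points) block by block with PyTorch autograd; this NumPy version hardcodes the straight-through gradient for one linear layer to keep the signed-update step visible.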
Issues
#141  if the whole block is excluded from the quantization, bug will occur (wenhuach21, closed 2 months ago, 1 comment)
#140  fix typos (WeiweiZhang1, closed 3 months ago, 0 comments)
#139  bump version into v0.2 (chensuyue, closed 3 months ago, 0 comments)
#138  question about calib data (mxjmtxrm, closed 1 month ago, 15 comments)
#137  Fix asym kernel issue by following autogptq's pr (wenhuach21, closed 3 months ago, 1 comment)
#136  Add layer wise mode to save memory (n1ck-guo, closed 1 month ago, 0 comments)
#135  support low cpu memory usage (wenhuach21, closed 1 month ago, 1 comment)
#134  support trainable equivalent transformation (wenhuach21, opened 3 months ago, 0 comments)
#133  support simulated MXPF4 (wenhuach21, closed 1 month ago, 1 comment)
#132  support activation quantization (wenhuach21, closed 1 month ago, 1 comment)
#131  support multimodal models (wenhuach21, closed 3 weeks ago, 0 comments)
#130  handling transformers version compatibility in lmhead export, bugfix (WeiweiZhang1, closed 3 months ago, 2 comments)
#129  fix export issue with torch 2.0 (wenhuach21, closed 3 months ago, 0 comments)
#128  Update falcon recipe (wenhuach21, closed 3 months ago, 0 comments)
#127  Adjust gpu usage based on free gpu memory space (WeiweiZhang1, closed 2 weeks ago, 0 comments)
#126  fix falcon quant issue with disable_trust_remote_code (WeiweiZhang1, closed 3 months ago, 0 comments)
#125  falcon 7b bug with disable_trust_remote_code (wenhuach21, closed 3 months ago, 1 comment)
#124  Update phi2 recipe (wenhuach21, closed 3 months ago, 0 comments)
#123  remove fp32 conversion in exporting to autogptq (wenhuach21, closed 3 months ago, 0 comments)
#122  Remove unused precommit hook (XuehaoSun, closed 3 months ago, 0 comments)
#121  update gemma recipe (wenhuach21, closed 3 months ago, 0 comments)
#120  Fix export format issue (wenhuach21, closed 3 months ago, 0 comments)
#119  Fix doc (wenhuach21, closed 3 months ago, 0 comments)
#118  support `transformers.Conv1D` packing (Kaihui-intel, closed 3 months ago, 1 comment)
#117  fix lm-head quant issue at disable_quanted_input (wenhuach21, closed 3 months ago, 0 comments)
#116  Unexpected ppl diff (YihengBrianWu, closed 2 months ago, 4 comments)
#115  support autoawq format (yintong-lu, closed 1 month ago, 2 comments)
#114  support real lm-head quantization and mixed precision inference (wenhuach21, closed 3 months ago, 0 comments)
#113  fix lm-head gradient accumulation bug (wenhuach21, closed 3 months ago, 0 comments)
#112  update shells (WeiweiZhang1, closed 3 months ago, 0 comments)
#111  chinese LLMs update and hf links (yintong-lu, closed 3 months ago, 0 comments)
#110  20% speedup by removing new zero tensor (wenhuach21, closed 3 months ago, 0 comments)
#109  hook AutoHfQuantizer of transformers to support different backends and mixed precision quantization (wenhuach21, closed 3 months ago, 1 comment)
#108  large discrepancy between GPTQ model and qdq model (wenhuach21, closed 2 months ago, 1 comment)
#107  Adjust the default eval dtype by selecting from the model_dir config (WeiweiZhang1, closed 3 months ago, 0 comments)
#106  1.8X speedup by disable_low_gpu_mem_usage and reduce memory usage by avoid using torch.cat (wenhuach21, closed 3 months ago, 0 comments)
#105  Consolidate dataloader&dataset_split to dataset (wenhuach21, closed 4 months ago, 1 comment)
#104  Set the default scale_dtype to FP16 (wenhuach21, closed 3 months ago, 1 comment)
#103  cohere model support request (MichoChan, closed 3 months ago, 8 comments)
#102  disable quantizing lm-head with tied weights as a workaround (wenhuach21, closed 4 months ago, 0 comments)
#101  disable quantizing lm-head with tied weights as a workaround (wenhuach21, closed 4 months ago, 0 comments)
#100  OPT model quantize_lm_head clarification (Qubitium, closed 3 months ago, 3 comments)
#99   Merge dataloader to dataset (wenhuach21, closed 4 months ago, 1 comment)
#98   update readme of calibration dataset and lm-head usage (wenhuach21, closed 4 months ago, 0 comments)
#97   fix critic bug for gradient_accumulate_steps!=1 and reduce cpu memory of lm-head tuning (WeiweiZhang1, closed 4 months ago, 0 comments)
#96   Add marlin and modify acc.md (pursure-D, closed 3 months ago, 1 comment)
#95   fix README typos (yintong-lu, closed 4 months ago, 0 comments)
#94   README typo fix (yintong-lu, closed 4 months ago, 0 comments)
#93   handle invalid layername in weight_config (WeiweiZhang1, closed 4 months ago, 0 comments)
#92   Lm head quant, align with main branch (WeiweiZhang1, closed 4 months ago, 0 comments)