intel/auto-round
Advanced Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"
https://arxiv.org/abs/2309.05516
Apache License 2.0
132 stars
18 forks
issues
Fix exllamav2 backend issue
#144
wenhuach21
closed
1 month ago
0
refine the code
#143
wenhuach21
closed
1 month ago
0
fix exporting typo
#142
yintong-lu
closed
1 month ago
1
if the whole block is excluded from the quantization, bug will occur
#141
wenhuach21
closed
1 month ago
1
fix typos
#140
WeiweiZhang1
closed
1 month ago
0
bump version into v0.2
#139
chensuyue
closed
1 month ago
0
question about calib data
#138
mxjmtxrm
opened
1 month ago
14
Fix asym kernel issue by following autogptq's pr
#137
wenhuach21
closed
1 month ago
1
Add layer wise mode to save memory
#136
n1ck-guo
closed
1 week ago
0
support low cpu memory usage
#135
wenhuach21
closed
5 days ago
1
support trainable equivalent transformation
#134
wenhuach21
opened
1 month ago
0
support simulated MXFP4
#133
wenhuach21
closed
2 days ago
1
support activation quantization
#132
wenhuach21
closed
5 days ago
1
support multimodal models
#131
wenhuach21
opened
1 month ago
0
handling transformers version compatibility in lmhead export, bugfix
#130
WeiweiZhang1
closed
1 month ago
2
fix export issue with torch 2.0
#129
wenhuach21
closed
1 month ago
0
Update falcon recipe
#128
wenhuach21
closed
1 month ago
0
Adjust gpu usage based on free gpu memory space
#127
WeiweiZhang1
opened
1 month ago
0
fix falcon quant issue with disable_trust_remote_code
#126
WeiweiZhang1
closed
1 month ago
0
falcon 7b bug with disable_trust_remote_code
#125
wenhuach21
closed
1 month ago
1
Update phi2 recipe
#124
wenhuach21
closed
1 month ago
0
remove fp32 conversion in exporting to autogptq
#123
wenhuach21
closed
1 month ago
0
Remove unused precommit hook
#122
XuehaoSun
closed
1 month ago
0
update gemma recipe
#121
wenhuach21
closed
1 month ago
0
Fix export format issue
#120
wenhuach21
closed
1 month ago
0
Fix doc
#119
wenhuach21
closed
1 month ago
0
support `transformers.Conv1D` packing
#118
Kaihui-intel
closed
1 month ago
1
fix lm-head quant issue at disable_quanted_input
#117
wenhuach21
closed
1 month ago
0
Unexpected ppl diff
#116
YihengBrianWu
closed
3 weeks ago
4
support autoawq format
#115
yintong-lu
opened
1 month ago
1
support real lm-head quantization and mixed precision inference
#114
wenhuach21
closed
1 month ago
0
fix lm-head gradient accumulation bug
#113
wenhuach21
closed
2 months ago
0
update shells
#112
WeiweiZhang1
closed
2 months ago
0
chinese LLMs update and hf links
#111
yintong-lu
closed
2 months ago
0
20% speedup by removing new zero tensor
#110
wenhuach21
closed
2 months ago
0
hook AutoHfQuantizer of transformers to support different backends and mixed precision quantization
#109
wenhuach21
closed
1 month ago
1
large discrepancy between GPTQ model and qdq model
#108
wenhuach21
closed
1 month ago
1
Adjust the default eval dtype by selecting from the model_dir config
#107
WeiweiZhang1
closed
2 months ago
0
1.8X speedup by disable_low_gpu_mem_usage and reduce memory usage by avoid using torch.cat
#106
wenhuach21
closed
2 months ago
0
Consolidate dataloader&dataset_split to dataset
#105
wenhuach21
closed
2 months ago
1
Set the default scale_dtype to FP16
#104
wenhuach21
closed
2 months ago
1
cohere model support request
#103
MichoChan
closed
2 months ago
8
disable quantizing lm-head with tied weights as a workaround
#102
wenhuach21
closed
2 months ago
0
disable quantizing lm-head with tied weights as a workaround
#101
wenhuach21
closed
2 months ago
0
OPT model quantize_lm_head clarification
#100
Qubitium
closed
1 month ago
3
Merge dataloader to dataset
#99
wenhuach21
closed
2 months ago
1
update readme of calibration dataset and lm-head usage
#98
wenhuach21
closed
2 months ago
0
fix critical bug for gradient_accumulate_steps!=1 and reduce cpu memory of lm-head tuning
#97
WeiweiZhang1
closed
2 months ago
0
Add marlin and modify acc.md
#96
pursure-D
closed
1 month ago
1
fix README typos
#95
yintong-lu
closed
2 months ago
0