intel/auto-round

Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs" (https://arxiv.org/abs/2309.05516).

Apache License 2.0 · 172 stars · 20 forks
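The signed-gradient-descent rounding idea named in the paper title can be illustrated with a toy sketch. This is a minimal, hypothetical NumPy illustration of the general technique, not the library's actual implementation: each weight gets a bounded rounding offset v in [-0.5, 0.5], updated with only the sign of a straight-through reconstruction-error gradient, so round-up/round-down decisions are tuned to minimize layer output error. All names, shapes, and hyperparameters below are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))        # toy weight matrix
X = rng.normal(size=(16, 32))       # toy calibration activations
scale = np.abs(W).max() / 7.0       # symmetric 4-bit scale (qmax = 7)

def fake_quant(W, v):
    """Round W/scale with learned offset v, clip to the int4 grid, dequantize."""
    q = np.clip(np.floor(W / scale + 0.5 + v), -8, 7)
    return q * scale

v = np.zeros_like(W)                # learnable rounding offsets (start at RTN)
ref = W @ X                         # full-precision layer output
lr = 5e-3
for _ in range(300):
    err = fake_quant(W, v) @ X - ref            # output reconstruction error
    grad = (err @ X.T) * scale                  # straight-through grad w.r.t. v
    v = np.clip(v - lr * np.sign(grad), -0.5, 0.5)  # signed update, kept bounded

mse = float(np.mean((fake_quant(W, v) @ X - ref) ** 2))
```

The signed update keeps the step size uniform across weights, and clipping v to [-0.5, 0.5] means each weight can only switch between its two nearest rounding candidates, which is the core of the rounding-optimization idea.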
Issues
#241 move some settings from example to main (wenhuach21, closed 1 day ago, 0 comments)
#240 refine the code for a notable speedup (wenhuach21, closed 2 days ago, 0 comments)
#239 add tritonv2, improve packing and pbar (wenhuach21, closed 2 days ago, 0 comments)
#238 update readme and add itrex to requirements.txt (wenhuach21, closed 6 days ago, 0 comments)
#237 enable llava int4 inference with autoround format (WeiweiZhang1, open 6 days ago, 0 comments)
#236 add brief formats introduction (wenhuach21, closed 6 days ago, 0 comments)
#235 from auto_round import AutoRoundConfig (CrispStrobe, open 1 week ago, 3 comments)
#234 Serialization in multiple formats (benjamin-marie, closed 1 week ago, 2 comments)
#233 promote awq format for asym quantization (wenhuach21, open 1 week ago, 0 comments)
#232 fix model link (WeiweiZhang1, closed 1 week ago, 0 comments)
#231 add meta3.1-70B-instruct model, refine docs (WeiweiZhang1, closed 1 week ago, 1 comment)
#230 add quantized models by 3rd party (WeiweiZhang1, closed 1 week ago, 1 comment)
#229 change the scale threshold generally (WeiweiZhang1, closed 1 week ago, 1 comment)
#228 remove building from source (wenhuach21, open 2 weeks ago, 0 comments)
#227 support tensor parallelism (wenhuach21, open 2 weeks ago, 0 comments)
#226 refine docs, add accuracy data, add recipe and eval scripts (WeiweiZhang1, closed 1 week ago, 0 comments)
#225 add runnable script for autoround (n1ck-guo, open 3 weeks ago, 3 comments)
#224 refine example (WeiweiZhang1, closed 3 weeks ago, 1 comment)
#223 Bump setuptools from 69.5.1 to 70.0.0 in /examples/multimodal-modeling/Phi-3-vision (dependabot[bot], closed 3 weeks ago, 0 comments)
#222 [WIP] hadamard support (wenhuach21, open 3 weeks ago, 1 comment)
#221 refine eval_042 to enable parallel evaluation (WeiweiZhang1, closed 3 weeks ago, 0 comments)
#220 update readme (wenhuach21, closed 3 weeks ago, 0 comments)
#219 [Low priority] add support for quarot and spinquant (wenhuach21, open 3 weeks ago, 1 comment)
#218 avoid underflow and overflow for exllamav2 (wenhuach21, closed 3 weeks ago, 0 comments)
#217 add qwen int4 model, refine example (WeiweiZhang1, closed 3 weeks ago, 0 comments)
#216 [Low priority] auto_round: triton backend has a bug at inference (wenhuach21, closed 2 days ago, 1 comment)
#215 fix a bug in autoround format inference (wenhuach21, closed 4 weeks ago, 0 comments)
#214 update xpu format exporting (WeiweiZhang1, closed 4 weeks ago, 0 comments)
#213 remove local pile file (WeiweiZhang1, closed 4 weeks ago, 0 comments)
#212 fix example dataset regression (WeiweiZhang1, closed 4 weeks ago, 0 comments)
#211 limit the scale minimum value not to 0 (WeiweiZhang1, closed 4 weeks ago, 1 comment)
#210 xpu exporting on cpu machine (wenhuach21, closed 3 weeks ago, 1 comment)
#209 Limit the scale minimum value not to 0 (WeiweiZhang1, closed 1 month ago, 0 comments)
#208 [Experimental Feature] fast tuning of norm/bias at 2 bits (wenhuach21, closed 3 weeks ago, 0 comments)
#207 Qwen2 57B scale 0 issue (wenhuach21, closed 4 weeks ago, 1 comment)
#206 modify setup.py (n1ck-guo, closed 4 weeks ago, 0 comments)
#205 set autoround format as default to unify CPU/HPU/CUDA (wenhuach21, closed 1 month ago, 0 comments)
#204 add setseed (WeiweiZhang1, closed 1 month ago, 0 comments)
#203 [query] is the int symmetric quantisation only for unsigned int? (EricLiclair, open 1 month ago, 1 comment)
#202 Remove UT coverage check (XuehaoSun, closed 1 month ago, 0 comments)
#201 Add setseed in autoround (WeiweiZhang1, closed 1 month ago, 0 comments)
#200 Request for Apple Metal Device Support (PabloButron, open 1 month ago, 1 comment)
#199 porting sq to autoround (n1ck-guo, open 1 month ago, 1 comment)
#198 add local file of pile-10k (WeiweiZhang1, closed 4 weeks ago, 0 comments)
#197 Enable phi3v tuning (WeiweiZhang1, closed 3 weeks ago, 1 comment)
#196 Ref GPTQModel for both quant and inference (Qubitium, open 1 month ago, 1 comment)
#195 bugfix of group size mismatch with weight shape (WeiweiZhang1, closed 1 month ago, 0 comments)
#194 support low_cpu_mem at packing stage (wenhuach21, open 1 month ago, 0 comments)
#193 fix memory issue (wenhuach21, closed 1 month ago, 0 comments)
#192 remove force fp16 dtype export (WeiweiZhang1, closed 1 month ago, 0 comments)