intel/auto-round

Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs" (https://arxiv.org/abs/2309.05516).

Apache License 2.0 · 247 stars · 20 forks
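The description above names the repository's core algorithm: learning weight rounding via signed gradient descent. As a rough illustration only (this is not the library's actual API; all names, shapes, and hyperparameters below are invented for the sketch), the idea of tuning a per-weight rounding offset with sign-only updates can be sketched in NumPy:

```python
import numpy as np

# Toy sketch of signed-gradient-descent weight rounding: learn a per-weight
# rounding offset V in [-0.5, 0.5] so that round(W/s + V) better preserves a
# layer's outputs, updating V using only the SIGN of the gradient.
# Bit width, shapes, and step size are illustrative assumptions.

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))      # toy weight matrix of one linear layer
X = rng.normal(size=(32, 16))     # calibration inputs
Y = X @ W.T                       # full-precision reference outputs

s = np.abs(W).max() / 7.0         # symmetric 4-bit scale (integer levels -8..7)

def dequantize(V):
    """Quantize-dequantize W with learnable rounding offsets V."""
    return np.clip(np.round(W / s + V), -8, 7) * s

V = np.zeros_like(W)              # start from plain round-to-nearest
lr = 0.01                         # step size for the signed update
best_V, best_mse = V.copy(), np.inf

for _ in range(200):
    err = X @ dequantize(V).T - Y          # output-space reconstruction error
    mse = float(np.mean(err ** 2))
    if mse < best_mse:                     # keep the best offsets seen so far
        best_mse, best_V = mse, V.copy()
    grad = err.T @ X                       # grad of 0.5*||err||^2 w.r.t. Wq,
                                           # passed straight-through to V
    V = np.clip(V - lr * np.sign(grad), -0.5, 0.5)

mse_rtn = float(np.mean((X @ dequantize(np.zeros_like(W)).T - Y) ** 2))
print(f"round-to-nearest MSE: {mse_rtn:.4f}, tuned MSE: {best_mse:.4f}")
```

Because the first iteration evaluates V = 0 (round-to-nearest) and the best offsets seen are retained, the tuned MSE can never exceed the round-to-nearest baseline in this sketch; the actual library operates block-wise on real models with many more details.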
Issues (newest first)
| # | Title | Author | Status | Comments |
|---|-------|--------|--------|----------|
| #324 | add warning for no gptq exllamav2 kernel | wenhuach21 | opened 9 hours ago | 0 |
| #323 | Some models like Qwen overflow/downflow with CUDA kernel | wenhuach21 | opened 10 hours ago | 1 |
| #322 | add pile calib, rename quant_block_list to to_quant_block_names | WeiweiZhang1 | opened 1 day ago | 2 |
| #321 | fix multiple device bug | wenhuach21 | closed 13 hours ago | 0 |
| #320 | Revert "Update the check for HPU" | yiliu30 | closed 1 day ago | 1 |
| #319 | fix eval device issue | wenhuach21 | closed 1 day ago | 0 |
| #318 | Update the check for HPU | yiliu30 | closed 2 days ago | 1 |
| #317 | new mllm eval | n1ck-guo | opened 2 days ago | 0 |
| #316 | fix merge error | n1ck-guo | closed 2 days ago | 1 |
| #315 | Add cpu only version | XuehaoSun | opened 3 days ago | 0 |
| #314 | add torch compile arg | wenhuach21 | closed 2 days ago | 0 |
| #313 | qwen2 vision quantization bugfix | WeiweiZhang1 | closed 6 days ago | 0 |
| #312 | multiple gpu evaluation/calibration refine | wenhuach21 | closed 6 days ago | 0 |
| #311 | fix typo | wenhuach21 | closed 1 week ago | 0 |
| #310 | Update autogptq exporting | wenhuach21 | closed 1 week ago | 0 |
| #309 | better align gradient_accumulate_steps for varied length input | wenhuach21 | closed 1 week ago | 0 |
| #308 | Add hpu version | XuehaoSun | closed 1 week ago | 0 |
| #307 | Enable torch.compile on HPU | yiliu30 | closed 1 week ago | 0 |
| #306 | support gptq format exporting for mllm | wenhuach21 | closed 6 days ago | 0 |
| #305 | fix gradient accumulate issue for varying input length | wenhuach21 | closed 6 days ago | 0 |
| #304 | fix glm4-9b batch dim issue | wenhuach21 | closed 1 week ago | 0 |
| #303 | chatglm input dim bug | wenhuach21 | closed 1 week ago | 1 |
| #302 | HPU only release binary | yiliu30 | closed 3 days ago | 2 |
| #301 | Port Numba-based packing from INC | yiliu30 | closed 1 week ago | 0 |
| #300 | refine model config file for mixed precision quantization | wenhuach21 | closed 1 week ago | 0 |
| #299 | refine model config file for mixed precision quantization | wenhuach21 | closed 1 week ago | 1 |
| #298 | patch 1 for mllm | n1ck-guo | closed 2 days ago | 0 |
| #297 | mllm eval bug fix | n1ck-guo | closed 1 week ago | 0 |
| #296 | eval for MLLMs | n1ck-guo | closed 1 week ago | 0 |
| #295 | use torch.compile by default for PyTorch versions 2.6 and above | wenhuach21 | closed 1 week ago | 0 |
| #294 | fix bug of backend | wenhuach21 | closed 2 weeks ago | 0 |
| #293 | fix ipex tqdm mismatch issue | wenhuach21 | closed 2 weeks ago | 0 |
| #292 | Add ipex support for intel cpu | wenhuach21 | closed 2 weeks ago | 0 |
| #291 | Refine code | wenhuach21 | closed 2 weeks ago | 0 |
| #290 | refine forward hook | WeiweiZhang1 | closed 1 week ago | 2 |
| #289 | [Enhancement] introduce ipex to support intel device | wenhuach21 | closed 2 weeks ago | 0 |
| #288 | [New feature request] support exporting to guff at group_size 32 | wenhuach21 | opened 3 weeks ago | 0 |
| #287 | update torch ao integration information | wenhuach21 | closed 3 weeks ago | 0 |
| #286 | fix mx_fp issues | wenhuach21 | closed 3 weeks ago | 0 |
| #285 | avoid deterministic algorithm warning in inference | wenhuach21 | closed 3 weeks ago | 0 |
| #284 | update readme for cpu inference | wenhuach21 | closed 3 weeks ago | 0 |
| #283 | update readme for v0.3.1 release | wenhuach21 | closed 3 weeks ago | 0 |
| #282 | refine eval | wenhuach21 | closed 4 weeks ago | 0 |
| #281 | qwen2_bugfix, add adamround vision UT | WeiweiZhang1 | closed 4 weeks ago | 0 |
| #280 | refine AuoRound format and support marlin repacking | wenhuach21 | closed 3 weeks ago | 0 |
| #279 | support marlin conversion for AutoRound default exllmav2 format | wenhuach21 | closed 3 weeks ago | 0 |
| #278 | [Important Change]set full range sym as the default | wenhuach21 | closed 1 month ago | 0 |
| #277 | change to even rounding for mantissa of mx_fp | wenhuach21 | closed 1 month ago | 0 |
| #276 | [Experimental Feature]support for common hf multimodel | n1ck-guo | closed 2 weeks ago | 3 |
| #275 | adamround bugfix, refine import | WeiweiZhang1 | closed 1 month ago | 1 |