intel/auto-round
Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs" (https://arxiv.org/abs/2309.05516).
Apache License 2.0 · 172 stars · 20 forks
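The paper linked above casts quantization as learning a small per-weight rounding offset, updated by signed gradient descent on the layer-output reconstruction error. A minimal self-contained sketch of that idea follows; the function name, learning rate, step count, and single-linear-layer setup are illustrative assumptions, not the repository's actual API:

```python
import numpy as np

def autoround_sketch(W, X, scale, steps=200, lr=5e-3):
    """Hypothetical minimal sketch of signed-gradient-descent rounding:
    learn a per-weight offset V in [-0.5, 0.5] that decides whether each
    weight rounds up or down, so the quantized layer output stays close
    to the full-precision output."""
    V = np.zeros_like(W)
    ref = X @ W.T                       # full-precision layer output
    for _ in range(steps):
        Wq = np.round(W / scale + V) * scale
        err = X @ Wq.T - ref            # output reconstruction error
        # straight-through gradient of ||err||^2 w.r.t. V
        # (round() is treated as identity for the backward pass)
        grad = 2.0 * (X.T @ err).T * scale
        V = np.clip(V - lr * np.sign(grad), -0.5, 0.5)  # signed update
    return np.round(W / scale + V) * scale
```

The real implementation tunes these offsets (along with scales and zero points) block by block with PyTorch autograd; this NumPy version hardcodes the straight-through gradient for one linear layer to keep the signed-update step visible.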
Issues
#141  if the whole block is excluded from the quantization, bug will occur (wenhuach21, closed 2 months ago, 1 comment)
#140  fix typos (WeiweiZhang1, closed 3 months ago, 0 comments)
#139  bump version into v0.2 (chensuyue, closed 3 months ago, 0 comments)
#138  question about calib data (mxjmtxrm, closed 1 month ago, 15 comments)
#137  Fix asym kernel issue by following autogptq's pr (wenhuach21, closed 3 months ago, 1 comment)
#136  Add layer wise mode to save memory (n1ck-guo, closed 1 month ago, 0 comments)
#135  support low cpu memory usage (wenhuach21, closed 1 month ago, 1 comment)
#134  support trainable equivalent transformation (wenhuach21, opened 3 months ago, 0 comments)
#133  support simulated MXPF4 (wenhuach21, closed 1 month ago, 1 comment)
#132  support activation quantization (wenhuach21, closed 1 month ago, 1 comment)
#131  support multimodal models (wenhuach21, closed 3 weeks ago, 0 comments)
#130  handling transformers version compatibility in lmhead export, bugfix (WeiweiZhang1, closed 3 months ago, 2 comments)
#129  fix export issue with torch 2.0 (wenhuach21, closed 3 months ago, 0 comments)
#128  Update falcon recipe (wenhuach21, closed 3 months ago, 0 comments)
#127  Adjust gpu usage based on free gpu memory space (WeiweiZhang1, closed 2 weeks ago, 0 comments)
#126  fix falcon quant issue with disable_trust_remote_code (WeiweiZhang1, closed 3 months ago, 0 comments)
#125  falcon 7b bug with disable_trust_remote_code (wenhuach21, closed 3 months ago, 1 comment)
#124  Update phi2 recipe (wenhuach21, closed 3 months ago, 0 comments)
#123  remove fp32 conversion in exporting to autogptq (wenhuach21, closed 3 months ago, 0 comments)
#122  Remove unused precommit hook (XuehaoSun, closed 3 months ago, 0 comments)
#121  update gemma recipe (wenhuach21, closed 3 months ago, 0 comments)
#120  Fix export format issue (wenhuach21, closed 3 months ago, 0 comments)
#119  Fix doc (wenhuach21, closed 3 months ago, 0 comments)
#118  support `transformers.Conv1D` packing (Kaihui-intel, closed 3 months ago, 1 comment)
#117  fix lm-head quant issue at disable_quanted_input (wenhuach21, closed 3 months ago, 0 comments)
#116  Unexpected ppl diff (YihengBrianWu, closed 2 months ago, 4 comments)
#115  support autoawq format (yintong-lu, closed 1 month ago, 2 comments)
#114  support real lm-head quantization and mixed precision inference (wenhuach21, closed 3 months ago, 0 comments)
#113  fix lm-head gradient accumulation bug (wenhuach21, closed 3 months ago, 0 comments)
#112  update shells (WeiweiZhang1, closed 3 months ago, 0 comments)
#111  chinese LLMs update and hf links (yintong-lu, closed 3 months ago, 0 comments)
#110  20% speedup by removing new zero tensor (wenhuach21, closed 3 months ago, 0 comments)
#109  hook AutoHfQuantizer of transformers to support different backends and mixed precision quantization (wenhuach21, closed 3 months ago, 1 comment)
#108  large discrepancy between GPTQ model and qdq model (wenhuach21, closed 2 months ago, 1 comment)
#107  Adjust the default eval dtype by selecting from the model_dir config (WeiweiZhang1, closed 3 months ago, 0 comments)
#106  1.8X speedup by disable_low_gpu_mem_usage and reduce memory usage by avoid using torch.cat (wenhuach21, closed 3 months ago, 0 comments)
#105  Consolidate dataloader&dataset_split to dataset (wenhuach21, closed 4 months ago, 1 comment)
#104  Set the default scale_dtype to FP16 (wenhuach21, closed 3 months ago, 1 comment)
#103  cohere model support request (MichoChan, closed 3 months ago, 8 comments)
#102  disable quantizing lm-head with tied weights as a workaround (wenhuach21, closed 4 months ago, 0 comments)
#101  disable quantizing lm-head with tied weights as a workaround (wenhuach21, closed 4 months ago, 0 comments)
#100  OPT model quantize_lm_head clarification (Qubitium, closed 3 months ago, 3 comments)
#99   Merge dataloader to dataset (wenhuach21, closed 4 months ago, 1 comment)
#98   update readme of calibration dataset and lm-head usage (wenhuach21, closed 4 months ago, 0 comments)
#97   fix critic bug for gradient_accumulate_steps!=1 and reduce cpu memory of lm-head tuning (WeiweiZhang1, closed 4 months ago, 0 comments)
#96   Add marlin and modify acc.md (pursure-D, closed 3 months ago, 1 comment)
#95   fix README typos (yintong-lu, closed 4 months ago, 0 comments)
#94   README typo fix (yintong-lu, closed 4 months ago, 0 comments)
#93   handle invalid layername in weight_config (WeiweiZhang1, closed 4 months ago, 0 comments)
#92   Lm head quant, align with main branch (WeiweiZhang1, closed 4 months ago, 0 comments)