intel auto-round issues

intel / auto-round

Advanced Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"

https://arxiv.org/abs/2309.05516

Apache License 2.0

132 stars, 18 forks

issues

Fix exllamav2 backend issue

#144 wenhuach21 closed 1 month ago
0
refine the code

#143 wenhuach21 closed 1 month ago
0
fix exporting typo

#142 yintong-lu closed 1 month ago
1
a bug occurs if the whole block is excluded from quantization

#141 wenhuach21 closed 1 month ago
1
fix typos

#140 WeiweiZhang1 closed 1 month ago
0
bump version into v0.2

#139 chensuyue closed 1 month ago
0
question about calib data

#138 mxjmtxrm opened 1 month ago
14
Fix asym kernel issue by following autogptq's pr

#137 wenhuach21 closed 1 month ago
1
Add layer wise mode to save memory

#136 n1ck-guo closed 1 week ago
0
support low cpu memory usage

#135 wenhuach21 closed 5 days ago
1
support trainable equivalent transformation

#134 wenhuach21 opened 1 month ago
0
support simulated MXFP4

#133 wenhuach21 closed 2 days ago
1
support activation quantization

#132 wenhuach21 closed 5 days ago
1
support multimodal models

#131 wenhuach21 opened 1 month ago
0
handling transformers version compatibility in lmhead export, bugfix

#130 WeiweiZhang1 closed 1 month ago
2
fix export issue with torch 2.0

#129 wenhuach21 closed 1 month ago
0
Update falcon recipe

#128 wenhuach21 closed 1 month ago
0
Adjust gpu usage based on free gpu memory space

#127 WeiweiZhang1 opened 1 month ago
0
fix falcon quant issue with disable_trust_remote_code

#126 WeiweiZhang1 closed 1 month ago
0
falcon 7b bug with disable_trust_remote_code

#125 wenhuach21 closed 1 month ago
1
Update phi2 recipe

#124 wenhuach21 closed 1 month ago
0
remove fp32 conversion in exporting to autogptq

#123 wenhuach21 closed 1 month ago
0
Remove unused precommit hook

#122 XuehaoSun closed 1 month ago
0
update gemma recipe

#121 wenhuach21 closed 1 month ago
0
Fix export format issue

#120 wenhuach21 closed 1 month ago
0
Fix doc

#119 wenhuach21 closed 1 month ago
0
support `transformers.Conv1D` packing

#118 Kaihui-intel closed 1 month ago
1
fix lm-head quant issue at disable_quanted_input

#117 wenhuach21 closed 1 month ago
0
Unexpected ppl diff

#116 YihengBrianWu closed 3 weeks ago
4
support autoawq format

#115 yintong-lu opened 1 month ago
1
support real lm-head quantization and mixed precision inference

#114 wenhuach21 closed 1 month ago
0
fix lm-head gradient accumulation bug

#113 wenhuach21 closed 2 months ago
0
update shells

#112 WeiweiZhang1 closed 2 months ago
0
chinese LLMs update and hf links

#111 yintong-lu closed 2 months ago
0
20% speedup by removing new zero tensor

#110 wenhuach21 closed 2 months ago
0
hook AutoHfQuantizer of transformers to support different backends and mixed precision quantization

#109 wenhuach21 closed 1 month ago
1
large discrepancy between GPTQ model and qdq model

#108 wenhuach21 closed 1 month ago
1
Adjust the default eval dtype by selecting from the model_dir config

#107 WeiweiZhang1 closed 2 months ago
0
1.8X speedup with disable_low_gpu_mem_usage, and reduced memory usage by avoiding torch.cat

#106 wenhuach21 closed 2 months ago
0
Consolidate dataloader&dataset_split to dataset

#105 wenhuach21 closed 2 months ago
1
Set the default scale_dtype to FP16

#104 wenhuach21 closed 2 months ago
1
cohere model support request

#103 MichoChan closed 2 months ago
8
disable quantizing lm-head with tied weights as a workaround

#102 wenhuach21 closed 2 months ago
0
disable quantizing lm-head with tied weights as a workaround

#101 wenhuach21 closed 2 months ago
0
OPT model quantize_lm_head clarification

#100 Qubitium closed 1 month ago
3
Merge dataloader to dataset

#99 wenhuach21 closed 2 months ago
1
update readme of calibration dataset and lm-head usage

#98 wenhuach21 closed 2 months ago
0
fix critical bug for gradient_accumulate_steps!=1 and reduce cpu memory of lm-head tuning

#97 WeiweiZhang1 closed 2 months ago
0
Add marlin and modify acc.md

#96 pursure-D closed 1 month ago
1
fix README typos

#95 yintong-lu closed 2 months ago
0
