OpenGVLab / OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License · 626 stars · 49 forks
Issues
#37 Quantize LLAMA-2-7b-chat to W4A4 · nmyuchen · opened 7 months ago · 4 comments
#36 Update omniquant.py · brisker · closed 7 months ago · 1 comment
#35 Problems with memory usage and model loading · Forival · closed 6 months ago · 1 comment
#34 About decode speed and GPU memory usage · tro0o · closed 6 months ago · 1 comment
#33 Enforce minimum CLIPMIN value for the scale · radi-cho · closed 8 months ago · 1 comment
#32 Quick Clarification Question on C4 PPL · HanGuo97 · closed 8 months ago · 7 comments
#31 Loss is NaN, stopping training · Forival · closed 8 months ago · 2 comments
#30 Is evaluation on the MMLU dataset supported? · brisker · closed 6 months ago · 13 comments
#29 RuntimeError when quantizing BLOOM using our code · Louym · opened 8 months ago · 0 comments
#28 Fix ChatModule initialization with model_lib_path argument · kaushikthedeveloper · closed 8 months ago · 1 comment
#26 Regarding the Initialization of `smooth_scale` for the Q*K Operation · superdocker · closed 9 months ago · 2 comments
#25 Results Errors · yileijin · closed 9 months ago · 10 comments
#24 Reduce shape for per-group weight calibration · Alvant · closed 9 months ago · 2 comments
#23 Failed to compile AutoGPTQ-bugfix · caseylai · closed 2 months ago · 1 comment
#22 How to add a new model for OmniQuant? · gesanqiu · closed 2 months ago · 5 comments
#21 Cannot compile with mlc-llm · 0x1997 · opened 9 months ago · 2 comments
#20 Model File Formats: .pth, .bin vs. GGUF · sebvannistel · opened 9 months ago · 0 comments
#19 Slow decoding compared to AWQ · abhinavkulkarni · closed 9 months ago · 7 comments
#18 Error running quantized models with MLC-LLM · silvacarl2 · closed 4 months ago · 3 comments
#17 Running Falcon-180B on a single A100 80GB: where/what is main.py? · silvacarl2 · closed 9 months ago · 2 comments
#16 ‼️ Llama2-70b not working · zhiwei-dong · closed 9 months ago · 8 comments
#15 Provided MLC notebook doesn't run on Colab · githubpradeep · closed 4 months ago · 3 comments
#14 Falcon-180B generates garbage on A100 · githubpradeep · closed 10 months ago · 5 comments
#13 Quantize custom model trained on an Alpaca-like dataset · ghost · closed 9 months ago · 3 comments
#12 aug_loss option in OmniQuant scripts · MarsJacobs · closed 10 months ago · 15 comments
#11 How to quantize a LLaMA-structure model and run it with a sampling process? · gesanqiu · closed 10 months ago · 3 comments
#10 Quantization script for large models like the 180B and 70B? · yhyu13 · closed 10 months ago · 3 comments
#9 Minor improvements · jeethu · closed 10 months ago · 1 comment
#8 Why is loss NaN when quantizing OPT-1.3B or LLaMA-7B with the W8A8 config? · MeJerry215 · closed 10 months ago · 14 comments
#7 MLC Android app is missing storage permission · remixer-dec · closed 10 months ago · 2 comments
#6 Lazy loading · Ar57m · opened 10 months ago · 0 comments
#5 How to run the Android app of release v0.0.1 · 946166920 · closed 9 months ago · 1 comment
#4 Trying to run models following docs; incomplete? · lhl · closed 10 months ago · 2 comments
#3 How to run inference in llama.cpp? · lucasjinreal · closed 6 months ago · 9 comments
#2 Fix typo in README.md · eltociear · closed 10 months ago · 0 comments
#1 Inquiry about Activation Quantization Strategy in Inference · lucfisc · closed 10 months ago · 2 comments
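Several of the issues above (the CLIPMIN scale floor in #33, the NaN-loss reports in #8 and #31) concern how the quantization scale is computed and clamped. As background only, here is a minimal, hypothetical sketch of uniform affine fake-quantization with a floored scale in plain Python; `CLIPMIN` and `fake_quantize` are illustrative names, not OmniQuant's actual API:

```python
CLIPMIN = 1e-5  # hypothetical scale floor, echoing the clamp discussed in issue #33

def fake_quantize(w, n_bits=4):
    """Uniform affine fake-quantization of a list of floats (W4 when n_bits=4)."""
    qmax = 2 ** n_bits - 1
    lo, hi = min(w), max(w)
    # Floor the scale so a constant tensor cannot yield scale == 0
    # (which would divide by zero and propagate NaNs downstream).
    scale = max((hi - lo) / qmax, CLIPMIN)
    zero_point = round(-lo / scale)

    def quantize(x):
        return min(max(round(x / scale) + zero_point, 0), qmax)

    # Quantize, then immediately dequantize ("fake" quantization).
    return [(quantize(x) - zero_point) * scale for x in w]
```

Clamping the scale to a small positive floor keeps a constant (or near-constant) tensor from producing a zero scale, which is plausibly the division-by-zero/NaN failure mode the NaN-loss issues describe.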