OpenGVLab / OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License · 626 stars · 49 forks
Issues
#37 Quantize LLAMA-2-7b-chat to W4A4 · nmyuchen · opened 7 months ago · 4 comments
#36 Update omniquant.py · brisker · closed 7 months ago · 1 comment
#35 Problems with memory usage and model loading · Forival · closed 6 months ago · 1 comment
#34 About decode speed and GPU memory usage · tro0o · closed 6 months ago · 1 comment
#33 Enforce minimum CLIPMIN value for the scale · radi-cho · closed 8 months ago · 1 comment
#32 Quick Clarification Question on C4 PPL · HanGuo97 · closed 8 months ago · 7 comments
#31 Loss is NaN, stopping training · Forival · closed 8 months ago · 2 comments
#30 Is evaluation on the MMLU dataset supported? · brisker · closed 6 months ago · 13 comments
#29 RuntimeError when quantizing BLOOM using our code · Louym · opened 8 months ago · 0 comments
#28 Fix ChatModule initialization with model_lib_path argument · kaushikthedeveloper · closed 8 months ago · 1 comment
#26 Regarding the Initialization of `smooth_scale` for the Q*K Operation · superdocker · closed 9 months ago · 2 comments
#25 Results Errors · yileijin · closed 9 months ago · 10 comments
#24 Reduce shape for per-group weight calibration · Alvant · closed 9 months ago · 2 comments
#23 Failed to compile AutoGPTQ-bugfix · caseylai · closed 2 months ago · 1 comment
#22 How to add a new model for OmniQuant? · gesanqiu · closed 2 months ago · 5 comments
#21 Cannot compile with mlc-llm · 0x1997 · opened 9 months ago · 2 comments
#20 Model File Formats: .pth, .bin vs. GGUF · sebvannistel · opened 9 months ago · 0 comments
#19 Slow decoding compared to AWQ · abhinavkulkarni · closed 9 months ago · 7 comments
#18 Error running quantized models with MLC-LLM · silvacarl2 · closed 4 months ago · 3 comments
#17 Running Falcon-180B on a single A100 80GB: where/what is main.py? · silvacarl2 · closed 9 months ago · 2 comments
#16 ‼️ Llama2-70b not working · zhiwei-dong · closed 9 months ago · 8 comments
#15 Provided MLC notebook doesn't run on Colab · githubpradeep · closed 4 months ago · 3 comments
#14 Falcon-180B generates garbage on A100 · githubpradeep · closed 10 months ago · 5 comments
#13 Quantize custom model trained on an Alpaca-like dataset · ghost · closed 9 months ago · 3 comments
#12 aug_loss option in OmniQuant scripts · MarsJacobs · closed 10 months ago · 15 comments
#11 How to quantize a LLaMA-structure model and run it with a sampling process? · gesanqiu · closed 10 months ago · 3 comments
#10 Quantization script for large models like the 180B and 70B? · yhyu13 · closed 10 months ago · 3 comments
#9 Minor improvements · jeethu · closed 10 months ago · 1 comment
#8 Why is loss NaN when quantizing OPT-1.3B or LLaMA-7B with the W8A8 config? · MeJerry215 · closed 10 months ago · 14 comments
#7 MLC Android app is missing storage permission · remixer-dec · closed 10 months ago · 2 comments
#6 Lazy loading · Ar57m · opened 10 months ago · 0 comments
#5 How to run the Android app of release v0.0.1 · 946166920 · closed 9 months ago · 1 comment
#4 Trying to run models following docs; incomplete? · lhl · closed 10 months ago · 2 comments
#3 How to run inference in llama.cpp? · lucasjinreal · closed 6 months ago · 9 comments
#2 Fix typo in README.md · eltociear · closed 10 months ago · 0 comments
#1 Inquiry about Activation Quantization Strategy in Inference · lucfisc · closed 10 months ago · 2 comments
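Several of the issues above (the CLIPMIN scale floor in #33, the NaN-loss reports in #8 and #31) concern how the quantization scale is computed and clamped. As background only, here is a minimal, hypothetical sketch of uniform affine fake-quantization with a floored scale in plain Python; `CLIPMIN` and `fake_quantize` are illustrative names, not OmniQuant's actual API:

```python
CLIPMIN = 1e-5  # hypothetical scale floor, echoing the clamp discussed in issue #33

def fake_quantize(w, n_bits=4):
    """Uniform affine fake-quantization of a list of floats (W4 when n_bits=4)."""
    qmax = 2 ** n_bits - 1
    lo, hi = min(w), max(w)
    # Floor the scale so a constant tensor cannot yield scale == 0
    # (which would divide by zero and propagate NaNs downstream).
    scale = max((hi - lo) / qmax, CLIPMIN)
    zero_point = round(-lo / scale)

    def quantize(x):
        return min(max(round(x / scale) + zero_point, 0), qmax)

    # Quantize, then immediately dequantize ("fake" quantization).
    return [(quantize(x) - zero_point) * scale for x in w]
```

Clamping the scale to a small positive floor keeps a constant (or near-constant) tensor from producing a zero scale, which is plausibly the division-by-zero/NaN failure mode the NaN-loss issues describe.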