-
Hi,
Can you share the rotation + GPTQ perplexity (PPL) data? Is it better than SmoothQuant + GPTQ? Many thanks!
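For reference, a number like "rotation+gptq ppl" is usually a sliding-window negative log-likelihood over WikiText-2. Below is a minimal sketch of that measurement; the model id and window size are placeholders, not the reporter's actual setup:

```python
# Minimal WikiText-2 perplexity sketch (placeholder model id; swap in each
# quantized variant to compare). Assumes transformers + datasets are installed.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder checkpoint
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto").eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tok(text, return_tensors="pt").input_ids

seq_len, nlls = 2048, []
for i in range(0, ids.size(1), seq_len):
    chunk = ids[:, i : i + seq_len]
    if chunk.size(1) < 2:  # need at least one label after the causal shift
        continue
    with torch.no_grad():
        loss = model(chunk, labels=chunk).loss  # mean NLL over the shifted chunk
    nlls.append(loss * chunk.size(1))  # approximate total NLL for the chunk
ppl = torch.exp(torch.stack(nlls).sum() / ids.size(1))
print(f"perplexity: {ppl.item():.3f}")
```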
-
Dev machine: Ubuntu 20.04, MNN 3.0.0
Models (Hugging Face): Qwen2.5-0.5B-Instruct and Qwen2.5-0.5B-Instruct-GPTQ-Int8
## Export the ONNX model
$ python mnn/transformers/llm/export/llmexport.py --path pretrained_model/Qwen2.5…
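Before the export step, the two checkpoints named above need to be on disk. A small sketch using `huggingface_hub.snapshot_download`; the `Qwen/...` repo ids and local paths are assumptions, not taken from the report:

```python
# Hypothetical fetch of the two checkpoints referenced above, prior to
# running llmexport.py. Assumes huggingface_hub is installed and the repo
# ids below are the public Qwen repos (an assumption on my part).
from huggingface_hub import snapshot_download

for repo in ("Qwen/Qwen2.5-0.5B-Instruct", "Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8"):
    path = snapshot_download(repo_id=repo,
                             local_dir=f"pretrained_model/{repo.split('/')[-1]}")
    print("downloaded to", path)
```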
-
### System Info
A100
### Who can help?
_No response_
### Information
- [x] The official example scripts
- [ ] My own modified scripts
### Tasks
- [x] An officially supported task in the `exampl…
-
**What**
- We propose supporting the GPTQ algorithm, a state-of-the-art post-training quantization (PTQ) method that has demonstrated robust performance while effectively compressing weights. Notably, G…
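For illustration, here is roughly what a GPTQ flow looks like through the existing transformers/optimum integration; this is a generic sketch of the algorithm being proposed, not this project's API, and the model id and calibration settings are placeholders:

```python
# Illustrative GPTQ post-training quantization via transformers' GPTQConfig
# (backed by optimum + an auto-gptq/gptqmodel backend). All ids/settings
# below are placeholders for the sketch.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"  # placeholder
tok = AutoTokenizer.from_pretrained(model_id)
cfg = GPTQConfig(bits=4, group_size=128, dataset="c4", tokenizer=tok)

# Calibrates layer by layer on the dataset and stores 4-bit weights.
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=cfg, device_map="auto"
)
model.save_pretrained("opt-125m-gptq-4bit")
```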
-
### Your current environment
The output of `python collect_env.py`
N/A; this has happened to multiple users.
### Model Input Dumps
_No response_
### 🐛 Describe the bug
We have been receiving re…
-
### System Info
root@laion-gaudi2-00:/home/sdp# docker run -p 8081:80 -v $volume:/data --runtime=habana -e HUGGING_FACE_HUB_TOKEN=$hf_token -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e HABANA_VISIBLE_DE…
-
Running ``quantize.py`` with ``--mode int4-gptq`` does not seem to work:
- the code tries to import ``lm-evaluation-harness``, which is not included, documented, or used anywhere (see the import sketch below)
- the import in ``eval.py`` is incorrect…
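For reference, a hedged sketch of the harness entry point the eval path presumably wants (`pip install lm-eval`); the names below follow upstream lm-eval >= 0.4, not this repo's ``eval.py``:

```python
# Sketch of the standard lm-evaluation-harness API (upstream package name:
# lm-eval). The checkpoint id is a placeholder.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                 # HuggingFace backend
    model_args="pretrained=facebook/opt-125m",  # placeholder checkpoint
    tasks=["wikitext"],
)
print(results["results"]["wikitext"])
```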
-
### Describe the issue
I am trying to enable AWQ support with the IPEX repo on CPU.
The IPEX 2.5.0 [release](https://github.com/intel/intel-extension-for-pytorch/releases) states that it has the supp…
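For context, recent transformers versions expose an IPEX backend for AWQ via `AwqConfig(version="ipex")`. A minimal CPU loading sketch; the checkpoint id is a placeholder, and the exact minimum package versions are an assumption:

```python
# Hedged sketch of loading an AWQ checkpoint on CPU through transformers'
# IPEX backend. Requires a recent transformers + intel-extension-for-pytorch;
# version requirements are assumptions here.
from transformers import AutoModelForCausalLM, AwqConfig

cfg = AwqConfig(version="ipex")
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/TinyLlama-1.1B-Chat-v1.0-AWQ",  # placeholder AWQ checkpoint
    quantization_config=cfg,
    device_map="cpu",
)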
-
### System Info
Databricks with the following packages:
```
transformers: 4.45.2
huggingface-hub: 0.26.1
accelerate: 1.0.1
optimum: 1.23.1
auto-gptq: 0.7.1
bitsandbytes: 0.44.1
```
### Rep…
-
Hi @Qubitium. Since the CPU path is already in gptqmodel, when do you plan to replace auto_gptq with gptqmodel in HuggingFace/optimum? I think we can start an issue in Optimum to let the maintainer kno…
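For anyone landing here, a minimal sketch of the gptqmodel CPU path in question, following the gptqmodel README; the model id and the `device` kwarg are assumptions on my part:

```python
# Sketch of loading a GPTQ checkpoint with gptqmodel on CPU (pip install gptqmodel).
# Model id is a placeholder; the device kwarg is assumed from the README/CPU notes.
from gptqmodel import GPTQModel

model = GPTQModel.load("Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8", device="cpu")
out = model.generate("Hello, my name is")[0]
print(model.tokenizer.decode(out))
```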