-
Hi,
Can you share the rotation + GPTQ perplexity (PPL) data? Is it better than SmoothQuant + GPTQ? Many thanks!
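For reference, a number like "rotation+gptq ppl" is usually a sliding-window negative log-likelihood over WikiText-2. Below is a minimal sketch of that measurement; the model id and window size are placeholders, not the reporter's actual setup:

```python
# Minimal WikiText-2 perplexity sketch (placeholder model id; swap in each
# quantized variant to compare). Assumes transformers + datasets are installed.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder checkpoint
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto").eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tok(text, return_tensors="pt").input_ids

seq_len, nlls = 2048, []
for i in range(0, ids.size(1), seq_len):
    chunk = ids[:, i : i + seq_len]
    if chunk.size(1) < 2:  # need at least one label after the causal shift
        continue
    with torch.no_grad():
        loss = model(chunk, labels=chunk).loss  # mean NLL over the shifted chunk
    nlls.append(loss * chunk.size(1))  # approximate total NLL for the chunk
ppl = torch.exp(torch.stack(nlls).sum() / ids.size(1))
print(f"perplexity: {ppl.item():.3f}")
```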
-
Dev machine: Ubuntu 20.04, MNN 3.0.0
Models (Hugging Face): Qwen2.5-0.5B-Instruct and Qwen2.5-0.5B-Instruct-GPTQ-Int8
## Export the ONNX model
$ python mnn/transformers/llm/export/llmexport.py --path pretrained_model/Qwen2.5…
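Before the export step, the two checkpoints named above need to be on disk. A small sketch using `huggingface_hub.snapshot_download`; the `Qwen/...` repo ids and local paths are assumptions, not taken from the report:

```python
# Hypothetical fetch of the two checkpoints referenced above, prior to
# running llmexport.py. Assumes huggingface_hub is installed and the repo
# ids below are the public Qwen repos (an assumption on my part).
from huggingface_hub import snapshot_download

for repo in ("Qwen/Qwen2.5-0.5B-Instruct", "Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8"):
    path = snapshot_download(repo_id=repo,
                             local_dir=f"pretrained_model/{repo.split('/')[-1]}")
    print("downloaded to", path)
```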
-
### System Info
A100
### Who can help?
_No response_
### Information
- [x] The official example scripts
- [ ] My own modified scripts
### Tasks
- [x] An officially supported task in the `exampl…
-
**What**
- We propose supporting the GPTQ algorithm, a state-of-the-art post-training quantization (PTQ) method that has demonstrated robust performance while effectively compressing weights. Notably, G…
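For illustration, here is roughly what a GPTQ flow looks like through the existing transformers/optimum integration; this is a generic sketch of the algorithm being proposed, not this project's API, and the model id and calibration settings are placeholders:

```python
# Illustrative GPTQ post-training quantization via transformers' GPTQConfig
# (backed by optimum + an auto-gptq/gptqmodel backend). All ids/settings
# below are placeholders for the sketch.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"  # placeholder
tok = AutoTokenizer.from_pretrained(model_id)
cfg = GPTQConfig(bits=4, group_size=128, dataset="c4", tokenizer=tok)

# Calibrates layer by layer on the dataset and stores 4-bit weights.
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=cfg, device_map="auto"
)
model.save_pretrained("opt-125m-gptq-4bit")
```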
-
### Your current environment
The output of `python collect_env.py`
N/A; this has happened to multiple users.
### Model Input Dumps
_No response_
### 🐛 Describe the bug
We have been receiving re…
-
### System Info
root@laion-gaudi2-00:/home/sdp# docker run -p 8081:80 -v $volume:/data --runtime=habana -e HUGGING_FACE_HUB_TOKEN=$hf_token -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e HABANA_VISIBLE_DE…
-
Running ``quantize.py`` with ``--mode int4-gptq`` does not seem to work:
- the code tries to import ``lm-evaluation-harness``, which is not included, documented, or used anywhere (see the import sketch below)
- the import in ``eval.py`` is incorrect…
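For reference, a hedged sketch of the harness entry point the eval path presumably wants (`pip install lm-eval`); the names below follow upstream lm-eval >= 0.4, not this repo's ``eval.py``:

```python
# Sketch of the standard lm-evaluation-harness API (upstream package name:
# lm-eval). The checkpoint id is a placeholder.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                 # HuggingFace backend
    model_args="pretrained=facebook/opt-125m",  # placeholder checkpoint
    tasks=["wikitext"],
)
print(results["results"]["wikitext"])
```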
-
### Describe the issue
I am trying to enable AWQ support with the IPEX repo on CPU.
The IPEX 2.5.0 [release](https://github.com/intel/intel-extension-for-pytorch/releases) states that it has the supp…
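For context, recent transformers versions expose an IPEX backend for AWQ via `AwqConfig(version="ipex")`. A minimal CPU loading sketch; the checkpoint id is a placeholder, and the exact minimum package versions are an assumption:

```python
# Hedged sketch of loading an AWQ checkpoint on CPU through transformers'
# IPEX backend. Requires a recent transformers + intel-extension-for-pytorch;
# version requirements are assumptions here.
from transformers import AutoModelForCausalLM, AwqConfig

cfg = AwqConfig(version="ipex")
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/TinyLlama-1.1B-Chat-v1.0-AWQ",  # placeholder AWQ checkpoint
    quantization_config=cfg,
    device_map="cpu",
)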
-
### System Info
Databricks with the following packages:
```
transformers: 4.45.2
huggingface-hub: 0.26.1
accelerate: 1.0.1
optimum: 1.23.1
auto-gptq: 0.7.1
bitsandbytes: 0.44.1
```
### Rep…
-
Hi @Qubitium. Since the CPU path is already in gptqmodel, when do you plan to replace auto_gptq with gptqmodel in HuggingFace/optimum? I think we can start an issue in Optimum to let the maintainer kno…
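For anyone landing here, a minimal sketch of the gptqmodel CPU path in question, following the gptqmodel README; the model id and the `device` kwarg are assumptions on my part:

```python
# Sketch of loading a GPTQ checkpoint with gptqmodel on CPU (pip install gptqmodel).
# Model id is a placeholder; the device kwarg is assumed from the README/CPU notes.
from gptqmodel import GPTQModel

model = GPTQModel.load("Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8", device="cpu")
out = model.generate("Hello, my name is")[0]
print(model.tokenizer.decode(out))
```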