-
I was quantizing weights using:
```
graph.openStore(
  full_f16_path, flags: .truncateWhenClose
) { store in
  let keys = store.keys
  graph.openStore(
    f8_path,
    flags: .truncateWhen…
```
-
Hi,
This error occurred when I tried to quantize my ONNX model.
```
Traceback (most recent call last):
  File "quant.py", line 4, in <module>
    quantize(
  File "/usr/local/lib/python3.8/dist-packages…
```
-
File "/home/qx/.local/lib/python3.10/site-packages/awq/models/base.py", line 231, in quantize
self.quantizer.quantize()
File "/home/qx/.local/lib/python3.10/site-packages/awq/quantize/quantize…
-
## 🐛 Bug
![image](https://github.com/user-attachments/assets/5253f5fc-8cbb-4b9f-8e33-674865f09164)
## To Reproduce
Steps to reproduce the behavior:
1. Download Command-R-Plus (either varia…
-
### What happened?
I was running GGUF quantization on https://huggingface.co/pints-ai/1.5-Pints-16K-v0.1/tree/main
It's a bog-standard Llama model. It should have quantized, but it's failing.
### Name and Version
What…
-
**When I quantized the Qwen2-7B model (not fine-tuned) using the quantization code below, I got the following error:**
**Quantization code:**
```python
from awq import AutoAWQForCausalLM
from tr…
```
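The snippet above is cut off after the imports. For context, a complete AutoAWQ flow in the style of the project's examples looks roughly like this; the model path, output path, and config values are placeholders rather than the poster's actual settings.
```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen2-7B"  # placeholder: model to quantize
quant_path = "Qwen2-7B-awq"   # placeholder: output directory
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Calibrate and quantize to 4-bit, then save weights and tokenizer.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```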
-
### Idea
Use int4 as the compression technique to fit larger models onto Navi machines, or possibly MI-series machines. Weights would be compressed using an encoding scheme that packs two 4-bit n…
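As a rough illustration of the packing scheme, here is a minimal NumPy sketch; the function names are made up for this example, and it assumes unsigned 4-bit codes with the low nibble stored first.
```python
import numpy as np

def pack_int4(values: np.ndarray) -> np.ndarray:
    """Pack pairs of unsigned 4-bit codes (0..15) into single bytes."""
    assert values.size % 2 == 0, "need an even number of nibbles"
    v = values.astype(np.uint8) & 0x0F
    return v[0::2] | (v[1::2] << 4)  # low nibble first, high nibble second

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Inverse of pack_int4: recover both 4-bit codes from each byte."""
    out = np.empty(packed.size * 2, dtype=np.uint8)
    out[0::2] = packed & 0x0F
    out[1::2] = packed >> 4
    return out
```
Packing this way halves weight storage relative to int8; the kernel pays for it with a shift-and-mask decode on each load.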
-
Hi @PanQiWei
I'd be most grateful if you could give me a bit of help.
I have been trying to quantize BLOOMZ 175B but can't currently get it done. BLOOMZ has 70 layers and totals 360GB.…
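For reference, the usual AutoGPTQ flow (per its README) is sketched below; the paths and the single calibration example are placeholders, and it says nothing about the offloading setup a 360GB model would actually need.
```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_path = "bigscience/bloomz"  # placeholder source model
quant_path = "bloomz-4bit"        # placeholder output directory

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)
# A real run needs a proper calibration set; one toy example shown here.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(model_path, quantize_config)
model.quantize(examples)
model.save_quantized(quant_path)
```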
-
Hello!
I did some research (using llama.cpp) and found that quantizing the input and embed tensors to f16 and the other tensors to q5_k or q6_k gives excellent results, almost indistinguisha…
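For anyone reproducing this, recent llama.cpp builds let you override per-tensor types at quantization time. Below is a sketch of driving the quantize tool from Python; the flag names match recent builds but should be checked against `./llama-quantize --help` for your checkout.
```python
# Sketch: keep token-embedding and output tensors at f16, quantize the rest to q6_k.
import subprocess

subprocess.run(
    [
        "./llama-quantize",
        "--token-embedding-type", "f16",  # keep embeddings at f16
        "--output-tensor-type", "f16",    # keep the output tensor at f16
        "model-f16.gguf",                 # placeholder input GGUF
        "model-q6_k.gguf",                # placeholder output GGUF
        "q6_k",                           # type for the remaining tensors
    ],
    check=True,
)
```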
-
Package versions:
AutoAWQ: 0.2.5+cu118
torch: 2.3.1+cu118
transformers: 4.43.3
I was trying to quantize my fine-tuned Llama 3.1 405B (bf16) model to 4-bit using AutoAWQ, following the instructions in t…