-
FP8 or AWQ quant
-
https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev/blob/main/flux1-fill-dev.safetensors
-
Hi all,
We recently developed a fully open-source quantization method called VPTQ (Vector Post-Training Quantization) [https://github.com/microsoft/VPTQ](https://github.com/microsoft/VPTQ) which en…
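A minimal inference sketch, assuming the `vptq` package is installed and a pre-quantized VPTQ-community checkpoint is used; the model id below is illustrative, and the exact loading API should be verified against the repo README:

```python
# Minimal sketch: load a VPTQ-quantized checkpoint for inference.
# Assumes `pip install vptq` and a pre-quantized VPTQ-community checkpoint;
# the model id is illustrative, not an exact name.
import transformers
import vptq

model_id = "VPTQ-community/Meta-Llama-3.1-8B-Instruct-v8-k65536-0-woft"  # illustrative id

tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
model = vptq.AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer(
    "Explain vector post-training quantization in one sentence.",
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```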
-
**Describe the bug**
I'm compressing a qwen2.5_7b model using `examples/quantization_2of4_sparse_w4a16/llama7b_sparse_w4a16.py`, but I fail to load the stage_sparsity model. The error is shown belo…
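A minimal reload sketch, assuming the stage output is a standard `compressed-tensors` checkpoint and a recent `transformers` is installed; the path is a placeholder:

```python
# Minimal reload sketch for a checkpoint produced by an llm-compressor stage.
# Assumes the stage output directory is a standard Hugging Face checkpoint
# with compressed-tensors metadata; the path is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "./output/stage_sparsity"  # placeholder path to the stage checkpoint

tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(
    ckpt,
    device_map="auto",
    torch_dtype="auto",  # keep the dtype recorded in the checkpoint config
)
```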
-
Hello, I'm trying to use AIMET_TORCH to quantize an LLM, e.g. Llama v2. Where can I find a Jupyter notebook example that shows quantization simulation for an LLM?
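A minimal quantization-simulation sketch with `aimet_torch`, using a toy module as a stand-in for the LLM to keep it self-contained; the quant scheme, bitwidths, and calibration loop below are assumptions, not settings from an official notebook:

```python
# Minimal AIMET quantization-simulation sketch.
# A tiny torch module stands in for the LLM here just to keep the example
# self-contained; the QuantizationSimModel flow is the same idea.
import torch
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

model = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 64),
).eval()

dummy_input = torch.randn(1, 64)

# Build the simulation model: inserts fake-quant ops around weights/activations.
sim = QuantizationSimModel(
    model,
    dummy_input=dummy_input,
    quant_scheme=QuantScheme.post_training_tf_enhanced,  # assumption
    default_param_bw=8,
    default_output_bw=8,
)

# Calibration pass so AIMET can compute quantization encodings (ranges/scales).
def calibrate(sim_model, _):
    with torch.no_grad():
        for _ in range(8):
            sim_model(torch.randn(1, 64))

sim.compute_encodings(forward_pass_callback=calibrate, forward_pass_callback_args=None)

# sim.model can now be evaluated to measure accuracy under simulated quantization.
out = sim.model(dummy_input)
```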
-
### 🚀 The feature, motivation and pitch
In the past, we padded int4 quantization when the group size did not evenly divide the weight dimension to make things work. Since we have decided to remove the padding, int4 quantization is n…
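A small plain-PyTorch sketch of group-wise int4 quantization that shows where the divisibility constraint comes from once padding is gone; the grouping and parameter names are illustrative, not the actual int4 kernels:

```python
# Group-wise int4 quantization sketch in plain PyTorch, to illustrate why the
# in-features dimension must be a multiple of the group size once the padding
# path is removed. Illustrative only; not the real int4 kernels.
import torch

def quantize_int4_groupwise(w: torch.Tensor, group_size: int = 128):
    out_features, in_features = w.shape
    if in_features % group_size != 0:
        # Previously this case was handled by padding in_features up to the
        # next multiple of group_size; without padding it has to be an error.
        raise ValueError(
            f"in_features ({in_features}) must be divisible by group_size ({group_size})"
        )

    groups = w.reshape(out_features, in_features // group_size, group_size)

    # Asymmetric per-group quantization to the 4-bit unsigned range [0, 15].
    w_min = groups.amin(dim=-1, keepdim=True)
    w_max = groups.amax(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-8) / 15.0
    zero_point = (-w_min / scale).round()

    q = ((groups / scale) + zero_point).round().clamp(0, 15).to(torch.uint8)
    return q, scale, zero_point

# Works: 4096 is a multiple of 128.
q, s, z = quantize_int4_groupwise(torch.randn(32, 4096), group_size=128)
# Raises: 4000 is not a multiple of 128, and there is no padding fallback.
# quantize_int4_groupwise(torch.randn(32, 4000), group_size=128)
```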
-
I fine-tuned a Whisper large-v3 model via the [speechbrain](https://github.com/speechbrain/speechbrain) framework. I want to convert it to a `faster-whisper` model and run inference on it via `faster-whispe…
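A minimal conversion sketch, assuming the speechbrain fine-tune can first be saved as a Hugging Face Transformers-format Whisper checkpoint; paths, the audio file, and the quantization choice are placeholders:

```python
# Conversion sketch: Hugging Face Whisper checkpoint -> CTranslate2 format
# -> faster-whisper inference. Assumes the speechbrain fine-tune has already
# been exported as a Transformers-style Whisper checkpoint; paths are placeholders.
from ctranslate2.converters import TransformersConverter
from faster_whisper import WhisperModel

hf_dir = "./whisper-large-v3-finetuned-hf"    # placeholder: HF-format checkpoint
ct2_dir = "./whisper-large-v3-finetuned-ct2"  # placeholder: converter output

# Convert to CTranslate2 format (same as the ct2-transformers-converter CLI).
TransformersConverter(
    hf_dir,
    copy_files=["tokenizer.json", "preprocessor_config.json"],
).convert(ct2_dir, quantization="float16")

# Run inference with faster-whisper on the converted model.
model = WhisperModel(ct2_dir, device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.wav", beam_size=5)
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```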
-
Quark is a comprehensive cross-platform toolkit designed to simplify and enhance the quantization of deep learning models. Supporting both PyTorch and ONNX models, Quark empowers developers to optimiz…
-
### 💡 Your Question
I have followed exactly the same steps for model training followed by PTQ and QAT mentioned in the official super-gradients notebook:
https://github.com/Deci-AI/super-gradients/blob…
-
Hi @shewu-quic ~
Could you tell me after which method call the model's actual physical size is reduced when we perform 8a8w quantization on the Llama-3.2-1B & 3B models using the QNN backend?
Thank …