-
### System Info
```Shell
- `Accelerate` version: 0.31.0
- Platform: Linux-5.15.0-79-generic-x86_64-with-glibc2.35
- `accelerate` bash location: /home/lstein/test_ckpts/SD3/.venv/bin/accelerate
…
```
-
I see that there is full int8 support (both weights and activations) for BERT, but it's not clear to me what is supported for GPT models ([here](https://github.com/NVIDIA/FasterTransformer/blob/main/exampl…
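For context on the terminology: "full int8" here means both the weights and the activations are quantized, with matmuls accumulating in int32. A toy per-tensor symmetric sketch in NumPy (purely illustrative, not FasterTransformer's actual kernels):

```python
import numpy as np

def int8_quantize(x):
    """Per-tensor symmetric int8 quantization: returns (q, scale)."""
    amax = np.abs(x).max()
    scale = amax / 127.0 if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a, w):
    """Quantize activations and weights, accumulate in int32, dequantize."""
    qa, sa = int8_quantize(a)
    qw, sw = int8_quantize(w)
    acc = qa.astype(np.int32) @ qw.astype(np.int32)
    return acc.astype(np.float32) * (sa * sw)

np.random.seed(0)
a = np.random.randn(4, 16).astype(np.float32)
w = np.random.randn(16, 8).astype(np.float32)
err = np.abs(int8_matmul(a, w) - a @ w).max()
```

The int32 accumulator is the important part: it is what keeps the sum of many int8 products from overflowing.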
-
Hi, this is the INC team from Intel. Thank you for developing this amazing project.
### Motivation
Our team has developed a new weight-only quantization algorithm called Auto-Round. It has achie…
-
As many lovers of local LLMs know, their raw (fp16) weights are hard to set up on a consumer PC. Luckily, there are techniques that make it possible to quantize the weights to 4 bits or even lower, making the…
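For the curious, the core of weight-only 4-bit quantization is quite small. A minimal group-wise sketch in NumPy (the group size and asymmetric min/max scheme are chosen for illustration; real methods such as GPTQ or AWQ are considerably more involved):

```python
import numpy as np

def quantize_4bit(w, group_size=32):
    """Asymmetric 4-bit quantization, one (scale, min) pair per group."""
    flat = w.reshape(-1, group_size)
    wmin = flat.min(axis=1, keepdims=True)
    scale = (flat.max(axis=1, keepdims=True) - wmin) / 15.0  # 16 levels
    scale[scale == 0] = 1.0  # avoid dividing by zero on constant groups
    q = np.clip(np.round((flat - wmin) / scale), 0, 15).astype(np.uint8)
    return q, scale, wmin

def dequantize_4bit(q, scale, wmin, shape):
    """Map the 4-bit codes back to approximate fp32 weights."""
    return (q.astype(np.float32) * scale + wmin).reshape(shape)

np.random.seed(0)
w = np.random.randn(8, 32).astype(np.float32)
q, scale, wmin = quantize_4bit(w)
w_hat = dequantize_4bit(q, scale, wmin, w.shape)
```

Per weight, the round-trip error is bounded by half the group's scale, which is why small groups (at the cost of more per-group metadata) give better accuracy.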
-
Hi,
Without using transformers / accelerate and the like, what are the constraints on a model for it to be tensor-parallelizable?
Does it need to be an nn.Sequential? Do the input dimensions need to be alwa…
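Not an answer, but to make the question concrete: the basic requirement for (column-wise) tensor parallelism is just that a layer's output can be computed shard-by-shard and reassembled. A hand-rolled NumPy toy with two pretend "devices" (no framework assumed):

```python
import numpy as np

np.random.seed(0)
x = np.random.randn(2, 8)   # activations: batch of 2
W = np.random.randn(8, 6)   # full weight matrix of one linear layer

# Column parallelism: split the output dimension across 2 devices.
W_shards = np.split(W, 2, axis=1)
# Each device multiplies against its own shard independently...
partials = [x @ shard for shard in W_shards]
# ...and the results are concatenated (an all-gather in practice).
y_parallel = np.concatenate(partials, axis=1)
y_full = x @ W
```

So the layer does not need to be an nn.Sequential; what matters is that each shard's computation is independent and that the combine step (concat here, a sum for row parallelism) is cheap.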
-
### System Info
- `transformers` version: 4.44.0
- Platform: Linux-6.5.0-44-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.5
- Safetensors version: 0.4.…
-
### System Info
```
transformers==4.42.3
torch==2.3.0
numpy==1.26.4
gguf==0.6.0
```
### Who can help?
@SunMarc
### Information
- [ ] The official example scripts
- [X] My own mod…
-
Hi.
I'd like to quantize [SSD](https://github.com/YutaroOgawa/pytorch_advanced/blob/master/2_objectdetection/utils/ssd_model.py) by using [xilinx/vitis-ai-pytorch-cpu:ubuntu2004-3.0.0.106](https://hu…
-
# Design Strategy for Quantization State Persistence
## Introduction
This document outlines the design for storing the quantization state. It includes essential information on the thresholds require…
-
Hey,
I followed the readme, did the install (into a venv) and a make, but upon starting the webui, execution stops and drops back to a Python interpreter prompt without reaching Gradio or loading …