-
Thank you for your efforts.
I'm curious to know whether there is any code or script for quantizing my own Stable Diffusion models to 2-bit, rather than relying on the pre-existing model available on Goog…
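As a starting point, here is a minimal sketch of uniform 2-bit weight quantization with per-group scales in PyTorch. It is illustrative only (real 2-bit Stable Diffusion pipelines add calibration and finer-grained scaling), and every name in it is a placeholder:

```python
import torch

def quantize_2bit(w: torch.Tensor, group_size: int = 64):
    """Uniform 2-bit quantization with a per-group scale and offset (illustrative).

    Assumes w.numel() is divisible by group_size; each group is mapped
    onto the 4 levels {0, 1, 2, 3}.
    """
    flat = w.reshape(-1, group_size)
    lo = flat.min(dim=1, keepdim=True).values
    hi = flat.max(dim=1, keepdim=True).values
    scale = (hi - lo).clamp(min=1e-8) / 3.0  # 2 bits -> 4 levels
    q = torch.clamp(torch.round((flat - lo) / scale), 0, 3)
    return q.to(torch.uint8), scale, lo

def dequantize_2bit(q, scale, lo, shape):
    return (q.float() * scale + lo).reshape(shape)
```

Applied to every weight tensor (plus bit-packing for storage), this is the core arithmetic behind 2-bit checkpoints; without calibration, expect a substantial quality drop.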
-
Hi @AlexeyAB,
Is there a way to quantize a YOLOv4 weight file to FP16 or INT8 without using TFLite?
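One TFLite-free route is to go through ONNX. The sketch below assumes the Darknet weights have already been exported to a (hypothetical) `yolov4.onnx`, and that the `onnxruntime` and `onnxconverter-common` packages are installed:

```python
import onnx
from onnxconverter_common import float16
from onnxruntime.quantization import quantize_dynamic, QuantType

# INT8: weights are quantized offline; activations are quantized
# dynamically at runtime.
quantize_dynamic("yolov4.onnx", "yolov4_int8.onnx", weight_type=QuantType.QInt8)

# FP16: down-cast the graph's float tensors to half precision.
model_fp16 = float16.convert_float_to_float16(onnx.load("yolov4.onnx"))
onnx.save(model_fp16, "yolov4_fp16.onnx")
```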
-
Can I load QLoRA fine-tuned weights into a Hugging Face model as shown below?
```python
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    …
)
```
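For comparison, a complete version of that pattern might look like the sketch below, which loads the 4-bit base model and then attaches the LoRA adapter with PEFT (the adapter path is hypothetical):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
quantization_config = BitsAndBytesConfig(load_in_4bit=True)

# Load the base model in 4-bit, then attach the QLoRA adapter weights.
base_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "path/to/qlora-adapter")  # hypothetical path
```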
-
### Describe the feature request
Support for quantizing models to 4-bit, 2-bit, and 1-bit and running them, as well as saving and loading these models in ONNX format for smaller file sizes.
The GPU doesn…
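To illustrate the file-size motivation, here is the storage arithmetic behind sub-byte formats as a small NumPy sketch (not an ONNX Runtime API; purely illustrative):

```python
import numpy as np

def pack_int4(q: np.ndarray) -> np.ndarray:
    """Pack signed 4-bit values in [-8, 7] two per byte: half the size of INT8."""
    u = (q.astype(np.int16) + 8).astype(np.uint8)  # shift to unsigned [0, 15]
    if u.size % 2:
        u = np.append(u, np.uint8(8))              # pad with the zero value
    return (u[0::2] << 4) | u[1::2]

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    hi = (packed >> 4).astype(np.int16) - 8
    lo = (packed & 0x0F).astype(np.int16) - 8
    return np.stack([hi, lo], axis=1).reshape(-1)
```

Two-bit and one-bit formats follow the same idea with four and eight values per byte, giving 4x and 8x smaller weight tensors than INT8.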
-
Hello!
I am new to Intel Caffe!
While reading the Intel document "LOWER NUMERICAL PRECISION DEEP LEARNING INFERENCE AND TRAINING", I saw the statement: "**quantizing the weights is done before inference starts. Qua…
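That statement refers to offline weight quantization: scales and integer weights are computed once ahead of time, so inference only performs integer arithmetic. A generic symmetric INT8 sketch of the idea (not Intel Caffe's exact scheme):

```python
import numpy as np

def quantize_weights_int8(w: np.ndarray):
    """Offline symmetric INT8 weight quantization: run once, before inference."""
    scale = max(np.abs(w).max() / 127.0, 1e-12)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale  # at runtime, dequantize as q * scale (or fold into the GEMM)
```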
-
### Feature request
Hi! I’ve been researching LLM quantization recently ([this paper](https://arxiv.org/abs/2405.14852)), and noticed a potentially important issue that arises when using LLMs with 1-…
-
Thank you very much for your work.
Following your code, I modified YOLOv5; with W4A8 quantization there is a loss of nearly 3 points. Have you experimented with YOLOv5?
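For context, W4A8 means 4-bit weights with 8-bit activations. A per-tensor symmetric fake-quantization sketch in PyTorch (purely illustrative of the setting, not this repo's implementation):

```python
import torch

def fake_quant(x: torch.Tensor, n_bits: int) -> torch.Tensor:
    """Round x onto a symmetric n-bit grid, then dequantize (fake quantization)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-12) / qmax
    return torch.clamp(torch.round(x / scale), -qmax, qmax) * scale

w = fake_quant(torch.randn(128, 128), n_bits=4)  # W4: weights on a 4-bit grid
a = fake_quant(torch.randn(32, 128), n_bits=8)   # A8: activations on an 8-bit grid
y = a @ w.t()
```

Per-tensor weight scales like this are the weakest setting; per-channel scales and quantization-aware finetuning usually recover part of such a drop.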
-
I used auto_gptq 0.7.1 and ran this command:
python quant_with_alpaca.py --pretrained_model_dir Qwen1.5-14B-Chat --quantized_model_dir Qwen1.5-14B-Chat_4bit --use_triton --save_and_reload --trust_remote…
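For reference, the same flow through the AutoGPTQ Python API looks roughly like the sketch below (the calibration text and output directory are placeholders; a real run would use a few hundred Alpaca-style samples):

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

pretrained_dir = "Qwen1.5-14B-Chat"
tokenizer = AutoTokenizer.from_pretrained(pretrained_dir, trust_remote_code=True)

# Placeholder calibration data; use a real calibration set in practice.
examples = [tokenizer("Example calibration text.", return_tensors="pt")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128)
model = AutoGPTQForCausalLM.from_pretrained(
    pretrained_dir, quantize_config, trust_remote_code=True
)
model.quantize(examples)
model.save_quantized("Qwen1.5-14B-Chat_4bit")
```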
-
### Your current environment
(venv-vllm-54) (base) root@I1ba088648b009018e4:/hy-tmp# nvidia-smi
Tue Aug 6 10:29:16 2024
(nvidia-smi table truncated)
-
I trained a QKeras model with the kernel and bias quantizers of every QDense layer set to `quantized_bits(8,0)`. After training, I printed out the weights and biases of the QDense layers.
I expect them to h…
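One thing worth noting (an assumption about the cause, based on how QKeras works): `get_weights()` returns the latent float weights, and the quantizers are applied on the fly during the forward pass. A self-contained sketch of inspecting the actually-quantized values, using QKeras's `kernel_quantizer_internal` attribute and its `model_save_quantized_weights` utility:

```python
import numpy as np
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from qkeras import QDense, quantized_bits
from qkeras.utils import model_save_quantized_weights

# Tiny stand-in for the trained model described above.
inp = Input((16,))
out = QDense(4, kernel_quantizer=quantized_bits(8, 0),
             bias_quantizer=quantized_bits(8, 0))(inp)
model = Model(inp, out)

# get_weights() returns latent float weights; apply the layer's quantizer
# to see the values actually used in the forward pass.
for layer in model.layers:
    if isinstance(layer, QDense):
        q_kernel = layer.kernel_quantizer_internal(layer.kernel)
        print(layer.name, np.unique(q_kernel.numpy()))

# Alternatively, write the quantized values back into the model's weights.
model_save_quantized_weights(model)
```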