-
I tried to modify your example code to run this model on a low-VRAM card with a BNB 4-bit or 8-bit quantization config.
When using a bnb 4-bit config like the one below:
```python
qnt_config = BitsAndBytesConfig(load…
```
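For reference, here is a minimal, self-contained sketch of a 4-bit setup (the config values and model ID below are my assumptions for illustration, since the original snippet is truncated):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed 4-bit config for illustration; the reporter's actual values are truncated above.
qnt_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "model-id-goes-here",  # placeholder model ID
    quantization_config=qnt_config,
    device_map="auto",
)
```

The 8-bit counterpart would simply pass `load_in_8bit=True` instead of the 4-bit fields.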
-
### 🐛 Describe the bug
- I'm reporting this issue due to errors related to `capture_pre_autograd_graph` and `torch.compile` in QAT.
- Note: Apologies if there are any misunderstandings.
- Based on th…
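For context, the PT2E QAT flow the report refers to looks roughly like this; the toy model and input shape below are stand-ins, not the reporter's code:

```python
import torch
import torch.nn as nn
from torch._export import capture_pre_autograd_graph  # deprecated in newer PyTorch in favor of torch.export
from torch.ao.quantization.quantize_pt2e import prepare_qat_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

# Toy stand-in for the real model.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
example_inputs = (torch.randn(1, 3, 32, 32),)

# Capture the pre-autograd graph, then insert fake-quant observers for QAT.
exported = capture_pre_autograd_graph(model, example_inputs)
quantizer = XNNPACKQuantizer().set_global(
    get_symmetric_quantization_config(is_qat=True)
)
prepared = prepare_qat_pt2e(exported, quantizer)

# ... QAT training loop over `prepared` would go here ...

converted = convert_pt2e(prepared)
compiled = torch.compile(converted)  # the step where the errors were reported
out = compiled(*example_inputs)
```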
-
I found a [similar closed issue](https://github.com/microsoft/VPTQ/issues/56) related to this topic. Following your reply in that issue, I successfully configured the `vptq-algo` environment based on …
-
Hi,
I’m using YOLOv9 for segmentation tasks and noticed that quantization is currently supported for object detection models. Since the backbone is the same across all YOLOv9 variants, I wanted to …
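As a rough illustration of the idea, here is a sketch of post-training quantization applied to a shared backbone using PyTorch's generic FX workflow; the tiny `Sequential` below stands in for the real YOLOv9 backbone, and nothing here is this repo's actual API:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# Stand-in for the shared backbone (hypothetical; the real backbone would be
# extracted from the segmentation variant).
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1),
    nn.ReLU(),
)

example_inputs = (torch.randn(1, 3, 640, 640),)
qconfig_mapping = get_default_qconfig_mapping("x86")
prepared = prepare_fx(backbone, qconfig_mapping, example_inputs)

# Calibrate on a few representative batches.
with torch.no_grad():
    for _ in range(4):
        prepared(torch.randn(1, 3, 640, 640))

quantized_backbone = convert_fx(prepared)
```

Because the backbone is shared, a quantized backbone produced this way could in principle be reused under either a detection or a segmentation head.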
-
Hi,
First of all, congratulations on the nice job you've done with this package😺
Secondly, I was wondering if you would be willing to accept an extension of linear quantization to support sign…
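Assuming "sign…" refers to signed (symmetric) linear quantization, here is a minimal sketch of what such an extension would compute; the function name and interface are hypothetical:

```python
import torch

def quantize_signed_linear(x: torch.Tensor, bits: int = 8):
    """Symmetric (signed) linear quantization: map x onto signed integers
    in [-(2**(bits-1) - 1), 2**(bits-1) - 1] using a single scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-12) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax, qmax)
    return q.to(torch.int8 if bits <= 8 else torch.int32), scale

x = torch.randn(16)
q, scale = quantize_signed_linear(x)
x_hat = q.float() * scale  # dequantize to verify the round trip
```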
-
I have completed Stable Diffusion quantization for txt2img as the demo shows.
The result is very good.
When I want to transfer SD quantization to the inpainting task, I run into the problem that the quantization r…
-
In the case of very small input numbers around the subnormal range of `torch.float` or `torch.bfloat16`, the scale exponent will take its smallest unbiased value: `-127`. However, you only all…
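For concreteness, a small illustration (my own construction, not the reporter's code): for a subnormal float32 input, the exponent recovered from the value's magnitude falls far below the normal range, which is where a clamped scale exponent would bite:

```python
import torch

# 1e-41 is subnormal for float32 (the smallest normal value is ~1.18e-38).
x = torch.tensor([1e-41], dtype=torch.float32)
mantissa, exponent = torch.frexp(x)  # x == mantissa * 2**exponent
print(exponent.item())  # about -136, far below what a clamp at -127 allows
```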
-
Hello! First of all, great job with this inference engine! Thanks a lot for your work!
Here's my issue: I have run vLLM with both a Mistral instruct model and its AWQ-quantized version. I've quant…
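For reproduction context, a minimal sketch of how such a pair might be run through vLLM's Python API (the model ID below is a hypothetical example, not necessarily the checkpoint used in the report):

```python
from vllm import LLM, SamplingParams

# Hypothetical AWQ checkpoint for illustration.
llm = LLM(
    model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
    quantization="awq",
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain AWQ quantization briefly."], params)
print(outputs[0].outputs[0].text)
```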
-
### System Info
Ubuntu 20.04
NVIDIA A100
nvcr.io/nvidia/tritonserver:24.10-trtllm-python-py3 and 24.07
TensorRT-LLM v0.14.0 and v0.11.0
### Who can help?
@Tracin
### Information
- [x] The offici…
-
Hi, I got an anomaly while running inference on Mistral with AWQ. Below is the GPU usage on a 3090: it consumes 20 GB of GPU memory, even though inference on the base model consumes only 19 GB.
Here is the command: python -m vl…