-
### Checklist
- I have searched related issues but cannot get the expected help.
- I have read the related documentation but still don't know what to do.
### Describe the question you meet
[here]
###…
-
In the process of YOLOv8 INT8 quantization, I find that some INT8 layers are slower than FP16, and the Reformat operations are very time-consuming. For the best precision, we can do sensitive-layer analysis to get …
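For reference, a minimal sketch of the mixed-precision fallback that usually follows such an analysis, assuming a TensorRT Python build. The ONNX path and the layer names in `SENSITIVE` are placeholders; the sensitivity sweep itself (rebuilding with one layer kept at FP16 at a time and measuring accuracy) is left out.

```python
# Sketch: force suspected-sensitive layers back to FP16 in an otherwise
# INT8 TensorRT build. Paths and layer names are placeholders.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("yolov8.onnx", "rb") as f:           # hypothetical model path
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.set_flag(trt.BuilderFlag.FP16)
# Make TensorRT respect the per-layer precisions we set below.
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
# config.int8_calibrator = my_calibrator      # required unless ranges are set

SENSITIVE = {"layer_a", "layer_b"}             # names found by the sweep
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.name in SENSITIVE:
        layer.precision = trt.float16
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.float16)

engine = builder.build_serialized_network(network, config)
```

Note that keeping FP16 islands inside an INT8 graph is exactly what introduces Reformat nodes, so it is worth grouping adjacent sensitive layers rather than scattering single-layer fallbacks.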
-
Hi, I'm working on applying QAT on a model. I made the necessary modifications. However, when I looked into one of the saved checkpoint `.pth` files, I observed that none of the weights were actually …
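In case it helps: with PyTorch eager-mode QAT this is the expected behavior, since training uses fake quantization, so the checkpoint keeps FP32 weights plus observer/fake-quant state, and integer weights only appear after conversion. A minimal sketch assuming `torch.ao.quantization` (the toy model is hypothetical):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig, prepare_qat, convert

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
    def forward(self, x):
        return self.relu(self.conv(x))

model = Toy().train()
model.qconfig = get_default_qat_qconfig("fbgemm")
prepare_qat(model, inplace=True)

# ... QAT training loop would go here ...

# The checkpoint still holds FP32 weights + fake-quant/observer state:
torch.save(model.state_dict(), "qat_ckpt.pth")
print(model.conv.weight.dtype)      # torch.float32

# Integer weights only exist after conversion:
quantized = convert(model.eval())
print(type(quantized.conv))         # quantized Conv2d with int8 weights
```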
-
## Description
I generated a calibration cache for a Vision Transformer ONNX model using the EntropyCalibration2 method. When trying to generate an engine file from the cache file at INT8 precision using trte…
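A sketch of one way to reuse such a cache when building the engine from Python, assuming the standard `IInt8EntropyCalibrator2` pattern; the ONNX and cache paths are placeholders. With trtexec, the rough equivalent is `--onnx=... --int8 --calib=<cache file>`.

```python
# Sketch: build an INT8 engine from an existing EntropyCalibrator2 cache.
# With a valid cache, TensorRT never asks get_batch for real data.
import os
import tensorrt as trt

class CacheOnlyCalibrator(trt.IInt8EntropyCalibrator2):
    def __init__(self, cache_file):
        super().__init__()
        self.cache_file = cache_file

    def get_batch_size(self):
        return 1

    def get_batch(self, names):
        return None          # no live calibration; cache must be complete

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("vit.onnx", "rb") as f:              # hypothetical model path
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = CacheOnlyCalibrator("vit_calibration.cache")
engine = builder.build_serialized_network(network, config)
```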
-
I am working on applying Quantization-Aware Training (QAT) with various parameters to optimize my model. During this process, I ran into an issue when attempting to use certain configuration parameter…
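For context, in PyTorch such QAT configuration parameters usually live in a `QConfig`; a sketch of a custom one follows, where the specific observer and qscheme choices are illustrative assumptions, not the reporter's actual settings.

```python
# Sketch: a custom QConfig bundling typical QAT knobs (observers,
# bit ranges, per-tensor vs. per-channel, affine vs. symmetric).
import torch
from torch.ao.quantization import QConfig, FakeQuantize
from torch.ao.quantization.observer import (
    MovingAverageMinMaxObserver,
    MovingAveragePerChannelMinMaxObserver,
)

qat_qconfig = QConfig(
    activation=FakeQuantize.with_args(
        observer=MovingAverageMinMaxObserver,
        quant_min=0, quant_max=255,
        dtype=torch.quint8, qscheme=torch.per_tensor_affine,
    ),
    weight=FakeQuantize.with_args(
        observer=MovingAveragePerChannelMinMaxObserver,
        quant_min=-128, quant_max=127,
        dtype=torch.qint8, qscheme=torch.per_channel_symmetric,
    ),
)

# model.qconfig = qat_qconfig   # assigned before prepare_qat(model)
```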
-
### 1. Questions
As we know, SD v1.5 has about 1 billion params, and its peak GPU memory is about 4 GB at FP32 precision.
So, the memory at INT4 precision (sd_w4a8_chpt.pth) will be about 4 GB / 8 = 500…
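A quick back-of-the-envelope check of that arithmetic, counting weights only and ignoring activations and framework overhead:

```python
# Weight memory per precision for ~1B parameters (SD v1.5), weights only.
params = 1e9
for name, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("int4", 4)]:
    gib = params * bits / 8 / 1024**3
    print(f"{name}: {gib:.2f} GiB")
# fp32 ≈ 3.73 GiB, int4 ≈ 0.47 GiB — i.e. roughly 4 GB / 8 ≈ 500 MB
```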
-
Firstly, thanks to all of you for this great project!
Currently, the model does not seem to support INT8 quantization. Are there any plans for it?
-
Thank you for sharing these valuable experiments. I am now evaluating the accuracy of SnapKV/Pyramid and your methods. Basically, Pyramid is a little better than SnapKV, so I think that Ada-Pyramid-G…
-
Does MiniCPM-V 2.6 currently support INT8/FP8 quantization?
Thanks~