-
Thanks for sharing this great research.
I tried to reproduce the results in the paper and ran into the following problem.
I tested Llama2-7b 4-16-16 (RTN) with `10_optimize_rotation.sh` and got a wikitext-2 ppl of 5.5, which…
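For anyone comparing against the RTN baseline: RTN simply rounds each weight to the nearest grid point, with no calibration. A minimal sketch of symmetric 4-bit round-to-nearest; the per-tensor scale choice here is my assumption for illustration, not necessarily what this repo does:

```python
def rtn_quantize(weights, n_bits=4):
    """Symmetric per-tensor RTN: pick a scale from the max magnitude,
    round each weight to the nearest integer level, clamp to range."""
    qmax = 2 ** (n_bits - 1) - 1          # e.g. 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / qmax
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    # Dequantized weights actually used in the forward pass:
    return [v * scale for v in q], scale
```

Everything else (activations, KV cache) stays in 16-bit in the 4-16-16 setting, so only the weight tensor goes through this rounding.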
-
Post-training quantization (PTQ) without finetuning and quantization-aware training (QAT) both work fine, but
I get an error in PTQ with fast finetuning:
activation = layer.layer.acti…
-
**What**
- We propose supporting the GPTQ algorithm, a state-of-the-art post-training quantization (PTQ) method that has demonstrated robust performance,
effectively compressing weights. Notably, G…
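The gap GPTQ closes over plain RTN comes from quantizing weights sequentially and folding each rounding error into the weights not yet quantized, guided by inverse-Hessian information from calibration data. A toy sketch of just the error-feedback idea; the Hessian weighting is omitted here, so this illustrates the spirit of the method, not the actual algorithm:

```python
def quantize_with_feedback(weights, scale):
    """Quantize weights left to right; after each rounding, push the
    rounding error onto the next (still unquantized) weight. Real GPTQ
    distributes the error across all remaining weights, scaled by
    inverse-Hessian terms computed from calibration activations."""
    w = list(weights)
    q = []
    for j in range(len(w)):
        qj = round(w[j] / scale) * scale
        err = w[j] - qj
        q.append(qj)
        if j + 1 < len(w):
            w[j + 1] += err  # compensate later weights for this error
    return q
```

Compared with independent rounding, the compensated result tracks the running sum of the original weights more closely, which is why GPTQ loses less accuracy at low bit widths.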
-
### Search before asking
- [X] I have searched the YOLOv6 [issues](https://github.com/meituan/YOLOv6/issues) and found no similar feature requests.
### Description
Post-training quantization using…
-
1. X2bolt -d onnx -m model -i PTQ  # outputs model_ptq_input.bolt
2. ./post_training_quantization -p model_ptq_input.bolt -i INT8_FP32 -b true -q NOQUANT -c 0 -o false
3. Inference fails with the following error:
[ERROR] thread 121948 fil…
-
Thank you for the amazing work. I was able to set up BEVFusion inference using the model files given in the README.
I want to use this pipeline for BEVFusion trained on my own dataset, so as per the […
-
I want to use the QAT method for my model, but I can only find a PTQ quantizer in ExecuTorch. Are there any examples of how to implement quantization-aware training (QAT) for the QNN backend?
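Not an ExecuTorch/QNN-specific answer, but the core op that QAT adds to training is fake quantization: a quantize-dequantize step in the forward pass, so the network learns to tolerate rounding, while the backward pass treats it as identity (straight-through estimator). A framework-agnostic sketch of that op; the signature and int8 defaults are assumptions for illustration:

```python
def fake_quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Quantize-dequantize: the value the rest of the network sees
    during QAT. In a real framework the backward pass passes the
    gradient straight through (round() treated as identity)."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))          # clamp to the int8 range
    return (q - zero_point) * scale      # back to float for the next layer
```

In frameworks this op is inserted after weights and activations during training; at export time the quantize-dequantize pairs are replaced by real integer kernels.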
-
When trying to run your code, we find that when running inference with the default fp16, peak memory usage is:
about 9800 MB
But when running inference with W8A8 (after PTQ), peak memory usage is:
…
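As a sanity check, weight storage alone should roughly halve going from fp16 to int8; if peak memory does not drop accordingly, common culprits are activations kept in fp16, a dequantized weight copy held alongside the int8 one, or framework workspace buffers. A back-of-the-envelope sketch (the parameter count below is a placeholder, not this model's):

```python
def weight_bytes(n_params, bits):
    """Bytes needed to store n_params weights at the given bit width."""
    return n_params * bits // 8

n = 7_000_000_000                      # placeholder, e.g. a 7B model
fp16_gb = weight_bytes(n, 16) / 2**30  # fp16 weight footprint in GiB
int8_gb = weight_bytes(n, 8) / 2**30   # int8 weight footprint: half
```

If measured peak memory stays flat after W8A8 PTQ, profiling which allocations dominate (weights vs. activations vs. workspace) usually locates the discrepancy.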
-
## Description
I want to finetune a quantized YOLO model and export it to TRT.
I carefully read the QDQ documentation and some existing issues on placing and removing unused QDQ nodes; the model has 92% int8…
-
# PTQ | Downloading videos with the ffmpeg tool on Windows
This is just a fun share! For anyone who doesn't know, ffmpeg is a great tool for downloading videos from places that… cannot be downloaded the usual…