-
## ❓ Question
When I'm not using TensorRT, I run my model through an FX interpreter that times each `call` op (by inserting CUDA events before and after it and measuring the elapsed time). I'd like to do so…
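The per-op timing pattern described above can be sketched as follows. This is a minimal, hypothetical stand-in for `torch.fx.Interpreter` using a CPU wall-clock timer; on GPU you would instead bracket each op with `torch.cuda.Event(enable_timing=True)` pairs and synchronize before reading the elapsed time, since CUDA kernel launches are asynchronous. The class and op names here are illustrative, not from the original post.

```python
import time

class TimingInterpreter:
    """Sketch of per-op timing: run each op in a traced graph in order,
    recording the elapsed time per op. On CUDA you would record a start
    event before the op, an end event after it, synchronize, and read
    start_event.elapsed_time(end_event) instead of perf_counter()."""

    def __init__(self, ops):
        # ops: list of (name, callable) pairs standing in for graph nodes
        self.ops = ops
        self.timings = {}

    def run(self, x):
        for name, op in self.ops:
            start = time.perf_counter()  # CPU analogue of a CUDA start event
            x = op(x)
            self.timings[name] = time.perf_counter() - start
        return x

# Usage: time two toy "ops"
interp = TimingInterpreter([("double", lambda v: v * 2), ("inc", lambda v: v + 1)])
result = interp.run(20)
print(result)  # 41
```

In the real FX version, the timing insertion would live in an overridden `run_node`, so every node is timed without modifying the graph itself.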
-
## Bug Description
I performed int8 quantization on resnet50 following the reference test demo ( https://github.com/pytorch/TensorRT/tree/master/tests/py/ptq ) and compared the inference result with the origin…
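A common way to make the comparison above concrete is to measure how far the quantized model's outputs drift from the fp32 reference, e.g. via max absolute error and cosine similarity. The sketch below is a pure-stdlib illustration under my own assumptions (the function name and toy logits are hypothetical); with real models you would flatten the two output tensors first.

```python
import math

def compare_outputs(ref, quant):
    """Compare a reference (fp32) output vector against a quantized model's
    output: returns (max absolute error, cosine similarity)."""
    max_abs_err = max(abs(r - q) for r, q in zip(ref, quant))
    dot = sum(r * q for r, q in zip(ref, quant))
    norm = (math.sqrt(sum(r * r for r in ref))
            * math.sqrt(sum(q * q for q in quant)))
    cosine = dot / norm if norm else 0.0
    return max_abs_err, cosine

# Toy example: int8 outputs should track the fp32 logits closely
fp32_logits = [0.1, 2.0, -1.5, 0.7]
int8_logits = [0.12, 1.95, -1.48, 0.69]
err, cos = compare_outputs(fp32_logits, int8_logits)
print(round(err, 3))  # 0.05
```

A large max-error or a cosine similarity well below 1.0 usually points at a calibration problem rather than expected int8 rounding noise.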
-
**Describe the bug**
With timm 0.6.7, using timm's resnet50, I first convert the model to ONNX and then use TensorRT's model PTQ quantization function; the quantized model is verified on the val (50…
-
I followed this [qat-ptq-workflow notebook](https://github.com/NVIDIA/TensorRT/blob/release/8.6/quickstart/quantization_tutorial/qat-ptq-workflow.ipynb) and converted q_model() to ONNX format. I want to use the Python API to convert the ONNX …
-
I am trying to use the [coral.ai USB Accelerator](https://coral.ai/products/accelerator) within a [Proxmox](https://www.proxmox.com/proxmox-virtual-environment) VM using the docker command:
`sudo docker …
-
I'm trying to quantize the yolox-l model and convert it to int8. However, after I convert to the int8 ONNX version and then to an engine, fp16 is faster than the int8 version. Can you take a look at my ONNX? This onnx…
-
### Describe the Bug
### Error information
PR that introduced the error: https://github.com/PaddlePaddle/Paddle/pull/50915
Case location: https://github.com/PaddlePaddle/PaddleTest/tree/develop/inference/python_api_test/test_nlp_…
-
I have tried [this official example of SmoothQuant alpha auto-tuning](https://github.com/intel/neural-compressor/tree/master/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq…
-
Question: PTQ_static.sh in PaddleSpeech/examples/csmsc/voc5 fails to run.
While running run.sh in PaddleSpeech/examples/csmsc/voc5, the program errors out at stage=3.
Corresponding command:
![image](https://user-images.githubusercontent.com/68834517/223012996-b…
-
## Bug Description
When attempting to use TRT to PTQ-quantize a 2B-parameter GPT-Neo model, I keep encountering the following error message:
`[executionContext.cpp::commonEmitDebugTensor::1269…