kakascode opened 5 months ago
Can you provide what TRT logged during the build, and possibly the build script?
Thanks, bro. I truncated the last part of the log because it was too long.
```
[03/29/2024-17:37:32] [TRT] [V] Setting a default quantization params because quantization data is missing for {ForeignNode[onnx::Gather_401...(Unnamed Layer 3201) [ElementWise]]}
[03/29/2024-17:37:32] [TRT] [V] Tactic: 0x0000000000000000 Time: 59.9303
[03/29/2024-17:37:32] [TRT] [V] {ForeignNode[onnx::Gather_401...(Unnamed Layer 3201) [ElementWise]]} (Myelin[0x80000023]) profiling completed in 165.507 seconds. Fastest Tactic: 0x0000000000000000 Time: 59.9303
[03/29/2024-17:37:32] [TRT] [V] >>>>>>>>>>>>>>> Chose Runner Type: Myelin Tactic: 0x0000000000000000
[03/29/2024-17:37:32] [TRT] [V] =============== Computing reformatting costs
[03/29/2024-17:37:32] [TRT] [V] =============== Computing reformatting costs
[03/29/2024-17:37:32] [TRT] [V] =============== Computing reformatting costs
[03/29/2024-17:37:32] [TRT] [V] =============== Computing reformatting costs
[03/29/2024-17:37:32] [TRT] [V] =============== Computing reformatting costs
[03/29/2024-17:37:32] [TRT] [V] =============== Computing reformatting costs:
[03/29/2024-17:37:32] [TRT] [V] Autotuning Reformat: Half(49,1) -> Float(49,1)
[03/29/2024-17:37:32] [TRT] [V] --------------- Timing Runner: Optimizer Reformat(
```
One more thing: when I set the FP16 flag in addition to INT8, it falls back to FP16.
```
[03/29/2024-17:37:32] [TRT] [V] Autotuning Reformat: Half(49,1) -> Float(49,1)
[03/29/2024-17:37:33] [TRT] [V] Adding reformat layer: Reformatted Output Tensor 0 to {ForeignNode[onnx::Gather_401...(Unnamed Layer* 3201) [ElementWise]]} (output) from Half(49,1) to Float(49,1)
```
How many layers are affected? It could be a necessary reformat layer that TensorRT adds at I/O. Refer to this for more info: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#reformat-free-network-tensors
Please share the whole log and the .onnx file (e.g. via Google Drive) for further help.
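To answer "how many layers are affected" quickly, you can count the `Adding reformat layer:` messages in the verbose build log. A small sketch (the helper name is my own; it just greps the log text shown above):

```python
def count_reformat_layers(log_text: str) -> int:
    """Count how many reformat layers TensorRT reported adding.

    Relies on the 'Adding reformat layer:' message that appears in
    verbose build logs, as in the excerpt quoted above.
    """
    return sum(
        1 for line in log_text.splitlines()
        if "Adding reformat layer" in line
    )

log = """\
[03/29/2024-17:37:32] [TRT] [V] Autotuning Reformat: Half(49,1) -> Float(49,1)
[03/29/2024-17:37:33] [TRT] [V] Adding reformat layer: Reformatted Output Tensor 0 ... from Half(49,1) to Float(49,1)
"""
print(count_reformat_layers(log))  # -> 1
```

If the count is small and the reformats sit only at network inputs/outputs, they are likely the necessary I/O reformats described in the linked docs.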
Had the same issue with a Reformat layer #2136
Sorry for my late response; I will check your method. Thanks for the help.
When I use TensorRT for INT8 quantization, I always see the precision fall back to FP32. The trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS flag does not solve the issue. What should I do?
What is your trtexec cmd?
I didn't use the trtexec command; instead, I used my own script.
```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

config = builder.create_builder_config()
config.max_workspace_size = (1 << 30) * 8  # 8 GiB (deprecated in TRT 8.x; see set_memory_pool_limit)
config.set_flag(trt.BuilderFlag.FP16)
config.set_flag(trt.BuilderFlag.INT8)
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
config.int8_calibrator = calib
```
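The `calib` object above is elided; for context, TensorRT's Python API expects an `IInt8EntropyCalibrator2` implementation here. A minimal sketch of one, assuming `batches` are pre-collected host arrays and `device_ptrs` are device buffers you have already allocated (e.g. with pycuda) — the class and argument names are my own, and the `except ImportError` stand-in only exists so the sketch can be read without TensorRT installed:

```python
import os

try:
    import tensorrt as trt           # real base class when TensorRT is installed
    _Base = trt.IInt8EntropyCalibrator2
except ImportError:                  # stand-in so the sketch runs without TensorRT
    _Base = object

class EntropyCalibratorSketch(_Base):
    """Feeds pre-collected batches to INT8 calibration and caches the scales."""

    def __init__(self, batches, device_ptrs, batch_size, cache_file="calib.cache"):
        if _Base is not object:
            super().__init__()       # TensorRT requires the base __init__ to run
        self._batches = iter(batches)
        self._device_ptrs = device_ptrs   # one device pointer per network input
        self._batch_size = batch_size
        self._cache_file = cache_file

    def get_batch_size(self):
        return self._batch_size

    def get_batch(self, names):
        try:
            batch = next(self._batches)
        except StopIteration:
            return None              # None tells TensorRT the data is exhausted
        # Copy the host batch to the device here, e.g.:
        #   cuda.memcpy_htod(self._device_ptrs[0], batch)
        return [int(p) for p in self._device_ptrs]

    def read_calibration_cache(self):
        if os.path.exists(self._cache_file):
            with open(self._cache_file, "rb") as f:
                return f.read()
        return None                  # no cache yet: run full calibration

    def write_calibration_cache(self, cache):
        with open(self._cache_file, "wb") as f:
            f.write(cache)
```

The cache file lets repeated builds skip recalibration, which also makes the per-layer INT8 scales reproducible across builds.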
If I don't use `config.set_flag(trt.BuilderFlag.FP16)`, it falls back to FP32; otherwise, it falls back to FP16.
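One thing worth checking: `OBEY_PRECISION_CONSTRAINTS` only constrains layers whose precision has been explicitly pinned on the network, so setting the flag alone changes nothing. Pinning could look roughly like the sketch below (the helper name is mine, and the `except ImportError` stand-in only exists so the snippet runs without TensorRT installed):

```python
try:
    import tensorrt as trt
except ImportError:
    from types import SimpleNamespace
    # minimal stand-in so the sketch is readable/runnable without TensorRT
    trt = SimpleNamespace(
        LayerType=SimpleNamespace(SHAPE="SHAPE", CONSTANT="CONSTANT"),
        int8="int8",
    )

def pin_int8(network):
    """Explicitly request INT8 on each layer, so that
    OBEY_PRECISION_CONSTRAINTS has concrete constraints to obey."""
    skipped = []
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        # Shape and constant layers generally cannot run in INT8
        if layer.type in (trt.LayerType.SHAPE, trt.LayerType.CONSTANT):
            skipped.append(layer.name)
            continue
        layer.precision = trt.int8
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.int8)
    return skipped
```

Calling `pin_int8(network)` after parsing, before `build_serialized_network`, should make the build fail loudly instead of silently falling back, which at least tells you which layers cannot satisfy INT8.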
Description
When I use TensorRT for INT8 quantization, I always see the precision fall back to FP32. The trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS flag does not solve the issue. What should I do?
Environment
TensorRT Version: 8.6.16
NVIDIA GPU: A100
CUDA Version: 11.4
Operating System:
Python Version (if applicable): 3.7
PyTorch Version (if applicable): 1.12.1