apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Mxnet TRT INT8 problems #20134

Open steven12356789 opened 3 years ago

steven12356789 commented 3 years ago

Hi, I used a ResNeSt model to train on my own dataset, following this link: https://github.com/zhanghang1989/ResNeSt#transfer-learning-models. I can now convert the model to ONNX without any errors. But when I try to use TensorRT to speed up inference, using C++ to build an INT8 engine from the ONNX model, I get warnings.

My terminal shows the following warnings:

```
WARN TRT: No implementation of layer (Unnamed Layer* 69) [Shuffle] + Transpose_52 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer ReduceSum_68 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer ReduceSum_103 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer (Unnamed Layer* 158) [Shuffle] obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer Softmax_116 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer (Unnamed Layer* 160) [Shuffle] + Transpose_117 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer ReduceSum_133 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer ReduceSum_166 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer (Unnamed Layer* 247) [Shuffle] obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer Softmax_179 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer (Unnamed Layer* 249) [Shuffle] + Transpose_180 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer ReduceSum_196 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
WARN TRT: No implementation of layer ReduceSum_229 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation. trt_utils.cpp:253
```
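The warnings all follow the same pattern, so it is easy to pull out exactly which layers ignored the requested INT8 precision. A minimal sketch (not part of the issue, layer names taken from the log above) that groups the fallbacks by layer name:

```python
import re

# A short excerpt of the TRT strict-mode warnings shown above.
log = """\
WARN TRT: No implementation of layer ReduceSum_68 obeys the requested constraints in strict mode.
WARN TRT: No implementation of layer Softmax_116 obeys the requested constraints in strict mode.
WARN TRT: No implementation of layer ReduceSum_133 obeys the requested constraints in strict mode.
"""

# Capture the layer name between the fixed phrases in each warning.
pattern = re.compile(r"No implementation of layer (.+?) obeys")
fallback_layers = [m.group(1) for m in pattern.finditer(log)]
print(fallback_layers)  # → ['ReduceSum_68', 'Softmax_116', 'ReduceSum_133']
```

Running this over the full log shows the fallbacks are confined to the Shuffle/Transpose, ReduceSum, and Softmax layers of the ResNeSt split-attention blocks; the rest of the network is unaffected.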

So, does this mean that I cannot use INT8?

Environment

- onnx 1.7.0
- onnxruntime 1.5.2
- tensorrt 7.2.1.4
- cuda 11.1

github-actions[bot] commented 3 years ago

Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue. Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly. If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on contributing to MXNet and our development guides wiki.

szha commented 3 years ago

@MoisesHer do you know who can best help with this?

TristonC commented 3 years ago

@Kh4L to follow up on this.