tensorflow / tensorrt

TensorFlow/TensorRT integration

Tf2.1.0, tensorrt7, batchSize > 0 && batchSize <= mEngine.getMaxBatchSize(). Note: Batch size was: 0, but engine max batch size was: 1 #191


jxhekang commented 4 years ago

I'm trying to generate a TF-TRT model using TF 2.1.0's TrtGraphConverterV2(xxxx) interface. In TF 2.1.0's TrtGraphConverterV2, is_dynamic_op can only be True, which means the TF-TRT model should handle input images of different sizes dynamically. At first I generated one TF-TRT model (model_A) successfully, and it seemed to work well and fast. However, when I changed a few parameters in my net and regenerated the TF-TRT model (model_B), the new TF-TRT model became unstable. For example, it can run inference when I feed images of shape (batch=1, H=1000, W=600, C=3), but when I feed images of other shapes (such as batch=1, H=1024, W=600, C=3, or batch=1, H=512, W=512, C=3), I get the error below.

tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:38] DefaultLogger Can't fuse pad and convolution with same pad mode
tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:38] DefaultLogger Can't fuse pad and convolution with caffe pad mode
tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger Parameter check failed at: ../builder/builder.cpp::setMaxBatchSize::135, condition: batchSize > 0 && batchSize <= MAX_BATCH_SIZE
tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:38] DefaultLogger Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:736] Building a new TensorRT engine for StatefulPartitionedCall/retina_net_module/retina_net_post_processor/TRTEngineOp_18 with input shapes: [[3,4]]
tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:38] DefaultLogger Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:736] Building a new TensorRT engine for StatefulPartitionedCall/retina_net_module/retina_net_post_processor/TRTEngineOp_17 with input shapes: [[0,4]]
tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger Parameter check failed at: ../builder/builder.cpp::setMaxBatchSize::135, condition: batchSize > 0 && batchSize <= MAX_BATCH_SIZE
tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:38] DefaultLogger Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger Parameter check failed at: engine.cpp::enqueue::292, condition: batchSize > 0 && batchSize <= mEngine.getMaxBatchSize(). Note: Batch size was: 0, but engine max batch size was: 1
tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:635] Failed to enqueue batch for TRT engine: TRTEngineOp_0
tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:506] Failed to execute engine, retrying with native segment for TRTEngineOp_0
tensorflow/core/framework/op_kernel.cc:875] Check failed: mutable_output(index) == nullptr (0x7fb7c2b26c00 vs. nullptr)
Aborted (core dumped)

I tried converting model_B inside NVIDIA's docker image (nvcr.io/nvidia/tensorflow:20.02-tf2-py3), but got almost the same error. The error says "Batch size was: 0, but engine max batch size was: 1", and I really don't know where this batch size of 0 comes from. Has anyone else run into an error like this?
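For reference, a minimal sketch of the conversion flow described above, assuming the TF 2.1.0 TrtGraphConverterV2 API; the SavedModel paths and the maximum_cached_engines value are placeholders, not the exact settings used here:

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# In TF 2.1, TrtConversionParams is a namedtuple; is_dynamic_op is
# effectively always True for the V2 converter.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    maximum_cached_engines=16)  # placeholder: cache engines for several input shapes

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="my_saved_model",  # placeholder path
    conversion_params=params)
converter.convert()
converter.save("my_trt_model")  # placeholder path
```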

jxhekang commented 4 years ago

After some tests, I found out why the TF-TRT model is unstable: the input image I used was a synthetic image (im_list = [128 * np.ones([576, 1024, 3]).astype(np.float32)]) in which every pixel is the constant 128. With this image, the TF-TRT model gets no valid information to pass to some kind of TRT op, and by mistake generates an empty tensor (batch size 0).
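To make the failing input concrete, here is the constant image from the test next to a non-constant probe; the random image is only an assumption about what avoids the empty-detection case:

```python
import numpy as np

# Synthetic test image from the failing run: every pixel is the constant 128,
# so the detector's post-processing finds nothing and a downstream TRT op
# ends up with an empty (batch size 0) tensor.
im_list = [128 * np.ones([576, 1024, 3]).astype(np.float32)]

# Assumption: a non-constant probe image (random noise here) gives the
# post-processor real content to detect and sidesteps the empty batch.
im_list = [np.random.uniform(0, 255, [576, 1024, 3]).astype(np.float32)]
```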

I think this is a bug that needs to be fixed in TF-TRT, because it's hard to guarantee that input images always contain a valid object or information. The "Batch size was: 0" error aborts the program, so it is a hidden danger in TF-TRT.

jxhekang commented 4 years ago

@pooyadavoodi I noticed your reply about the "Batch size was: 0" error in the link below: https://github.com/tensorflow/tensorflow/issues/33184#issuecomment-567605513 But I still can't get enough information from the TF-TRT log to handle my "Batch size was: 0" error. From the log I know the error happened in TRTEngineOp_0; however, in TF 2.1.0 there is no option to blacklist specific nodes. The only thing I could do was set minimum_segment_size to 40, and it works: the "Batch size was: 0" error no longer happens, but it may also make TF-TRT less efficient. I hope your team can handle this error in the next TF-TRT version.
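A sketch of that workaround, assuming the TF 2.1 TrtConversionParams API (the SavedModel paths are placeholders): raising minimum_segment_size keeps small subgraphs, such as the one that became TRTEngineOp_0, running as native TensorFlow instead of being converted to TensorRT engines.

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Subgraphs with fewer than 40 TRT-compatible nodes are left in native
# TensorFlow, avoiding the failing engine at the cost of less TRT coverage.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    minimum_segment_size=40)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="my_saved_model",  # placeholder path
    conversion_params=params)
converter.convert()
converter.save("my_trt_model_min40")  # placeholder path
```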

sanjoy commented 4 years ago

@bixia1