jeng1220 / KerasToTensorRT

This is a simple demonstration of running a Keras model on TensorFlow with TensorRT integration (TFTRT) or on TensorRT directly, without invoking "freeze_graph.py".

Problem with Infer function #3

Open doublexxking opened 6 years ago

doublexxking commented 6 years ago

Hi Jeng, thanks for your code. I am trying to use your trt_example.py code to optimize my own h5 model. I changed
frozen_graph = FrozenGraph(model, (img_h, img_w, 1)) to frozen_graph = FrozenGraph(model, (img_h, img_w, 3)) because my input is an RGB image.

But I got the following error:

2018-11-15 22:05:33.644284: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-11-15 22:05:33.644333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-15 22:05:33.644341: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2018-11-15 22:05:33.644347: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2018-11-15 22:05:33.644463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5221 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:41:00.0, compute capability: 6.1)
2018-11-15 22:05:34.015739: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.68GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-11-15 22:05:34.574931: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.18GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
Tensorflow time 2.8744256496429443
PASSED
2018-11-15 22:05:36.607992: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 1
2018-11-15 22:05:36.608614: I tensorflow/core/grappler/clusters/single_machine.cc:359] Starting new session
2018-11-15 22:05:36.608906: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-11-15 22:05:36.608932: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-15 22:05:36.608939: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2018-11-15 22:05:36.608945: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2018-11-15 22:05:36.609049: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5221 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:41:00.0, compute capability: 6.1)
2018-11-15 22:05:36.662117: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:853] MULTIPLE tensorrt candidate conversion: 2
2018-11-15 22:05:36.662873: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2957] Segment @scope '', converted to graph
2018-11-15 22:05:36.662891: E tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] Can't find a device placement for the op!
2018-11-15 22:05:36.668956: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2957] Segment @scope '', converted to graph
2018-11-15 22:05:36.668984: E tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] Can't find a device placement for the op!
2018-11-15 22:05:44.596046: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:952] Engine my_trt_op_0 creation for segment 0, composed of 302 nodes succeeded.
2018-11-15 22:05:44.668720: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:952] Engine my_trt_op_1 creation for segment 1, composed of 4 nodes succeeded.
2018-11-15 22:05:44.707948: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-15 22:05:44.716689: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-15 22:05:44.727196: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-15 22:05:44.730129: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-15 22:05:44.732993: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:501] Optimization results for grappler item: tf_graph
2018-11-15 22:05:44.733013: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   constant folding: Graph size after: 318 nodes (0), 327 edges (0), time = 7.31ms.
2018-11-15 22:05:44.733033: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   layout: Graph size after: 323 nodes (5), 332 edges (5), time = 10.301ms.
2018-11-15 22:05:44.733039: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   TensorRTOptimizer: Graph size after: 19 nodes (-304), 19 edges (-313), time = 8025.59717ms.
2018-11-15 22:05:44.733044: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   constant folding: Graph size after: 19 nodes (0), 19 edges (0), time = 7.144ms.
2018-11-15 22:05:44.733053: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   TensorRTOptimizer: Graph size after: 19 nodes (0), 19 edges (0), time = 10.891ms.
2018-11-15 22:05:44.733059: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:501] Optimization results for grappler item: my_trt_op_0_native_segment
2018-11-15 22:05:44.733064: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   constant folding: Graph size after: 303 nodes (0), 311 edges (0), time = 8.643ms.
2018-11-15 22:05:44.733069: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   layout: Graph size after: 303 nodes (0), 311 edges (0), time = 6.329ms.
2018-11-15 22:05:44.733074: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   TensorRTOptimizer: Graph size after: 303 nodes (0), 311 edges (0), time = 1.044ms.
2018-11-15 22:05:44.733079: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   constant folding: Graph size after: 303 nodes (0), 311 edges (0), time = 7.665ms.
2018-11-15 22:05:44.733084: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   TensorRTOptimizer: Graph size after: 303 nodes (0), 311 edges (0), time = 1.017ms.
2018-11-15 22:05:44.733089: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:501] Optimization results for grappler item: my_trt_op_1_native_segment
2018-11-15 22:05:44.733094: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   constant folding: Graph size after: 5 nodes (0), 4 edges (0), time = 2.906ms.
2018-11-15 22:05:44.733099: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   layout: Graph size after: 5 nodes (0), 4 edges (0), time = 1.602ms.
2018-11-15 22:05:44.733104: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   TensorRTOptimizer: Graph size after: 5 nodes (0), 4 edges (0), time = 0.471ms.
2018-11-15 22:05:44.733109: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   constant folding: Graph size after: 5 nodes (0), 4 edges (0), time = 2.455ms.
2018-11-15 22:05:44.733114: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:503]   TensorRTOptimizer: Graph size after: 5 nodes (0), 4 edges (0), time = 0.517ms.
2018-11-15 22:05:44.794659: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-11-15 22:05:44.794710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-15 22:05:44.794717: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2018-11-15 22:05:44.794723: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2018-11-15 22:05:44.794826: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5221 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:41:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "tftrt.py", line 172, in <module>
    main()
  File "tftrt.py", line 159, in main
    y_tftrt = tftrt_engine.infer(x_test)
  File "tftrt.py", line 69, in infer
    y = np.empty((num_tests, self.y_tensor.shape[1]), np.float32)
TypeError: __index__ returned non-int (type NoneType)

I am not sure what the problem is. Any help would be appreciated.

jeng1220 commented 6 years ago

I need more log output. Maybe try:

print(x.shape)
print(self.y_tensor.shape)

jeng1220 commented 6 years ago

BTW, I assume that you meant tftrt_example.py instead of trt_example.py

doublexxking commented 6 years ago

x.shape is (85, 224, 224, 3); y_tensor.shape is unknown.

Hmm... I also printed y.shape, which is (85,). So I think the frozen graph has a problem, and that it comes from frozen_graph = FrozenGraph(model, (img_h, img_w, 3))? I am not sure how to change your code, though. Could you share some ideas?

Thanks in advance

doublexxking commented 6 years ago

If I change self.y_tensor.shape[1] to 17, which is my number of classes, the program runs successfully. But I would also like to know what the problem is in your original code.
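
Hard-coding the class count works, but the fallback could be applied only when the static shape is actually missing. A minimal sketch of that idea follows. It assumes the tensor's static shape has already been converted to a plain tuple (or None when the whole shape is unknown); `resolve_output_dim` and its parameters are hypothetical names, not part of the repository.

```python
def resolve_output_dim(static_shape, fallback):
    """Return the size of axis 1 from static_shape, or fallback if unknown.

    static_shape: a tuple/list such as (None, 1000), or None when the whole
    shape is unknown (hypothetical input, converted from the tensor shape).
    fallback: the number of classes to use when axis 1 cannot be read.
    """
    if static_shape is None or len(static_shape) < 2:
        return int(fallback)
    dim = static_shape[1]
    return int(dim) if dim is not None else int(fallback)


# Known shape: the static value wins over the fallback.
known = resolve_output_dim((None, 1000), fallback=17)
# Unknown shape: the caller-supplied class count is used instead.
unknown = resolve_output_dim(None, fallback=17)
```

The result could then be used in place of self.y_tensor.shape[1] when allocating the output array.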

blackarrow3542 commented 5 years ago

First, thanks for a great example of using TensorRT with Keras and TensorFlow!

I get a similar error with tftrt_resnet_example.py; I have to set self.y_tensor.shape[1] to 1000. I'm using TensorRT 4 with tensorflow-gpu 1.11. I think the reason is the following. With TfEngine(object), y_op.outputs is: [<tf.Tensor 'import/resnet50/fc1000/Softmax:0' shape=(?, 1000) dtype=float32>]

With TftrtEngine(TfEngine), the opt_graph is missing the shape of the output tensor, and y_op.outputs is: [<tf.Tensor 'import/resnet50/fc1000/Softmax:0' shape=<unknown> dtype=float32>]
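
If the static shape is missing after optimization, one way to avoid depending on it is to take the output width from the results themselves instead of preallocating with np.empty. This is only a sketch under assumptions: `infer_in_batches` and `run_batch` are hypothetical names, with `run_batch` standing in for whatever call actually runs the session on one batch, and the input array is a small stand-in rather than real images.

```python
import numpy as np


def infer_in_batches(run_batch, x, batch_size):
    """Run run_batch over x in chunks and stack the results.

    run_batch is a hypothetical callable standing in for the session call;
    it takes an array of inputs and returns an array of predictions.
    Assumes x contains at least one sample.
    """
    outputs = []
    for start in range(0, len(x), batch_size):
        outputs.append(np.asarray(run_batch(x[start:start + batch_size])))
    # The output width comes from the actual results, not the static shape.
    return np.concatenate(outputs, axis=0)


# Stand-in model: 17 uniform class scores per input sample.
def fake_run_batch(batch):
    return np.full((len(batch), 17), 1.0 / 17, dtype=np.float32)


x_test = np.zeros((85, 4, 4, 3), dtype=np.float32)
y = infer_in_batches(fake_run_batch, x_test, batch_size=32)
```

The trade-off is one extra concatenation at the end, in exchange for not needing to know the number of classes up front.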

Fanshia commented 5 years ago

I have the same issue. The shape is missing for TftrtEngine.