jeng1220 / KerasToTensorRT

This is a simple demonstration of running a Keras model on TensorFlow with TensorRT integration (TF-TRT), or on TensorRT directly, without invoking "freeze_graph.py".

Error with INT8 #4

Open blackarrow3542 opened 5 years ago

blackarrow3542 commented 5 years ago

Hi, thanks for the great example code! I'm trying to compare inference speed between native Keras, TensorFlow, TF-TRT FP32, and TF-TRT INT8 with tftrt_resnet_example.py (ResNet-50 on a GTX 1080 Ti). For each method, if I add a second infer call, the second one is much faster, since the first call has to do some one-time initialization.

With batch_size=128 (first infer -> second infer):

- Keras: 2.9 s -> 0.375 s
- TensorFlow: 1.4 s -> 0.297 s
- TF-TRT FP32: 2.5 s -> 0.172 s
- TF-TRT INT8: 4.3 s, then the second infer fails with:

```
python3: helpers.cpp:56: nvinfer1::DimsCHW nvinfer1::getCHW(const nvinfer1::Dims&): Assertion `d.nbDims >= 3' failed.
```

If I set batch_size=1:

- Keras: 2.5 s -> 0.011 s
- TensorFlow: 0.87 s -> 0.007 s
- TF-TRT FP32: 1.98 s -> 0.004 s
- TF-TRT INT8 fails with the same assertion:

```
python3: helpers.cpp:56: nvinfer1::DimsCHW nvinfer1::getCHW(const nvinfer1::Dims&): Assertion `d.nbDims >= 3' failed.
```

INT8 does not work at all. Do you know how I should modify the code to use INT8?

blackarrow3542 commented 5 years ago

My guess is that for INT8 I first have to run calibration data through the tftrt_graph, and then call tftrt.calib_graph_to_infer_graph(tftrt_graph) to get the actual inference graph.
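That matches the TF 1.x `tf.contrib.tensorrt` API. A minimal sketch of the three-step INT8 workflow, assuming a frozen GraphDef and some representative input batches — the tensor names (`input:0`, `predictions:0`), batch size, and calibration data here are hypothetical placeholders, not taken from the example script:

```python
# Sketch of the TF-TRT INT8 calibration workflow (TF 1.x contrib API).
# Graph/tensor names and the calibration data are hypothetical
# placeholders; adapt them to tftrt_resnet_example.py.

def build_int8_graph(frozen_graph_def, calibration_batches,
                     input_name='input:0', output_name='predictions:0'):
    """Return an INT8-optimized GraphDef, calibrated on sample batches."""
    import tensorflow as tf
    import tensorflow.contrib.tensorrt as trt  # TF 1.x only

    # 1. Build a calibration graph in INT8 mode. This graph is not the
    #    final engine; running it only collects activation statistics.
    calib_graph = trt.create_inference_graph(
        input_graph_def=frozen_graph_def,
        outputs=[output_name.split(':')[0]],
        max_batch_size=128,
        max_workspace_size_bytes=1 << 30,
        precision_mode='INT8')

    # 2. Feed representative real data through the calibration graph.
    with tf.Graph().as_default():
        tf.import_graph_def(calib_graph, name='')
        with tf.Session() as sess:
            x = sess.graph.get_tensor_by_name(input_name)
            y = sess.graph.get_tensor_by_name(output_name)
            for batch in calibration_batches:
                sess.run(y, feed_dict={x: batch})

    # 3. Convert the calibrated graph into the final INT8 inference graph.
    return trt.calib_graph_to_infer_graph(calib_graph)
```

The graph returned in step 3 is the one to benchmark; treating the INT8 path exactly like FP32 (skipping steps 2–3) may be what triggers the `d.nbDims >= 3` assertion, since the engine is never finalized.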

tabsun commented 5 years ago

@blackarrow3542 Have you solved this problem? I ran into the same error. It seems I need to calibrate the graph, but I have no idea how to do it. Any advice?