What does your model configuration look like (config.pbtxt)?
Closing. Please provide the requested information and re-open if you are still hitting the issue.
This is my config.pbtxt:
name: "arcfaceint8"
platform: "tensorflow_graphdef"
max_batch_size: 16
input [
  {
    name: "Placeholder"
    data_type: TYPE_FP32
    dims: [ 112, 112, 3 ]
  }
]
output [
  {
    name: "embd_extractor/BatchNorm_1/Reshape_1"
    data_type: TYPE_FP32
    dims: [ 512 ]
  }
]
dynamic_batching {
  preferred_batch_size: [ 8, 16 ]
  max_queue_delay_microseconds: 100
}
Did you generate the TF-TRT model using the same version of TensorRT as is being used by TRTIS? The easiest way to do this is to use the TensorRT container from the same release as the TRTIS you are using (for example, 19.07).
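One quick way to check, assuming a TF 1.14+ build where trt_convert exposes these version helpers, is to print the TensorRT version TensorFlow was linked against versus the one loaded at runtime; both should match the TensorRT release used inside your TRTIS container:

# Minimal version check; an assumption that your TF build ships these helpers.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

print("linked TensorRT:", trt.get_linked_tensorrt_version())
print("loaded TensorRT:", trt.get_loaded_tensorrt_version())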
Any updates on this issue? I have the same issue on tensorflow/serving:1.14.0-gpu and nvcr.io/nvidia/tensorflow:19.10-py3.
I got the same error when using TensorFlow 1.15 + TF-TRT 5.1.5. Training and inference work fine; here are some logs from inference.
But when I deploy the TF-TRT INT8-optimized model with some warm-up data, this error happens.
Deploying the FP32 or FP16 optimized model works fine. Here is the log for the FP16 deployment.
This bug is really strange.
@zhangqijun Sorry to bother you, but do you have a solution to this problem?
@minhdeal Sorry to bother you, but do you have a solution to this problem?
I use a TF-TRT INT8-optimized model to start a server with nvcr.io/nvidia/tensorrtserver:19.07-py3. When I run inference with simple_client.py, I get this error and the server goes down. But inference with simple_client.py on the same model at FP32 or FP16 precision works correctly, and inference with the same INT8-optimized model on nvcr.io/nvidia/tensorrtserver:19.02-py3 also works. Sorry about my Chinglish.
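For reference, the inference call looks roughly like this (a sketch modeled on simple_client.py, assuming the 19.07 tensorrtserver Python client API; the model and tensor names come from the config.pbtxt above):

import numpy as np
from tensorrtserver.api import InferContext, ProtocolType

# Connect over HTTP; -1 selects the latest version of "arcfaceint8".
ctx = InferContext("localhost:8000", ProtocolType.HTTP, "arcfaceint8", -1)

# One 112x112x3 FP32 image, matching the input dims in config.pbtxt.
img = np.zeros((112, 112, 3), dtype=np.float32)
result = ctx.run(
    {"Placeholder": [img]},
    {"embd_extractor/BatchNorm_1/Reshape_1": InferContext.ResultFormat.RAW},
    batch_size=1)
print(result["embd_extractor/BatchNorm_1/Reshape_1"][0].shape)  # (512,)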
The .pb model is generated by this code:
import os
import cv2
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

config = tf.ConfigProto()

# Load the frozen FP32 graph and build the TF-TRT INT8 calibration graph.
with tf.Graph().as_default():
    with tf.Session(config=config) as sess:
        with tf.gfile.GFile("arcface.pb", "rb") as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())
        trt_graph = trt.create_inference_graph(
            input_graph_def=graph_def,
            outputs=['embd_extractor/BatchNorm_1/Reshape_1'],
            max_batch_size=8,
            max_workspace_size_bytes=2 << 20,
            precision_mode="int8")

# Feed representative images through the calibration graph to collect ranges.
with tf.Session(graph=tf.Graph(), config=config) as sess:
    tf.import_graph_def(trt_graph, return_elements=['embd_extractor/BatchNorm_1/Reshape_1'])
    images = tf.get_default_graph().get_tensor_by_name("import/Placeholder:0")
    output_tensor = tf.get_default_graph().get_tensor_by_name("import/embd_extractor/BatchNorm_1/Reshape_1:0")
    df = pd.read_csv("/media/ssd/predev/face_verifacation/dairy_test/dairy_verifacation_align.csv")
    for i in range(len(df)):
        img_path = os.path.join("/media/ssd/predev/face_verifacation/dairy_test/images_align/img", df.loc[i, "A1"])
        img = cv2.imread(img_path).astype(np.float32)[:, :, ::-1]  # BGR -> RGB
        print(sess.run(output_tensor, feed_dict={images: [img]}).shape)

# Convert the calibrated graph to the final INT8 inference graph and save it.
trt_int8_calibrated_graph = trt.calib_graph_to_infer_graph(trt_graph, is_dynamic_op=True)
tf.train.write_graph(trt_int8_calibrated_graph, './', 'int8model.graphdef', as_text=False)
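Note that TF-TRT INT8 is a two-phase process: create_inference_graph with precision_mode="int8" produces a calibration graph, the loop over real images collects the activation ranges, and calib_graph_to_infer_graph then converts the result into the final INT8 inference graph. The saved file goes into the TRTIS model repository next to the config.pbtxt above, renamed to model.graphdef (the default filename TRTIS expects for the tensorflow_graphdef platform), e.g.:

models/
  arcfaceint8/
    config.pbtxt
    1/
      model.graphdef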