Closed oscarriddle closed 5 years ago
@oscarriddle I am looking for a solution that combines the best features of TensorFlow Serving and TensorRT. Could you please explain how you compiled TensorFlow Serving with TensorRT?
@R-Miner I actually did nothing special to compile TensorRT; my modifications mainly concerned compiling the other dependencies like Python, CUDA, etc. As I mentioned above, the build automatically checks the TensorRT version but in the end does not seem to actually compile it in. It is a little weird.
https://github.com/tensorflow/serving/blob/master/tensorflow_serving/model_servers/BUILD#L283
Add the line

    "@org_tensorflow//tensorflow/contrib/tensorrt:trt_engine_op_kernel",

so that the list becomes:

SUPPORTED_TENSORFLOW_OPS = [
    "@org_tensorflow//tensorflow/contrib:contrib_kernels",
    "@org_tensorflow//tensorflow/contrib:contrib_ops_op_lib",
    "@org_tensorflow//tensorflow/contrib/tensorrt:trt_engine_op_kernel",
]
Then I can find the TensorRT symbols in the dynamically linked .so file:
0000000006ec8f60 V _ZTSN10tensorflow8tensorrt10TRTCalibOpE
0000000006ec9160 V _ZTSN10tensorflow8tensorrt11TRTEngineOpE
0000000006ec9320 V _ZTSN10tensorflow8tensorrt17TRTInt8CalibratorE
0000000006ec8fa0 V _ZTSN10tensorflow8tensorrt22TRTCalibrationResourceE
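The `V`-type symbols above are mangled C++ typeinfo names. As a sanity check that they really correspond to the TensorRT classes, here is a toy parser (a sketch, not a real demangler — use `c++filt` for real work) that handles only the simple `_ZTSN<len><name>...E` pattern seen in this nm output:

```python
def demangle_typeinfo(symbol):
    """Toy demangler for _ZTSN<len><name>...E typeinfo-name symbols only."""
    assert symbol.startswith("_ZTSN") and symbol.endswith("E")
    body, parts, i = symbol[5:-1], [], 0
    while i < len(body):
        j = i
        while j < len(body) and body[j].isdigit():
            j += 1
        length = int(body[i:j])           # decimal length prefix
        parts.append(body[j:j + length])  # the identifier itself
        i = j + length
    return "::".join(parts)

for sym in ("_ZTSN10tensorflow8tensorrt10TRTCalibOpE",
            "_ZTSN10tensorflow8tensorrt11TRTEngineOpE"):
    print(sym, "->", demangle_typeinfo(sym))
    # -> tensorflow::tensorrt::TRTCalibOp, tensorflow::tensorrt::TRTEngineOp
```

Note that seeing `tensorflow::tensorrt::TRTEngineOp` in the binary only proves the type was linked in, not that the op kernel actually gets registered at runtime, which is exactly the gap the later comments run into.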
Hope this is useful.
This is a hacky way, though; I think it would be better to change the BUILD file to use if_tensorrt:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/tensorrt/BUILD
@qiaohaijun Sorry to bother you. I successfully built Serving with TensorRT, but when I started the server, there is one log line:
2018-09-13 15:43:38.647435: E external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1199] OpKernel ('op: "TRTEngineOp" device_type: "GPU"') for unknown op: TRTEngineOp
How can I solve this? And how do I know whether TensorRT takes effect?
Thanks in advance.
@qiaohaijun I have met the same problem as @ydp. I compiled Serving 1.9 with TensorRT 4.0.1.6, CUDA 9, and cuDNN 7 successfully, and the linked libraries look like below.
root@A02-R12-I160-19:/serving# ldd serving-trt | grep nv
    libnvinfer.so.4 => /usr/lib/TensorRT-4.0.1.6/lib/libnvinfer.so.4 (0x00007f06e4bb7000)
    libnvidia-fatbinaryloader.so.384.81 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.384.81 (0x00007f06c1a01000)
And when I run the server, an error occurred:
2018-09-18 03:40:31.354473: E external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "TRTEngineOp" device_type: "GPU"') for unknown op: TRTEngineOp
2018-09-18 03:40:31.354519: E external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "TRTCalibOp" device_type: "GPU"') for unknown op: TRTCalibOp
Any ideas on how to solve this? Thanks.
I have fixed my problem by adding the commented lines in external/org_tensorflow/tensorflow/contrib/tensorrt/BUILD as shown below, but I still have no idea whether TensorRT actually works.
cc_library(
    name = "trt_engine_op_kernel",
    srcs = [
        "kernels/trt_calib_op.cc",
        "kernels/trt_engine_op.cc",
        "ops/trt_calib_op.cc",
        "ops/trt_engine_op.cc",
        "shape_fn/trt_shfn.cc",
        # the three lines above were added
    ],
    hdrs = [
        "kernels/trt_calib_op.h",
        "kernels/trt_engine_op.h",
        "shape_fn/trt_shfn.h",
        # the line above was added
    ],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ":trt_logging",
        ":trt_plugins",
        ":trt_resources",
        "//tensorflow/core:gpu_headers_lib",
        "//tensorflow/core:lib_proto_parsing",
        "//tensorflow/core:stream_executor_headers_lib",
    ] + if_tensorrt([
        "@local_config_tensorrt//:nv_infer",
    ]) + tf_custom_op_library_additional_deps(),
    # TODO(laigd)
    alwayslink = 1,  # buildozer: disable=alwayslink-with-hdrs
)
So sorry, everyone. I find that my solution failed. After setting

export TF_CPP_MIN_VLOG_LEVEL=3

I find:
2018-09-30 17:23:37.479683: I external/org_tensorflow/tensorflow/core/framework/op.cc:103]
Not found: Op type not registered 'my_trt_op_0_native_segment' in binary running on xxx.
Make sure the Op and Kernel are registered in the binary running in this process.
Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.)
`tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
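The hint at the end of that error message is the key: in TF 1.x, contrib op kernels are registered as a side effect of importing the contrib module (or, for the C++ server, of linking in the kernel library), so the registration must happen before the graph that references the op is loaded. A toy Python illustration of that lazy-registration pattern (`OP_REGISTRY`, `register_op`, and `load_graph` are made-up names for illustration, not real TF APIs):

```python
# Toy model of TF 1.x lazy op registration; none of these names are real TF APIs.
OP_REGISTRY = {}

def register_op(name, kernel):
    OP_REGISTRY[name] = kernel

def import_contrib_tensorrt():
    # Stands in for `import tensorflow.contrib.tensorrt`: importing the
    # module registers its kernels as a side effect.
    register_op("TRTEngineOp", lambda tensors: tensors)

def load_graph(op_names):
    # Loading a graph fails if any op it uses is missing from the registry.
    missing = [n for n in op_names if n not in OP_REGISTRY]
    if missing:
        raise LookupError("Op type not registered %r" % missing)
    return "graph loaded"

try:
    load_graph(["TRTEngineOp"])          # fails: nothing imported yet
except LookupError as err:
    print("before import:", err)

import_contrib_tensorrt()                # registration happens at import time
print("after import:", load_graph(["TRTEngineOp"]))
```

The C++ model server has the same constraint in linker form: the object files containing the op/kernel static registration initializers must actually be linked into the binary (hence `alwayslink = 1` in the BUILD snippet above), or the registry lookup fails at runtime with exactly the "unknown op" errors quoted in this thread.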
Is this still an issue ?
@harshini-gadige It compiles with no errors, but there is no way to know whether TensorRT works as expected. According to @qiaohaijun's comments, compiling without errors is not really success, since it fails at runtime.
Hi guys, any progress on this? Thanks.
@lilao Any input, please?
@elvys-zhang @qiaohaijun @Chris19920210 @ydp @oscarriddle @harshini-gadige @lilao Hi all, I have also met exactly the same issue as you. Has anyone found a solution? Thank you so much for any suggestion. It's driving me crazy.
BTW, I'm using TF 1.12 + TRT 4.0.x; compiling works fine, but TRTEngineOp is not registered at runtime.
@qiaohaijun @Chris19920210 @ydp @oscarriddle @lilao
Dear all, could anyone tell me how to export an int8 SavedModel for TF Serving?
@rankeey Exporting an int8 saved model is not supported yet, but it will be once TF 2.0 is out. Please let me know if there are any questions.
Also note that the upcoming TensorFlow 1.13 release will have official support for TensorRT, and I'd strongly recommend using that release (or the nightly) for any testing rather than older versions of TF Serving.
Update: support for exporting an int8 saved model was added by https://github.com/tensorflow/tensorflow/commit/fd481c1af898fa5a587d09e9505fcd273bcf18da; see here for how to run the export.
Thanks.
Closing this issue as it is in "awaiting response" status for more than 7 days. If you are facing any new issue, please create a new github request which helps us to address it correctly. If you still want to update here, please post your comments so that we will review and reopen(if required).
Have I written custom code: No
OS Platform and Distribution: CentOS 7
TensorFlow installed from: source
TensorFlow version: tensorflow-serving branch r1.7
Bazel version: 0.11.1
CUDA/cuDNN version: CUDA 9.0, cuDNN 7.0.5, TensorRT 4.0.4 (actually)
I tried to compile TensorFlow Serving r1.7 with TensorRT 4.0.4, and the compilation completed successfully.
But when I start the service and load a TF-TRT optimized model, I get an error:
It looks like TRTEngineOp is still not supported by this executable. I'm not 100% sure about my way of compiling TensorFlow Serving 1.7 with TRT, but I think the build did search for and find libnvinfer.so, etc., and also checked that the TensorRT version is correct. So I don't know why the binary still can't support TRTEngineOp.
Here are my environment variables:
This is my compilation command:
I'm not sure whether my procedure is correct. Very few docs can be found that talk about how to build TensorFlow Serving 1.7 with TensorRT. Could any member with a clue help me?
I think I've almost got there!
Thanks,
PS: The TensorRT package was downloaded from the NVIDIA official website; the tar file is named "TensorRT-3.0.4.Ubuntu-14.04.5.x86_64.cuda-9.0.cudnn7.0.tar.gz". The weird thing is that when I unpacked it, I found the actual version is 4.0.4, not 3.0.4. So in tensorflow-serving r1.7 I need to set the variable TF_TENSORRT_VERSION=4.0.4 to avoid a version-check failure.
I encountered the two errors below and solved them, so I think the Bazel build did indeed compile TensorRT. Posting here as evidence.
This is the error when I set the wrong TENSORRT_LIB_PATH (it can't find libnvinfer.so):
This is when TF_TENSORRT_VERSION does not match the libnvinfer that was found:
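For reference, the version check that fails in the second case compares TF_TENSORRT_VERSION against the NV_TENSORRT_MAJOR/MINOR/PATCH macros defined in TensorRT's NvInfer.h. A rough sketch of that comparison (the header text below is a hand-written stand-in, and the real check lives in TF's TensorRT configure scripts, not in this code):

```python
import re

# Hand-written stand-in for the version macros found in TensorRT's NvInfer.h.
header = """
#define NV_TENSORRT_MAJOR 4
#define NV_TENSORRT_MINOR 0
#define NV_TENSORRT_PATCH 4
"""

# Pull the three components out of the header and join them into a version string.
ver = dict(re.findall(r"#define NV_TENSORRT_(\w+) (\d+)", header))
found = "{MAJOR}.{MINOR}.{PATCH}".format(**ver)

tf_tensorrt_version = "4.0.4"  # what you exported before running the build
if found != tf_tensorrt_version:
    raise SystemExit("TF_TENSORRT_VERSION=%s does not match headers (%s)"
                     % (tf_tensorrt_version, found))
print("version check passed:", found)  # -> version check passed: 4.0.4
```

This is why setting TF_TENSORRT_VERSION=4.0.4, the version the unpacked headers actually report, resolves the mismatch even though the tarball name says 3.0.4.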