Paddle TensorRT configuration error

qiu-pinggaizi commented 1 year ago

Without TensorRT inference, the following configuration file works and inference runs normally:

name: "test"
backend: "paddle"
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 896, 896 ]
  }
]
output [
  {
    name: "conv2d_59.tmp_1"
    data_type: TYPE_FP32
    dims: [ 3, 896, 896 ]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
dynamic_batching {
  preferred_batch_size: [ 2, 4 ]
  max_queue_delay_microseconds: 0
}

When TensorRT inference is configured, the server fails to start. The configuration file and the error are as follows:

name: "test"
backend: "paddle"
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 896, 896 ]
  }
]
output [
  {
    name: "conv2d_59.tmp_1"
    data_type: TYPE_FP32
    dims: [ 3, 896, 896 ]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
dynamic_batching {
  preferred_batch_size: [ 2, 4 ]
  max_queue_delay_microseconds: 0
}
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [
      {
        name : "tensorrt"
        parameters { key: "precision" value: "trt_fp16" }
        parameters { key: "min_graph_size" value: "4" }
        parameters { key: "workspace_size" value: "1073741824" }
        parameters { key: "enable_tensorrt_oss" value: "0" }
        parameters { key: "is_dynamic" value: "1" }
      },
      {
        name : "min_shape"
        parameters { key: "input" value: "1 3 896 896" }
      },
      {
        name : "max_shape"
        parameters { key: "input" value: "2 3 896 896" }
      },
      {
        name : "opt_shape"
        parameters { key: "input" value: "1 3 896 896" }
      }
    ]
  }
}

Error message:

WARNING: Logging before InitGoogleLogging() is written to STDERR
I0306 08:51:06.968530 2126 analysis_config.cc:1336] In CollectShapeInfo mode, we will disable optimizations and collect the shape information of all intermediate tensors in the compute graph and calculate the min_shape, max_shape and opt_shape.
Segmentation fault (core dumped)

Testing the ERNIE model from the examples produces the same error:

WARNING: Logging before InitGoogleLogging() is written to STDERR
I0306 08:51:06.968530 2126 analysis_config.cc:1336] In CollectShapeInfo mode, we will disable optimizations and collect the shape information of all intermediate tensors in the compute graph and calculate the min_shape, max_shape and opt_shape.
Segmentation fault (core dumped)

What could be causing this? @ZeyuChen @jeng1220
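Independently of the crash, the min_shape / opt_shape / max_shape entries in the TensorRT configuration can be sanity-checked against the rest of the file. The following is a minimal Python sketch, not part of the backend: the helper names `parse_shape` and `check_shapes` are hypothetical, and it assumes the first number in each shape string is the batch dimension (the strings have four values while `dims` has three). It is not claimed to explain the segmentation fault.

```python
# Shape strings copied from the min_shape / opt_shape / max_shape entries.
MIN_SHAPE = "1 3 896 896"
OPT_SHAPE = "1 3 896 896"
MAX_SHAPE = "2 3 896 896"
# Per-request dims from the `input` block (assumed to exclude the batch axis).
MODEL_DIMS = [3, 896, 896]
# preferred_batch_size from the dynamic_batching block.
PREFERRED_BATCH_SIZES = [2, 4]


def parse_shape(text):
    """Turn a space-separated shape string into a list of ints."""
    return [int(tok) for tok in text.split()]


def check_shapes(min_s, opt_s, max_s, dims, batch_sizes):
    """Return a list of human-readable problems (empty if none are found)."""
    problems = []
    lo, opt, hi = parse_shape(min_s), parse_shape(opt_s), parse_shape(max_s)
    # Each shape is expected to be `dims` plus one leading batch axis.
    if not (len(lo) == len(opt) == len(hi) == len(dims) + 1):
        problems.append("shape rank does not match dims plus a batch dimension")
        return problems
    # min <= opt <= max must hold on every axis.
    for i, (a, b, c) in enumerate(zip(lo, opt, hi)):
        if not (a <= b <= c):
            problems.append(f"axis {i}: expected min <= opt <= max, got {a}, {b}, {c}")
    # Fixed model dims must fall inside the declared range.
    for i, d in enumerate(dims, start=1):
        if d != -1 and not (lo[i] <= d <= hi[i]):
            problems.append(f"axis {i}: model dim {d} outside [{lo[i]}, {hi[i]}]")
    # Batch sizes the dynamic batcher prefers should fit the batch range.
    for bs in batch_sizes:
        if not (lo[0] <= bs <= hi[0]):
            problems.append(f"batch size {bs} outside TensorRT range [{lo[0]}, {hi[0]}]")
    return problems


if __name__ == "__main__":
    for problem in check_shapes(MIN_SHAPE, OPT_SHAPE, MAX_SHAPE,
                                MODEL_DIMS, PREFERRED_BATCH_SIZES):
        print(problem)
```

With the values from this issue, the only item the sketch flags is that a preferred batch size of 4 is larger than the batch dimension of 2 in max_shape. Under the batch-axis assumption above, that may be worth aligning once the server starts.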

heliqi commented 1 year ago

@qiu-pinggaizi Which Paddle version are you using? Are you using the image from the documentation, or did you build it yourself?

qiu-pinggaizi commented 1 year ago

@heliqi I built it myself. The Triton image is nvcr.io/nvidia/tritonserver:22.07-py3 and the Paddle version is release/2.4.

qiu-pinggaizi commented 1 year ago

Do newer versions not support TRT acceleration? @heliqi

heliqi commented 1 year ago

It is supported. The interface has probably changed in the newer version. Could you try 2.3?
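To compare Paddle versions outside Triton, one option is to load the same model through Paddle's Python inference API with equivalent TensorRT settings and see whether the crash follows the Paddle version. This is a rough sketch under several assumptions: a TensorRT-enabled Paddle Python build is installed, the model and parameter file paths are supplied by the caller, and "min_graph_size" is taken to correspond to `min_subgraph_size`. It does not replicate the backend's exact call sequence (the log above shows the backend entering CollectShapeInfo mode), so a clean run here does not rule out a backend-side problem.

```python
import sys

# Values copied from the "tensorrt" accelerator block in the config above.
WORKSPACE_SIZE = 1073741824
MIN_SUBGRAPH_SIZE = 4  # assumed equivalent of "min_graph_size"
SHAPES = {
    "min": {"input": [1, 3, 896, 896]},
    "max": {"input": [2, 3, 896, 896]},
    "opt": {"input": [1, 3, 896, 896]},
}


def build_predictor(model_file, params_file):
    """Create a Paddle predictor with TensorRT FP16 and dynamic shapes."""
    # Imported lazily so this file still loads where Paddle is not installed.
    import paddle
    from paddle.inference import Config, PrecisionType, create_predictor

    print("Paddle version:", paddle.__version__)
    config = Config(model_file, params_file)
    config.enable_use_gpu(256, 0)  # 256 MB initial memory pool, GPU 0
    config.enable_tensorrt_engine(
        workspace_size=WORKSPACE_SIZE,
        max_batch_size=SHAPES["max"]["input"][0],
        min_subgraph_size=MIN_SUBGRAPH_SIZE,
        precision_mode=PrecisionType.Half,  # "trt_fp16" in the config
        use_static=False,
        use_calib_mode=False,
    )
    # Explicit dynamic-shape ranges, mirroring min_shape / max_shape / opt_shape.
    config.set_trt_dynamic_shape_info(SHAPES["min"], SHAPES["max"], SHAPES["opt"])
    return create_predictor(config)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        build_predictor(sys.argv[1], sys.argv[2])
        print("Predictor created without crashing.")
    else:
        print("usage: python repro.py <model_file> <params_file>")
```

Running the script once against a 2.3 build and once against a 2.4 build would show whether predictor creation itself fails on the newer version.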

qiu-pinggaizi commented 1 year ago

ok

triton-inference-server / paddlepaddle_backend

Paddle TensorRT configuration error #17