triton-inference-server / paddlepaddle_backend

BSD 3-Clause "New" or "Revised" License

Paddle TensorRT configuration error #17

Open qiu-pinggaizi opened 1 year ago

qiu-pinggaizi commented 1 year ago

Without TensorRT, the following configuration works and inference runs normally:

    name: "test"
    backend: "paddle"
    input [
      {
        name: "input"
        data_type: TYPE_FP32
        dims: [ 3, 896, 896 ]
      }
    ]
    output [
      {
        name: "conv2d_59.tmp_1"
        data_type: TYPE_FP32
        dims: [ 3, 896, 896 ]
      }
    ]
    instance_group [
      {
        count: 1
        kind: KIND_GPU
      }
    ]
    dynamic_batching {
      preferred_batch_size: [ 2, 4 ]
      max_queue_delay_microseconds: 0
    }

With TensorRT inference configured, the server fails to start. The configuration file and the error are as follows:

    name: "test"
    backend: "paddle"
    input [
      {
        name: "input"
        data_type: TYPE_FP32
        dims: [ 3, 896, 896 ]
      }
    ]
    output [
      {
        name: "conv2d_59.tmp_1"
        data_type: TYPE_FP32
        dims: [ 3, 896, 896 ]
      }
    ]
    instance_group [
      {
        count: 1
        kind: KIND_GPU
      }
    ]
    dynamic_batching {
      preferred_batch_size: [ 2, 4 ]
      max_queue_delay_microseconds: 0
    }
    optimization {
      execution_accelerators {
        gpu_execution_accelerator : [
          {
            name : "tensorrt"
            parameters { key: "precision" value: "trt_fp16" }
            parameters { key: "min_graph_size" value: "4" }
            parameters { key: "workspace_size" value: "1073741824" }
            parameters { key: "enable_tensorrt_oss" value: "0" }
            parameters { key: "is_dynamic" value: "1" }
          },
          {
            name : "min_shape"
            parameters { key: "input" value: "1 3 896 896" }
          },
          {
            name : "max_shape"
            parameters { key: "input" value: "2 3 896 896" }
          },
          {
            name : "opt_shape"
            parameters { key: "input" value: "1 3 896 896" }
          }
        ]
      }
    }

Error message:

    WARNING: Logging before InitGoogleLogging() is written to STDERR
    I0306 08:51:06.968530 2126 analysis_config.cc:1336] In CollectShapeInfo mode, we will disable optimizations and collect the shape information of all intermediate tensors in the compute graph and calculate the min_shape, max_shape and opt_shape.
    Segmentation fault (core dumped)

Testing the ERNIE model from the examples produces the same error:

    WARNING: Logging before InitGoogleLogging() is written to STDERR
    I0306 08:51:06.968530 2126 analysis_config.cc:1336] In CollectShapeInfo mode, we will disable optimizations and collect the shape information of all intermediate tensors in the compute graph and calculate the min_shape, max_shape and opt_shape.
    Segmentation fault (core dumped)

What is causing this? @ZeyuChen @jeng1220
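The min_shape, opt_shape and max_shape values in the TensorRT configuration can be sanity-checked before starting the server. The following is a minimal sketch, assuming each value is a space-separated list of integers as written in the config; the helper names `parse_shape` and `check_shape_range` are hypothetical and are not part of paddlepaddle_backend or Triton. It only checks that the range is well-formed and does not diagnose the segmentation fault.

```python
from typing import List


def parse_shape(value: str) -> List[int]:
    """Turn a shape string such as "1 3 896 896" into [1, 3, 896, 896]."""
    return [int(tok) for tok in value.split()]


def check_shape_range(min_shape: str, opt_shape: str, max_shape: str) -> List[str]:
    """Return a list of problems; an empty list means the range is consistent."""
    lo, opt, hi = (parse_shape(s) for s in (min_shape, opt_shape, max_shape))
    if not (len(lo) == len(opt) == len(hi)):
        return ["min/opt/max shapes have different ranks"]
    problems = []
    for axis, (a, b, c) in enumerate(zip(lo, opt, hi)):
        # Each axis must satisfy min <= opt <= max.
        if not (a <= b <= c):
            problems.append(
                f"axis {axis}: expected min <= opt <= max, got {a}, {b}, {c}"
            )
    return problems


# Values copied from the "input" entries of the configuration in this issue.
problems = check_shape_range("1 3 896 896", "1 3 896 896", "2 3 896 896")
print(problems or "shape range is consistent")
```

For the values in this issue the check reports no problems, so the shape range itself is well-formed.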

heliqi commented 1 year ago

@qiu-pinggaizi Which Paddle version are you using? Are you using the image from the documentation, or did you build it yourself?

qiu-pinggaizi commented 1 year ago

@heliqi I built it myself. The Triton image is nvcr.io/nvidia/tritonserver:22.07-py3 and the Paddle version is release/2.4.

qiu-pinggaizi commented 1 year ago

Do newer versions not support TRT acceleration? @heliqi

heliqi commented 1 year ago

It is supported. The interface has probably changed in the newer version. Could you try 2.3?

qiu-pinggaizi commented 1 year ago

ok