Closed: ZJU-lishuang closed this issue 1 year ago
Triton server log:
==================================
== Triton Inference Server Base ==
==================================
NVIDIA Release 22.03 (build 33743047)
Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
I1003 14:02:03.388966 1 onnxruntime.cc:2319] TRITONBACKEND_Initialize: onnxruntime
I1003 14:02:03.389124 1 onnxruntime.cc:2329] Triton TRITONBACKEND API version: 1.8
I1003 14:02:03.389140 1 onnxruntime.cc:2335] 'onnxruntime' TRITONBACKEND API version: 1.8
I1003 14:02:03.389151 1 onnxruntime.cc:2365] backend configuration:
{}
I1003 14:02:03.537700 1 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f04ee000000' with size 268435456
I1003 14:02:03.538462 1 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I1003 14:02:03.541414 1 model_repository_manager.cc:997] loading: ERNIE:1
I1003 14:02:03.642048 1 model_repository_manager.cc:997] loading: ResNet50-v1.5:1
I1003 14:02:03.781986 1 paddle.cc:1204] TRITONBACKEND_Initialize: paddle
I1003 14:02:03.782021 1 paddle.cc:1212] Triton TRITONBACKEND API version: 1.8
I1003 14:02:03.782028 1 paddle.cc:1219] 'paddle' TRITONBACKEND API version: 1.8
I1003 14:02:03.782032 1 paddle.cc:1249] backend configuration:
{}
I1003 14:02:03.782059 1 paddle.cc:1266] TRITONBACKEND_ModelInitialize: ERNIE (version 1)
I1003 14:02:03.783862 1 paddle.cc:1266] TRITONBACKEND_ModelInitialize: ResNet50-v1.5 (version 1)
I1003 14:02:03.784426 1 paddle.cc:1309] TRITONBACKEND_ModelInstanceInitialize: ERNIE_0 (GPU device 0)
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1003 14:02:03.819965 88 analysis_config.cc:1164] In CollectShapeInfo mode, we will disable optimizations and collect the shape information of all intermediate tensors in the compute graph and calculate the min_shape, max_shape and opt_shape.
I1003 14:02:03.835196 88 analysis_predictor.cc:1220] ir_optim is turned off, no IR pass will be executed
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [ir_graph_to_program_pass]
I1003 14:02:03.975621 88 analysis_predictor.cc:1274] ======= optimize end =======
I1003 14:02:03.978397 88 naive_executor.cc:110] --- skip [feed], feed -> token_type_ids
I1003 14:02:03.978420 88 naive_executor.cc:110] --- skip [feed], feed -> input_ids
I1003 14:02:03.980311 88 naive_executor.cc:110] --- skip [linear_113.tmp_1], fetch -> fetch
I1003 14:02:12.677105 88 analysis_predictor.cc:1080] TensorRT subgraph engine is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [identity_scale_op_clean_pass]
--- Running IR pass [adaptive_pool2d_convert_global_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_fill_constant_op_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
--- Running IR pass [delete_quant_dequant_filter_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [add_support_int8_pass]
I1003 14:02:12.796942 88 fuse_pass_base.cc:59] --- detected 220 subgraphs
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [trt_embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [delete_c_identity_op_pass]
--- Running IR pass [trt_multihead_matmul_fuse_pass_v2]
--- Running IR pass [trt_multihead_matmul_fuse_pass_v3]
I1003 14:02:12.937906 88 fuse_pass_base.cc:59] --- detected 6 subgraphs
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [trt_skip_layernorm_fuse_pass]
I1003 14:02:12.947113 88 fuse_pass_base.cc:59] --- detected 13 subgraphs
--- Running IR pass [preln_skip_layernorm_fuse_pass]
--- Running IR pass [preln_residual_bias_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [unsqueeze2_eltwise_fuse_pass]
--- Running IR pass [trt_squeeze2_matmul_fuse_pass]
--- Running IR pass [trt_reshape2_matmul_fuse_pass]
--- Running IR pass [trt_flatten2_matmul_fuse_pass]
--- Running IR pass [trt_map_matmul_v2_to_mul_pass]
I1003 14:02:12.951237 88 fuse_pass_base.cc:59] --- detected 20 subgraphs
--- Running IR pass [trt_map_matmul_v2_to_matmul_pass]
--- Running IR pass [trt_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
I1003 14:02:12.956024 88 fuse_pass_base.cc:59] --- detected 20 subgraphs
--- Running IR pass [conv_elementwise_add_fuse_pass]
--- Running IR pass [remove_padding_recover_padding_pass]
--- Running IR pass [delete_remove_padding_recover_padding_pass]
--- Running IR pass [dense_fc_to_sparse_pass]
--- Running IR pass [dense_multihead_matmul_to_sparse_pass]
--- Running IR pass [tensorrt_subgraph_pass]
I1003 14:02:12.962479 88 tensorrt_subgraph_pass.cc:238] --- detect a sub-graph with 51 nodes
I1003 14:02:12.976985 88 tensorrt_subgraph_pass.cc:541] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I1003 14:02:13.822504 88 engine.cc:268] Run Paddle-TRT Dynamic Shape mode.
I1003 14:03:10.097728 88 engine.cc:680] ====== engine info ======
I1003 14:03:10.104418 88 engine.cc:685] Layers:
Scale: before_reshape (Output: tmp_312)
PWN(elementwise (Output: tmp_532), elementwise (Output: tmp_634))
Scale: scale (Output: tmp_312)
skip_layernorm (Output: layer_norm_26.tmp_249)
shuffle_before_multihead_mamul(Output: reshape2_3.tmp_0104)
scale (Output: tmp_312) + unsqueeze2 (Output: unsqueeze2_0.tmp_014)
Reformatting CopyNode for Input Tensor 0 to multihead_mamul_fc(Output: reshape2_3.tmp_0104)
multihead_mamul_fc(Output: reshape2_3.tmp_0104)
Reformatting CopyNode for Input Tensor 0 to multihead_matmul (Output: reshape2_3.tmp_0104)
multihead_matmul (Output: reshape2_3.tmp_0104)
fc_op_reshape_before_fc: Shuffle (Output: linear_79.tmp_1111)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_79.tmp_1111)
fc_op_float: FullyConnected (Output: linear_79.tmp_1111)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_79.tmp_1111)
shuffle_after_fc (Output: linear_79.tmp_1111)
skip_layernorm (Output: layer_norm_27.tmp_2122)
fc_op_reshape_before_fc: Shuffle (Output: linear_80.tmp_1128)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_80.tmp_1128)
fc_op_float: FullyConnected (Output: linear_80.tmp_1128)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_80.tmp_1128)
shuffle_after_fc (Output: linear_80.tmp_1128)
PWN(PWN(PWN(PWN(PWN((Unnamed Layer* 71) [Constant], (Unnamed Layer* 72) [ElementWise]), (Unnamed Layer* 73) [Unary]), PWN((Unnamed Layer* 69) [Constant], (Unnamed Layer* 74) [ElementWise])), PWN((Unnamed Layer* 70) [Constant], (Unnamed Layer* 75) [ElementWise])), gelu (Output: gelu_1.tmp_0130))
fc_op_reshape_before_fc: Shuffle (Output: linear_81.tmp_1136)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_81.tmp_1136)
fc_op_float: FullyConnected (Output: linear_81.tmp_1136)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_81.tmp_1136)
shuffle_after_fc (Output: linear_81.tmp_1136)
skip_layernorm (Output: layer_norm_28.tmp_2147)
shuffle_before_multihead_mamul(Output: reshape2_7.tmp_0199)
Reformatting CopyNode for Input Tensor 0 to multihead_mamul_fc(Output: reshape2_7.tmp_0199)
multihead_mamul_fc(Output: reshape2_7.tmp_0199)
Reformatting CopyNode for Input Tensor 0 to multihead_matmul (Output: reshape2_7.tmp_0199)
multihead_matmul (Output: reshape2_7.tmp_0199)
fc_op_reshape_before_fc: Shuffle (Output: linear_85.tmp_1206)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_85.tmp_1206)
fc_op_float: FullyConnected (Output: linear_85.tmp_1206)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_85.tmp_1206)
shuffle_after_fc (Output: linear_85.tmp_1206)
skip_layernorm (Output: layer_norm_29.tmp_2217)
fc_op_reshape_before_fc: Shuffle (Output: linear_86.tmp_1223)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_86.tmp_1223)
fc_op_float: FullyConnected (Output: linear_86.tmp_1223)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_86.tmp_1223)
shuffle_after_fc (Output: linear_86.tmp_1223)
PWN(PWN(PWN(PWN(PWN((Unnamed Layer* 157) [Constant], (Unnamed Layer* 158) [ElementWise]), (Unnamed Layer* 159) [Unary]), PWN((Unnamed Layer* 155) [Constant], (Unnamed Layer* 160) [ElementWise])), PWN((Unnamed Layer* 156) [Constant], (Unnamed Layer* 161) [ElementWise])), gelu (Output: gelu_2.tmp_0225))
fc_op_reshape_before_fc: Shuffle (Output: linear_87.tmp_1231)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_87.tmp_1231)
fc_op_float: FullyConnected (Output: linear_87.tmp_1231)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_87.tmp_1231)
shuffle_after_fc (Output: linear_87.tmp_1231)
skip_layernorm (Output: layer_norm_30.tmp_2242)
shuffle_before_multihead_mamul(Output: reshape2_11.tmp_0294)
Reformatting CopyNode for Input Tensor 0 to multihead_mamul_fc(Output: reshape2_11.tmp_0294)
multihead_mamul_fc(Output: reshape2_11.tmp_0294)
Reformatting CopyNode for Input Tensor 0 to multihead_matmul (Output: reshape2_11.tmp_0294)
multihead_matmul (Output: reshape2_11.tmp_0294)
fc_op_reshape_before_fc: Shuffle (Output: linear_91.tmp_1301)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_91.tmp_1301)
fc_op_float: FullyConnected (Output: linear_91.tmp_1301)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_91.tmp_1301)
shuffle_after_fc (Output: linear_91.tmp_1301)
skip_layernorm (Output: layer_norm_31.tmp_2312)
fc_op_reshape_before_fc: Shuffle (Output: linear_92.tmp_1318)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_92.tmp_1318)
fc_op_float: FullyConnected (Output: linear_92.tmp_1318)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_92.tmp_1318)
shuffle_after_fc (Output: linear_92.tmp_1318)
PWN(PWN(PWN(PWN(PWN((Unnamed Layer* 243) [Constant], (Unnamed Layer* 244) [ElementWise]), (Unnamed Layer* 245) [Unary]), PWN((Unnamed Layer* 241) [Constant], (Unnamed Layer* 246) [ElementWise])), PWN((Unnamed Layer* 242) [Constant], (Unnamed Layer* 247) [ElementWise])), gelu (Output: gelu_3.tmp_0320))
fc_op_reshape_before_fc: Shuffle (Output: linear_93.tmp_1326)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_93.tmp_1326)
fc_op_float: FullyConnected (Output: linear_93.tmp_1326)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_93.tmp_1326)
shuffle_after_fc (Output: linear_93.tmp_1326)
skip_layernorm (Output: layer_norm_32.tmp_2337)
shuffle_before_multihead_mamul(Output: reshape2_15.tmp_0389)
Reformatting CopyNode for Input Tensor 0 to multihead_mamul_fc(Output: reshape2_15.tmp_0389)
multihead_mamul_fc(Output: reshape2_15.tmp_0389)
Reformatting CopyNode for Input Tensor 0 to multihead_matmul (Output: reshape2_15.tmp_0389)
multihead_matmul (Output: reshape2_15.tmp_0389)
fc_op_reshape_before_fc: Shuffle (Output: linear_97.tmp_1396)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_97.tmp_1396)
fc_op_float: FullyConnected (Output: linear_97.tmp_1396)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_97.tmp_1396)
shuffle_after_fc (Output: linear_97.tmp_1396)
skip_layernorm (Output: layer_norm_33.tmp_2407)
fc_op_reshape_before_fc: Shuffle (Output: linear_98.tmp_1413)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_98.tmp_1413)
fc_op_float: FullyConnected (Output: linear_98.tmp_1413)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_98.tmp_1413)
shuffle_after_fc (Output: linear_98.tmp_1413)
PWN(PWN(PWN(PWN(PWN((Unnamed Layer* 329) [Constant], (Unnamed Layer* 330) [ElementWise]), (Unnamed Layer* 331) [Unary]), PWN((Unnamed Layer* 327) [Constant], (Unnamed Layer* 332) [ElementWise])), PWN((Unnamed Layer* 328) [Constant], (Unnamed Layer* 333) [ElementWise])), gelu (Output: gelu_4.tmp_0415))
fc_op_reshape_before_fc: Shuffle (Output: linear_99.tmp_1421)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_99.tmp_1421)
fc_op_float: FullyConnected (Output: linear_99.tmp_1421)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_99.tmp_1421)
shuffle_after_fc (Output: linear_99.tmp_1421)
skip_layernorm (Output: layer_norm_34.tmp_2432)
shuffle_before_multihead_mamul(Output: reshape2_19.tmp_0484)
Reformatting CopyNode for Input Tensor 0 to multihead_mamul_fc(Output: reshape2_19.tmp_0484)
multihead_mamul_fc(Output: reshape2_19.tmp_0484)
Reformatting CopyNode for Input Tensor 0 to multihead_matmul (Output: reshape2_19.tmp_0484)
multihead_matmul (Output: reshape2_19.tmp_0484)
fc_op_reshape_before_fc: Shuffle (Output: linear_103.tmp_1491)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_103.tmp_1491)
fc_op_float: FullyConnected (Output: linear_103.tmp_1491)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_103.tmp_1491)
shuffle_after_fc (Output: linear_103.tmp_1491)
skip_layernorm (Output: layer_norm_35.tmp_2502)
fc_op_reshape_before_fc: Shuffle (Output: linear_104.tmp_1508)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_104.tmp_1508)
fc_op_float: FullyConnected (Output: linear_104.tmp_1508)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_104.tmp_1508)
shuffle_after_fc (Output: linear_104.tmp_1508)
PWN(PWN(PWN(PWN(PWN((Unnamed Layer* 415) [Constant], (Unnamed Layer* 416) [ElementWise]), (Unnamed Layer* 417) [Unary]), PWN((Unnamed Layer* 413) [Constant], (Unnamed Layer* 418) [ElementWise])), PWN((Unnamed Layer* 414) [Constant], (Unnamed Layer* 419) [ElementWise])), gelu (Output: gelu_5.tmp_0510))
fc_op_reshape_before_fc: Shuffle (Output: linear_105.tmp_1516)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_105.tmp_1516)
fc_op_float: FullyConnected (Output: linear_105.tmp_1516)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_105.tmp_1516)
shuffle_after_fc (Output: linear_105.tmp_1516)
skip_layernorm (Output: layer_norm_36.tmp_2527)
shuffle_before_multihead_mamul(Output: reshape2_23.tmp_0579)
Reformatting CopyNode for Input Tensor 0 to multihead_mamul_fc(Output: reshape2_23.tmp_0579)
multihead_mamul_fc(Output: reshape2_23.tmp_0579)
Reformatting CopyNode for Input Tensor 0 to multihead_matmul (Output: reshape2_23.tmp_0579)
multihead_matmul (Output: reshape2_23.tmp_0579)
fc_op_reshape_before_fc: Shuffle (Output: linear_109.tmp_1586)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_109.tmp_1586)
fc_op_float: FullyConnected (Output: linear_109.tmp_1586)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_109.tmp_1586)
shuffle_after_fc (Output: linear_109.tmp_1586)
skip_layernorm (Output: layer_norm_37.tmp_2597)
fc_op_reshape_before_fc: Shuffle (Output: linear_110.tmp_1603)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_110.tmp_1603)
fc_op_float: FullyConnected (Output: linear_110.tmp_1603)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_110.tmp_1603)
shuffle_after_fc (Output: linear_110.tmp_1603)
PWN(PWN(PWN(PWN(PWN((Unnamed Layer* 501) [Constant], (Unnamed Layer* 502) [ElementWise]), (Unnamed Layer* 503) [Unary]), PWN((Unnamed Layer* 499) [Constant], (Unnamed Layer* 504) [ElementWise])), PWN((Unnamed Layer* 500) [Constant], (Unnamed Layer* 505) [ElementWise])), gelu (Output: gelu_6.tmp_0605))
fc_op_reshape_before_fc: Shuffle (Output: linear_111.tmp_1611)
Reformatting CopyNode for Input Tensor 0 to fc_op_float: FullyConnected (Output: linear_111.tmp_1611)
fc_op_float: FullyConnected (Output: linear_111.tmp_1611)
Reformatting CopyNode for Input Tensor 0 to shuffle_after_fc (Output: linear_111.tmp_1611)
shuffle_after_fc (Output: linear_111.tmp_1611)
skip_layernorm (Output: layer_norm_38.tmp_2622)
slice (Output: layer_norm_38.tmp_2_slice_0624) + fc_op_reshape_before_fc: Shuffle (Output: linear_112.tmp_1630)
fc_op_float: FullyConnected (Output: linear_112.tmp_1630)
PWN(tanh (Output: tanh_3.tmp_0632))
fc_op_float: FullyConnected (Output: linear_113.tmp_1641)
shuffle_after_fc (Output: linear_113.tmp_1641)
Bindings:
embedding_10.tmp_0
embedding_11.tmp_0
embedding_8.tmp_0
embedding_9.tmp_0
tmp_2
linear_113.tmp_1641
I1003 14:03:10.104538 88 engine.cc:687] ====== engine info end ======
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I1003 14:03:10.112846 88 ir_params_sync_among_devices_pass.cc:88] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I1003 14:03:10.162847 88 memory_optimize_pass.cc:218] Cluster name : full_like_0.tmp_0 size: 8
I1003 14:03:10.162859 88 memory_optimize_pass.cc:218] Cluster name : tmp_4 size: 8
I1003 14:03:10.162863 88 memory_optimize_pass.cc:218] Cluster name : cumsum_0.tmp_0 size: 8
I1003 14:03:10.162864 88 memory_optimize_pass.cc:218] Cluster name : token_type_ids size: 8
--- Running analysis [ir_graph_to_program_pass]
I1003 14:03:10.183755 88 analysis_predictor.cc:1274] ======= optimize end =======
I1003 14:03:10.188370 88 naive_executor.cc:110] --- skip [feed], feed -> token_type_ids
I1003 14:03:10.188387 88 naive_executor.cc:110] --- skip [feed], feed -> input_ids
I1003 14:03:10.188688 88 naive_executor.cc:110] --- skip [linear_113.tmp_1], fetch -> fetch
W1003 14:03:10.188714 88 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.6
W1003 14:03:10.188877 88 gpu_resources.cc:91] device: 0, cuDNN Version: 8.3.
I1003 14:03:10.188996 1 paddle.cc:1309] TRITONBACKEND_ModelInstanceInitialize: ResNet50-v1.5_0 (GPU device 0)
I1003 14:03:10.196445 1 model_repository_manager.cc:1152] successfully loaded 'ERNIE' version 1
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [trt_skip_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
I1003 14:03:10.288832 89 fuse_pass_base.cc:59] --- detected 1 subgraphs
--- Running IR pass [matmul_scale_fuse_pass]
I1003 14:03:10.289359 89 fuse_pass_base.cc:59] --- detected 1 subgraphs
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I1003 14:03:10.335124 89 fuse_pass_base.cc:59] --- detected 16 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [runtime_context_cache_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I1003 14:03:10.338466 89 ir_params_sync_among_devices_pass.cc:88] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I1003 14:03:10.395090 89 memory_optimize_pass.cc:218] Cluster name : fill_constant_1.tmp_0 size: 8
I1003 14:03:10.395102 89 memory_optimize_pass.cc:218] Cluster name : x0 size: 602112
I1003 14:03:10.395103 89 memory_optimize_pass.cc:218] Cluster name : elementwise_add_4 size: 1605632
I1003 14:03:10.395107 89 memory_optimize_pass.cc:218] Cluster name : conv2d_63.tmp_1 size: 3211264
I1003 14:03:10.395110 89 memory_optimize_pass.cc:218] Cluster name : elementwise_add_2 size: 3211264
I1003 14:03:10.395112 89 memory_optimize_pass.cc:218] Cluster name : conv2d_60.tmp_1 size: 3211264
--- Running analysis [ir_graph_to_program_pass]
I1003 14:03:10.422075 89 analysis_predictor.cc:1274] ======= optimize end =======
I1003 14:03:10.422649 89 naive_executor.cc:110] --- skip [feed], feed -> x0
I1003 14:03:10.424878 89 naive_executor.cc:110] --- skip [save_infer_model/scale_0.tmp_1], fetch -> fetch
I1003 14:03:10.425115 1 model_repository_manager.cc:1152] successfully loaded 'ResNet50-v1.5' version 1
I1003 14:03:10.425218 1 server.cc:524]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I1003 14:03:10.425272 1 server.cc:551]
+-------------+-----------------------------------------------------------------+--------+
| Backend | Path | Config |
+-------------+-----------------------------------------------------------------+--------+
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {} |
| paddle | /opt/tritonserver/backends/paddle/libtriton_paddle.so | {} |
+-------------+-----------------------------------------------------------------+--------+
I1003 14:03:10.425306 1 server.cc:594]
+---------------+---------+--------+
| Model | Version | Status |
+---------------+---------+--------+
| ERNIE | 1 | READY |
| ResNet50-v1.5 | 1 | READY |
+---------------+---------+--------+
I1003 14:03:10.469245 1 metrics.cc:651] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3070 Laptop GPU
I1003 14:03:10.469525 1 tritonserver.cc:1962]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.20.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /workspace/models |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I1003 14:03:10.474583 1 grpc_server.cc:4421] Started GRPCInferenceService at 0.0.0.0:8001
I1003 14:03:10.475297 1 http_server.cc:3113] Started HTTPService at 0.0.0.0:8000
I1003 14:03:10.516651 1 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
W1003 14:03:11.474144 1 metrics.cc:427] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W1003 14:03:12.475025 1 metrics.cc:427] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W1003 14:03:13.477496 1 metrics.cc:427] Unable to get power limit for GPU 0. Status:Success, value:0.000000
I found that the problem is in the TensorRT optimization: the engine fails the check profileIndex >= 0 && profileIndex < mEngine.getNbOptimizationProfiles().
How can I solve it?
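For context, that assertion is TensorRT's own runtime invariant: the optimization-profile index selected at inference time must lie within the number of profiles the engine was built with, so it fires whenever the engine was serialized with fewer profiles than the runtime asks for (zero, in the common failure case). A minimal, purely illustrative Python sketch of the invariant (this is not Paddle or TensorRT API, just the logic of the check):

```python
def select_optimization_profile(num_profiles: int, profile_index: int) -> int:
    """Mirror TensorRT's runtime check:
    profileIndex >= 0 && profileIndex < mEngine.getNbOptimizationProfiles().

    An engine built without any dynamic-shape optimization profiles has
    num_profiles == 0, so *any* requested index fails the check.
    """
    if not (0 <= profile_index < num_profiles):
        raise IndexError(
            f"profile index {profile_index} is out of range for an engine "
            f"with {num_profiles} optimization profile(s)"
        )
    return profile_index
```

In Paddle's dynamic-shape workflow the profiles come from the collected min/max/opt shape information (the CollectShapeInfo pass visible earlier in the log), so a mismatch between the shape-collection run and the engine build is one plausible way to end up on the wrong side of this check.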
Are you using the paddlepaddle/triton_paddle:21.10 image, or another image? @ZJU-lishuang
I use the paddlepaddle/triton_paddle:21.10 image and it works correctly.
Another image: I built from source, based on Triton 22.03.
How did you obtain the Paddle Inference library that triton_paddle depends on: compiled from source, or downloaded?
CUDA and TensorRT versions that are too new may be risky for running Paddle Inference; I suggest you try our verified image first.
I think the CUDA and TensorRT versions in 22.03 should be fine.
You can compile Paddle with release/2.4: https://github.com/PaddlePaddle/Paddle/tree/release/2.4.
There may be a problem with the code branch you are using.
I will try https://github.com/triton-inference-server/server/tree/r22.03 and https://github.com/PaddlePaddle/Paddle/tree/release/2.4 again and report back. I already tried this combination several days ago.
I used 21.10 + Paddle release/2.4 when compiling the paddlepaddle/triton_paddle:21.10 image, so I suspect the TRT versions may not match.
ERROR:
Scanning dependencies of target paddle_inference_c
Scanning dependencies of target paddle_inference_c_shared
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c.dir/pd_predictor.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c.dir/pd_tensor.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c.dir/pd_config.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c.dir/pd_utils.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/pd_tensor.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/pd_predictor.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o
[100%] Linking CXX shared library libpaddle_inference.so
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/pd_utils.cc.o
[100%] Linking CXX static library libpaddle_inference_c.a
[100%] Built target paddle_inference_c
[100%] Linking CXX shared library libpaddle_inference_c.so
/usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crti.o: in function `_init':
(.init+0xb): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
CMakeFiles/paddle_inference_shared.dir/io.cc.o: in function `paddle::safe_realloc(void*, unsigned long) [clone .part.42]':
io.cc:(.text+0x11): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::bad_alloc@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
io.cc:(.text+0x18): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `typeinfo for std::bad_alloc@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
io.cc:(.text+0x29): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `std::bad_alloc::~bad_alloc()@@GLIBCXX_3.4' defined in .text section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
CMakeFiles/paddle_inference_shared.dir/io.cc.o: in function `paddle::report_at_maximum_capacity(unsigned long)':
io.cc:(.text+0x13ba): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
CMakeFiles/paddle_inference_shared.dir/io.cc.o: in function `paddle::report_size_overflow(unsigned long, unsigned long) [clone .constprop.592]':
io.cc:(.text+0x146a): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
CMakeFiles/paddle_inference_shared.dir/io.cc.o: in function `paddle::inference::ReadBinaryFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)':
io.cc:(.text+0x178e): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `void paddle::string::tinyformat::detail::FormatArg::formatImpl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::ostream&, char const*, char const*, int, void const*)' defined in .text._ZN6paddle6string10tinyformat6detail9FormatArg10formatImplINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRSoPKcSD_iPKv[_ZN6paddle6string10tinyformat6detail9FormatArg10formatImplINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRSoPKcSD_iPKv] section in CMakeFiles/paddle_inference_shared.dir/io.cc.o
io.cc:(.text+0x179f): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `int paddle::string::tinyformat::detail::FormatArg::toIntImpl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(void const*)' defined in .text._ZN6paddle6string10tinyformat6detail9FormatArg9toIntImplINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEiPKv[_ZN6paddle6string10tinyformat6detail9FormatArg9toIntImplINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEiPKv] section in CMakeFiles/paddle_inference_shared.dir/io.cc.o
io.cc:(.text+0x18aa): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `void paddle::string::tinyformat::detail::FormatArg::formatImpl<char [14]>(std::ostream&, char const*, char const*, int, void const*)' defined in .text._ZN6paddle6string10tinyformat6detail9FormatArg10formatImplIA14_cEEvRSoPKcS8_iPKv[_ZN6paddle6string10tinyformat6detail9FormatArg10formatImplIA14_cEEvRSoPKcS8_iPKv] section in CMakeFiles/paddle_inference_shared.dir/io.cc.o
io.cc:(.text+0x18b7): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `int paddle::string::tinyformat::detail::FormatArg::toIntImpl<char [14]>(void const*)' defined in .text._ZN6paddle6string10tinyformat6detail9FormatArg9toIntImplIA14_cEEiPKv[_ZN6paddle6string10tinyformat6detail9FormatArg9toIntImplIA14_cEEiPKv] section in CMakeFiles/paddle_inference_shared.dir/io.cc.o
io.cc:(.text+0x18e8): additional relocation overflows omitted from the output
/usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax
collect2: error: ld returned 1 exit status
make[2]: *** [paddle/fluid/inference/CMakeFiles/paddle_inference_shared.dir/build.make:2244: paddle/fluid/inference/libpaddle_inference.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:163108: paddle/fluid/inference/CMakeFiles/paddle_inference_shared.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
/usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crti.o: in function `_init':
(.init+0xb): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `paddle::safe_realloc(void*, unsigned long) [clone .part.67]':
pd_config.cc:(.text+0x11): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::bad_alloc@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
pd_config.cc:(.text+0x18): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `typeinfo for std::bad_alloc@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
pd_config.cc:(.text+0x29): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `std::bad_alloc::~bad_alloc()@@GLIBCXX_3.4' defined in .text section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `paddle::report_at_maximum_capacity(unsigned long)':
pd_config.cc:(.text+0x22a): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `paddle::report_size_overflow(unsigned long, unsigned long) [clone .constprop.300]':
pd_config.cc:(.text+0x2da): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `PD_ConfigDestroy':
pd_config.cc:(.text+0x808e): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `__pthread_key_create@@GLIBC_2.2.5' defined in .text section in /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/libpthread.so
pd_config.cc:(.text+0x80b7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `std::_Sp_counted_ptr<decltype(nullptr), (__gnu_cxx::_Lock_policy)2>::_M_dispose()' defined in .text._ZNSt15_Sp_counted_ptrIDnLN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv[_ZNSt15_Sp_counted_ptrIDnLN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv] section in CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o
pd_config.cc:(.text+0x80e9): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_destroy()' defined in .text._ZNSt16_Sp_counted_baseILN9__gnu_cxx12_Lock_policyE2EE10_M_destroyEv[_ZNSt16_Sp_counted_baseILN9__gnu_cxx12_Lock_policyE2EE10_M_destroyEv] section in CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `phi::enforce::EnforceNotMet::what() const':
pd_config.cc:(.text._ZNK3phi7enforce13EnforceNotMet4whatEv[_ZNK3phi7enforce13EnforceNotMet4whatEv]+0x7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `fLI::FLAGS_call_stack_level' defined in .data section in ../libpaddle_inference.a(flags.cc.o)
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `phi::enforce::EnforceNotMet::~EnforceNotMet()':
pd_config.cc:(.text._ZN3phi7enforce13EnforceNotMetD2Ev[_ZN3phi7enforce13EnforceNotMetD5Ev]+0x13): additional relocation overflows omitted from the output
libpaddle_inference_c.so: PC-relative offset overflow in PLT entry for `_ZN3phi5funcs21LaunchBroadcastKernelINS_5dtype7float16ES3_NS_3kps13DivideFunctorIS3_fEELi1ELi1ELi4EEEvRKNS_10GPUContextERKSt6vectorIPKNS_11DenseTensorESaISD_EEPSA_IPSB_SaISI_EET1_RKNS_5ArrayINS4_7details15BroadcastConfigEXT2_EEE'
collect2: error: ld returned 1 exit status
make[2]: *** [paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/build.make:1204: paddle/fluid/inference/capi_exp/libpaddle_inference_c.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:177249: paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/all] Error 2
make: *** [Makefile:130: all] Error 2
The command '/bin/sh -c python3 -m pip install pyyaml -i https://pypi.tuna.tsinghua.edu.cn/simple && mkdir build-env && cd build-env && cmake .. -DWITH_PYTHON=OFF -DWITH_GPU=ON -DWITH_TESTING=OFF -DWITH_INFERENCE_API_TEST=OFF -DCMAKE_BUILD_TYPE=Release -DCUDA_ARCH_NAME=Auto -DON_INFER=ON -DWITH_MKL=ON -DWITH_TENSORRT=ON -DWITH_ONNXRUNTIME=ON && make -j8' returned a non-zero code: 2
paddlepaddle_backend/paddle-lib/Dockerfile
FROM nvcr.io/nvidia/tritonserver:22.03-py3
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-key del 7fa2af80 \
&& wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb \
&& dpkg -i cuda-keyring_1.0-1_all.deb
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
cmake \
patchelf \
python3-dev \
unzip \
gcc-8 \
g++-8 \
libgl1 \
libssl-dev
RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 100
RUN update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-8 100
RUN git clone 'https://github.com/PaddlePaddle/Paddle.git'
WORKDIR /opt/tritonserver/Paddle
RUN git pull && git checkout release/2.4
RUN python3 -m pip install pyyaml && mkdir build-env && \
cd build-env && \
cmake .. -DWITH_PYTHON=OFF \
-DWITH_GPU=ON \
-DWITH_TESTING=OFF \
-DWITH_INFERENCE_API_TEST=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DCUDA_ARCH_NAME=Auto \
-DON_INFER=ON \
-DWITH_MKL=ON \
-DWITH_TENSORRT=ON \
-DWITH_ONNXRUNTIME=ON && \
make -j`nproc`
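For the "relocation truncated to fit" failure above, the linker's own message suggests relinking with `--no-relax`. A hedged, untested sketch of passing that hint (plus `-mcmodel=medium`, a common GCC remedy for GOT overflow in very large binaries) through the same CMake invocation might look like this; neither flag is confirmed to fix this particular Paddle build:

```shell
# Hypothetical workaround for the GOTPCREL relocation overflow above.
#   -Wl,--no-relax  : taken directly from the linker's hint
#   -mcmodel=medium : relaxes the code model so large binaries can address
#                     data beyond the 2 GB small-model limit
cmake .. -DWITH_PYTHON=OFF -DWITH_GPU=ON -DWITH_TESTING=OFF \
  -DWITH_INFERENCE_API_TEST=OFF -DCMAKE_BUILD_TYPE=Release \
  -DCUDA_ARCH_NAME=Auto -DON_INFER=ON -DWITH_MKL=ON \
  -DWITH_TENSORRT=ON -DWITH_ONNXRUNTIME=ON \
  -DCMAKE_CXX_FLAGS="-mcmodel=medium" \
  -DCMAKE_SHARED_LINKER_FLAGS="-Wl,--no-relax" && \
make -j8
```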
For PaddlePaddle compilation errors, please raise an issue in the official PaddlePaddle repository: https://github.com/PaddlePaddle/Paddle/issues
@ZJU-lishuang Since release/2.4 hasn't been officially released yet, would you try v2.4.0-rc0?
I have already tried v2.4.0-rc0 and hit the same problem.
Another Dockerfile attempt recorded this ERROR:
[ 86%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/javanano/javanano_helpers.cc.o
[ 6%] Building CUDA object paddle/phi/kernels/funcs/eigen/CMakeFiles/eigen_function.dir/broadcast.cu.o
[ 86%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/javanano/javanano_map_field.cc.o
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(9146): error: identifier "__builtin_ia32_rndscaless_round" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(9155): error: identifier "__builtin_ia32_rndscalesd_round" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(14797): error: identifier "__builtin_ia32_rndscaless_round" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(14806): error: identifier "__builtin_ia32_rndscalesd_round" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512dqintrin.h(1365): error: identifier "__builtin_ia32_fpclassss" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512dqintrin.h(1372): error: identifier "__builtin_ia32_fpclasssd" is undefined
[ 87%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/javanano/javanano_message.cc.o
[ 87%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/javanano/javanano_message_field.cc.o
[ 88%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/javanano/javanano_primitive_field.cc.o
[ 88%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/js/js_generator.cc.o
...
...
...
[ 94%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/objectivec/objectivec_oneof.cc.o
[ 94%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/objectivec/objectivec_primitive_field.cc.o
6 errors detected in the compilation of "/opt/tritonserver/Paddle/paddle/phi/kernels/funcs/eigen/broadcast.cu".
make[2]: *** [paddle/phi/kernels/funcs/eigen/CMakeFiles/eigen_function.dir/build.make:206: paddle/phi/kernels/funcs/eigen/CMakeFiles/eigen_function.dir/broadcast.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
[ 95%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/php/php_generator.cc.o
[ 95%] Building CXX object CMakeFiles/libprotoc.dir/opt/tritonserver/Paddle/build-env/third_party/protobuf/src/extern_protobuf/src/google/protobuf/compiler/plugin.cc.o
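The `__builtin_ia32_*` "identifier is undefined" errors above typically mean nvcc is parsing GCC 9's AVX-512 intrinsic headers, which it does not understand. A hedged sketch of forcing gcc-8 as both the host and CUDA host compiler (using CMake's standard `CC`/`CXX`/`CUDAHOSTCXX` environment variables) is below; it has not been verified against this exact image:

```shell
# Hedged sketch: point both the host build and nvcc at the gcc-8 toolchain
# already installed in the image, so nvcc never sees GCC 9's avx512*intrin.h.
export CC=$(which gcc-8)
export CXX=$(which g++-8)
export CUDAHOSTCXX=$(which g++-8)   # CUDA host compiler used by nvcc
cmake .. -DWITH_PYTHON=OFF -DWITH_GPU=ON -DWITH_TESTING=OFF \
  -DWITH_INFERENCE_API_TEST=OFF -DCMAKE_BUILD_TYPE=Release \
  -DCUDA_ARCH_NAME=Auto -DON_INFER=ON -DWITH_MKL=ON \
  -DWITH_TENSORRT=ON -DWITH_ONNXRUNTIME=ON && \
make -j8
```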
paddlepaddle_backend/paddle-lib/Dockerfile
FROM nvcr.io/nvidia/tritonserver:22.03-py3
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-key del 7fa2af80 \
&& wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb \
&& dpkg -i cuda-keyring_1.0-1_all.deb
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
cmake \
patchelf \
python3-dev \
unzip \
gcc-8 \
g++-8 \
libgl1 \
libssl-dev
RUN git clone 'https://github.com/PaddlePaddle/Paddle.git'
WORKDIR /opt/tritonserver/Paddle
RUN git pull && git checkout release/2.4
RUN python3 -m pip install pyyaml -i https://pypi.tuna.tsinghua.edu.cn/simple && mkdir build-env && \
cd build-env && \
cmake .. -DWITH_PYTHON=OFF \
-DWITH_GPU=ON \
-DWITH_TESTING=OFF \
-DWITH_INFERENCE_API_TEST=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DCUDA_ARCH_NAME=Auto \
-DON_INFER=ON \
-DWITH_MKL=ON \
-DWITH_TENSORRT=ON \
-DWITH_ONNXRUNTIME=ON \
-DCMAKE_C_COMPILER=`which gcc-8` -DCMAKE_CXX_COMPILER=`which g++-8` && \
make -j8
ERROR:
[100%] Built target paddle_inference
Scanning dependencies of target paddle_inference_c
Scanning dependencies of target paddle_inference_c_shared
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c.dir/pd_config.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c.dir/pd_tensor.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c.dir/pd_utils.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c.dir/pd_predictor.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/pd_predictor.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/pd_tensor.cc.o
[100%] Building CXX object paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/pd_utils.cc.o
[100%] Linking CXX static library libpaddle_inference_c.a
[100%] Built target paddle_inference_c
[100%] Linking CXX shared library libpaddle_inference_c.so
/usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crti.o: in function `_init':
(.init+0xb): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
CMakeFiles/paddle_inference_shared.dir/io.cc.o: in function `paddle::safe_realloc(void*, unsigned long) [clone .part.42]':
io.cc:(.text+0x11): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::bad_alloc@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
io.cc:(.text+0x18): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `typeinfo for std::bad_alloc@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
io.cc:(.text+0x29): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `std::bad_alloc::~bad_alloc()@@GLIBCXX_3.4' defined in .text section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
CMakeFiles/paddle_inference_shared.dir/io.cc.o: in function `paddle::report_at_maximum_capacity(unsigned long)':
io.cc:(.text+0x13ba): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
CMakeFiles/paddle_inference_shared.dir/io.cc.o: in function `paddle::report_size_overflow(unsigned long, unsigned long) [clone .constprop.592]':
io.cc:(.text+0x146a): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
CMakeFiles/paddle_inference_shared.dir/io.cc.o: in function `paddle::inference::ReadBinaryFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)':
io.cc:(.text+0x178e): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `void paddle::string::tinyformat::detail::FormatArg::formatImpl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::ostream&, char const*, char const*, int, void const*)' defined in .text._ZN6paddle6string10tinyformat6detail9FormatArg10formatImplINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRSoPKcSD_iPKv[_ZN6paddle6string10tinyformat6detail9FormatArg10formatImplINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRSoPKcSD_iPKv] section in CMakeFiles/paddle_inference_shared.dir/io.cc.o
io.cc:(.text+0x179f): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `int paddle::string::tinyformat::detail::FormatArg::toIntImpl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(void const*)' defined in .text._ZN6paddle6string10tinyformat6detail9FormatArg9toIntImplINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEiPKv[_ZN6paddle6string10tinyformat6detail9FormatArg9toIntImplINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEiPKv] section in CMakeFiles/paddle_inference_shared.dir/io.cc.o
io.cc:(.text+0x18aa): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `void paddle::string::tinyformat::detail::FormatArg::formatImpl<char [14]>(std::ostream&, char const*, char const*, int, void const*)' defined in .text._ZN6paddle6string10tinyformat6detail9FormatArg10formatImplIA14_cEEvRSoPKcS8_iPKv[_ZN6paddle6string10tinyformat6detail9FormatArg10formatImplIA14_cEEvRSoPKcS8_iPKv] section in CMakeFiles/paddle_inference_shared.dir/io.cc.o
io.cc:(.text+0x18b7): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `int paddle::string::tinyformat::detail::FormatArg::toIntImpl<char [14]>(void const*)' defined in .text._ZN6paddle6string10tinyformat6detail9FormatArg9toIntImplIA14_cEEiPKv[_ZN6paddle6string10tinyformat6detail9FormatArg9toIntImplIA14_cEEiPKv] section in CMakeFiles/paddle_inference_shared.dir/io.cc.o
io.cc:(.text+0x18e8): additional relocation overflows omitted from the output
/usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax
collect2: error: ld returned 1 exit status
make[2]: *** [paddle/fluid/inference/CMakeFiles/paddle_inference_shared.dir/build.make:2244: paddle/fluid/inference/libpaddle_inference.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:163108: paddle/fluid/inference/CMakeFiles/paddle_inference_shared.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
/usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/crti.o: in function `_init':
(.init+0xb): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `paddle::safe_realloc(void*, unsigned long) [clone .part.67]':
pd_config.cc:(.text+0x11): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::bad_alloc@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
pd_config.cc:(.text+0x18): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `typeinfo for std::bad_alloc@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
pd_config.cc:(.text+0x29): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `std::bad_alloc::~bad_alloc()@@GLIBCXX_3.4' defined in .text section in /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++.so
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `paddle::report_at_maximum_capacity(unsigned long)':
pd_config.cc:(.text+0x22a): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `paddle::report_size_overflow(unsigned long, unsigned long) [clone .constprop.300]':
pd_config.cc:(.text+0x2da): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `PD_ConfigDestroy':
pd_config.cc:(.text+0x808e): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `__pthread_key_create@@GLIBC_2.2.5' defined in .text section in /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/libpthread.so
pd_config.cc:(.text+0x80b7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `std::_Sp_counted_ptr<decltype(nullptr), (__gnu_cxx::_Lock_policy)2>::_M_dispose()' defined in .text._ZNSt15_Sp_counted_ptrIDnLN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv[_ZNSt15_Sp_counted_ptrIDnLN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv] section in CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o
pd_config.cc:(.text+0x80e9): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_destroy()' defined in .text._ZNSt16_Sp_counted_baseILN9__gnu_cxx12_Lock_policyE2EE10_M_destroyEv[_ZNSt16_Sp_counted_baseILN9__gnu_cxx12_Lock_policyE2EE10_M_destroyEv] section in CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `phi::enforce::EnforceNotMet::what() const':
pd_config.cc:(.text._ZNK3phi7enforce13EnforceNotMet4whatEv[_ZNK3phi7enforce13EnforceNotMet4whatEv]+0x7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `fLI::FLAGS_call_stack_level' defined in .data section in ../libpaddle_inference.a(flags.cc.o)
CMakeFiles/paddle_inference_c_shared.dir/pd_config.cc.o: in function `phi::enforce::EnforceNotMet::~EnforceNotMet()':
pd_config.cc:(.text._ZN3phi7enforce13EnforceNotMetD2Ev[_ZN3phi7enforce13EnforceNotMetD5Ev]+0x13): additional relocation overflows omitted from the output
libpaddle_inference_c.so: PC-relative offset overflow in PLT entry for `_ZN3phi5funcs21LaunchBroadcastKernelINS_5dtype7float16ES3_NS_3kps13DivideFunctorIS3_fEELi1ELi1ELi4EEEvRKNS_10GPUContextERKSt6vectorIPKNS_11DenseTensorESaISD_EEPSA_IPSB_SaISI_EET1_RKNS_5ArrayINS4_7details15BroadcastConfigEXT2_EEE'
collect2: error: ld returned 1 exit status
make[2]: *** [paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/build.make:1204: paddle/fluid/inference/capi_exp/libpaddle_inference_c.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:177249: paddle/fluid/inference/capi_exp/CMakeFiles/paddle_inference_c_shared.dir/all] Error 2
make: *** [Makefile:130: all] Error 2
The command '/bin/sh -c python3 -m pip install pyyaml -i https://pypi.tuna.tsinghua.edu.cn/simple && mkdir build-env && cd build-env && cmake .. -DWITH_PYTHON=OFF -DWITH_GPU=ON -DWITH_TESTING=OFF -DWITH_INFERENCE_API_TEST=OFF -DCMAKE_BUILD_TYPE=Release -DCUDA_ARCH_NAME=Auto -DON_INFER=ON -DWITH_MKL=ON -DWITH_TENSORRT=ON -DWITH_ONNXRUNTIME=ON -DCMAKE_C_COMPILER=`which gcc-8` -DCMAKE_CXX_COMPILER=`which g++-8` && make -j8' returned a non-zero code: 2
paddlepaddle_backend/paddle-lib/Dockerfile
FROM nvcr.io/nvidia/tritonserver:22.03-py3
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-key del 7fa2af80 \
&& wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb \
&& dpkg -i cuda-keyring_1.0-1_all.deb
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
cmake \
patchelf \
python3-dev \
unzip \
gcc-8 \
g++-8 \
libgl1 \
libssl-dev
RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 100
RUN update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-8 100
RUN git clone 'https://github.com/PaddlePaddle/Paddle.git'
WORKDIR /opt/tritonserver/Paddle
RUN git pull && git checkout release/2.4
RUN python3 -m pip install pyyaml -i https://pypi.tuna.tsinghua.edu.cn/simple && mkdir build-env && \
cd build-env && \
cmake .. -DWITH_PYTHON=OFF \
-DWITH_GPU=ON \
-DWITH_TESTING=OFF \
-DWITH_INFERENCE_API_TEST=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DCUDA_ARCH_NAME=Auto \
-DON_INFER=ON \
-DWITH_MKL=ON \
-DWITH_TENSORRT=ON \
-DWITH_ONNXRUNTIME=ON \
-DCMAKE_C_COMPILER=`which gcc-8` -DCMAKE_CXX_COMPILER=`which g++-8` && \
make -j8
Please file issues about PaddlePaddle compilation at https://github.com/PaddlePaddle/Paddle/issues
When I run bash perf_ernie.sh, the server outputs the following.
But perf_resnet50_v1.5.sh runs SUCCESSFULLY.
Is it possible to fix this issue?