PP-YOLOE+ deployed on Paddle Serving fails with a segmentation fault

Search before asking

[X] I have searched the question and found no related answer.

Please ask your question

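I trained PP-YOLOE+ on my own data; the images are roughly 1150x1150 to 1200x1200 pixels. Training and inference both work normally. I deployed PP-YOLOE+ using the pipeline approach (the deployment itself completed without errors) and send prediction requests over HTTP. At prediction time the service hangs and then reports a segmentation fault.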
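
For reference, an HTTP request of the kind described above might look like the following minimal sketch. It assumes the usual Paddle Serving pipeline route `/<service name>/prediction` on the `http_port` from config.yml, and the `{"key": [...], "value": [...]}` JSON body used by the pipeline examples. `build_payload`, `predict`, and the `"image"` key are illustrative names, not taken from the client actually used in this report.

```python
import base64
import json
import urllib.request

# Assumed endpoint: http_port 9292 and service name "ppyoloe" from this issue.
URL = "http://127.0.0.1:9292/ppyoloe/prediction"


def build_payload(image_bytes, key="image"):
    """Wrap raw image bytes in the key/value JSON body of a pipeline request."""
    encoded = base64.b64encode(image_bytes).decode("utf8")
    return {"key": [key], "value": [encoded]}


def predict(image_path, url=URL, timeout=30):
    """POST one image file to the service and return the decoded JSON reply."""
    with open(image_path, "rb") as f:
        body = json.dumps(build_payload(f.read())).encode("utf8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf8"))
```

Calling `predict("test.jpg")` (with a hypothetical file name) sends one base64-encoded image, which matches the `base64.b64decode` step in the service's `preprocess`.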

Test environment

paddle-serving-app 0.9.0
paddle-serving-client 0.9.0
paddle-serving-server-gpu 0.9.0.post112

paddlepaddle-gpu 2.5.2.post117

serving_server_conf.prototxt generated by the model export

feed_var {
  name: "image"
  alias_name: "image"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 1024
  shape: 1024
}
feed_var {
  name: "scale_factor"
  alias_name: "scale_factor"
  is_lod_tensor: false
  feed_type: 1
  shape: 2
}
fetch_var {
  name: "multiclass_nms3_0.tmp_0"
  alias_name: "multiclass_nms3_0.tmp_0"
  is_lod_tensor: false
  fetch_type: 1
  shape: 6
}
fetch_var {
  name: "multiclass_nms3_0.tmp_2"
  alias_name: "multiclass_nms3_0.tmp_2"
  is_lod_tensor: false
  fetch_type: 2
}
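
One way to rule out a feed mismatch is to compare the shapes fed by the service against the `feed_var` entries above. The following is a minimal sketch that assumes the `shape` fields are per-sample dimensions without the batch axis. `feed_shapes` and `check_feed` are hypothetical helpers, not part of Paddle Serving, and the embedded text is trimmed to the fields the parser reads.

```python
import re

# Trimmed copy of the feed_var entries from serving_server_conf.prototxt.
PROTOTXT = """
feed_var {
  name: "image"
  alias_name: "image"
  feed_type: 1
  shape: 3
  shape: 1024
  shape: 1024
}
feed_var {
  name: "scale_factor"
  alias_name: "scale_factor"
  feed_type: 1
  shape: 2
}
"""


def feed_shapes(prototxt_text):
    """Return {alias_name: [dims...]} for every feed_var block."""
    shapes = {}
    for block in re.findall(r"feed_var\s*\{(.*?)\}", prototxt_text, re.S):
        alias = re.search(r'alias_name:\s*"([^"]+)"', block).group(1)
        dims = re.findall(r"\bshape:\s*(-?\d+)", block)
        shapes[alias] = [int(d) for d in dims]
    return shapes


def check_feed(fed_shapes, declared):
    """List mismatches between batched feed shapes and declared shapes."""
    problems = []
    for name, dims in declared.items():
        got = fed_shapes.get(name)
        if got is None:
            problems.append(f"{name}: missing from feed_dict")
        elif list(got[1:]) != dims:  # got[0] is the batch dimension
            problems.append(
                f"{name}: got {tuple(got[1:])}, declared {tuple(dims)}")
    return problems
```

Inside `preprocess`, this could be called as `check_feed({k: v.shape for k, v in feed_dict.items()}, feed_shapes(text))`, where `text` is the content of the prototxt file; an empty list means the shapes agree.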

web_service.py


from paddle_serving_server.web_service import WebService, Op
import logging
import numpy as np
import sys
import cv2
from paddle_serving_app.reader import *
import base64

class Ppyoloe(Op):
    def init_op(self):
        # Preprocessing: BGR -> RGB, scale to [0, 1], resize to the exported
        # 1024x1024 input size, then HWC -> CHW.
        self.img_preprocess = Sequential([
            BGR2RGB(), Div(255.0),
            Normalize([0., 0., 0.], [1., 1., 1.], False),
            Resize((1024, 1024)), Transpose((2, 0, 1))
        ])
        self.img_postprocess = RCNNPostprocess("label_list.txt", "output")

    def preprocess(self, input_dicts, data_id, log_id):
        (_, input_dict), = input_dicts.items()
        imgs = []
        #print("keys", input_dict.keys())
        for key in input_dict.keys():
            # Each value is a base64-encoded image file.
            data = base64.b64decode(input_dict[key].encode('utf8'))
            # np.fromstring is deprecated for binary data; use np.frombuffer.
            data = np.frombuffer(data, np.uint8)
            im = cv2.imdecode(data, cv2.IMREAD_COLOR)
            print(im.shape)
            im = self.img_preprocess(im)
            # print(im.shape)
            imgs.append({
                "image": im[np.newaxis, :],
                # "im_shape": np.array(list(im.shape[1:])).reshape(-1)[np.newaxis, :],
                "scale_factor": np.array([1., 1.]).reshape(-1)[np.newaxis, :],
            })
        # Stack the per-image entries along the batch axis.
        feed_dict = {
            "image": np.concatenate(
                [x["image"] for x in imgs], axis=0),
            # "im_shape": np.concatenate(
            #     [x["im_shape"] for x in imgs], axis=0),
            "scale_factor": np.concatenate(
                [x["scale_factor"] for x in imgs], axis=0)
        }
        for key in feed_dict.keys():
            print(key, feed_dict[key].shape)
        return feed_dict, False, None, ""

    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        print("fetch_dict = ", fetch_dict)
        res_dict = {
            "bbox_result":
            str(self.img_postprocess(
                fetch_dict, visualize=False))
        }
        print(res_dict)
        return res_dict, None, ""

class PpyoloeService(WebService):
    def get_pipeline_response(self, read_op):
        ppyoloe_op = Ppyoloe(name="ppyoloe", input_ops=[read_op])
        return ppyoloe_op

ppyoloe_service = PpyoloeService(name="ppyoloe")
ppyoloe_service.prepare_pipeline_config("config.yml")
ppyoloe_service.run_service()
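
The `preprocess` method in web_service.py always feeds `scale_factor` as `[1., 1.]`, even though each image is resized to 1024x1024. If the exported model expects the per-axis resize ratio in (y, x) order, as PaddleDetection's own preprocessing appears to use, the value could be computed as in the sketch below. This is an assumption about the model's convention, and it is not established that it relates to the crash. `resize_scale_factor` is a hypothetical helper.

```python
def resize_scale_factor(orig_h, orig_w, target_h=1024, target_w=1024):
    """Return (scale_y, scale_x) for a resize that ignores aspect ratio."""
    if orig_h <= 0 or orig_w <= 0:
        raise ValueError("image dimensions must be positive")
    return target_h / orig_h, target_w / orig_w
```

In `preprocess`, the original size would be read with `h, w = im.shape[:2]` before `self.img_preprocess(im)` is applied, and the feed entry built as `np.array([resize_scale_factor(h, w)], dtype=np.float32)`. The explicit `float32` cast assumes that `feed_type: 1` in the prototxt denotes float32; `np.array([1., 1.])` without a dtype produces float64.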

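### config.yml
```yaml
dag:
  # Op resource type: True = thread model; False = process model
  is_thread_op: false
  # Profiling: True generates Timeline performance data, with some performance impact; False disables it
  tracer:
    interval_s: 30
# HTTP port. rpc_port and http_port must not both be empty. If rpc_port is set and http_port is empty, no http_port is generated automatically
http_port: 9292
op:
  ppyoloe:
    # Concurrency: thread concurrency when is_thread_op=True; otherwise process concurrency
    concurrency: 10
    local_service_conf:
      # Client type: brpc, grpc, or local_predictor. local_predictor does not start a Serving service and predicts in-process
      client_type: local_predictor
      # device_type, 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
      device_type: 1
      # Compute device IDs. If devices is "" or omitted, prediction runs on CPU; if devices is "0" or "0,1,2", prediction runs on the listed GPU cards
      devices: '0'
      # Fetch list, based on the alias_name of fetch_var in bert_seq128_model; if unset, all results are returned
      fetch_list: ['multiclass_nms3_0.tmp_0']
      # Model path
      model_config: serving_server/
# RPC port. rpc_port and http_port must not both be empty. If rpc_port is empty and http_port is not, rpc_port is automatically set to http_port+1
rpc_port: 9998
# worker_num: maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own gRPC server and DAG
# When build_dag_each_worker=False, the framework sets max_workers=worker_num for the main thread's gRPC thread pool
worker_num: 20
```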

Error message

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::AnalysisPredictor::ZeroCopyRun()
1   paddle::framework::NaiveExecutor::Run()
2   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, phi::Place const&)
3   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, phi::Place const&) const
4   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, phi::Place const&, paddle::framework::RuntimeContext*) const
5   std::_Function_handler<void (paddle::framework::InferShapeContext*), paddle::framework::details::OpInfoFiller<SqueezeInferShapeFunctor, (paddle::framework::details::OpInfoFillType)4>::operator()(char const*, paddle::framework::OpInfo*) const::{lambda(paddle::framework::InferShapeContext*)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::InferShapeContext*&&)
6   SqueezeInferShapeFunctor::operator()(paddle::framework::InferShapeContext*) const
7   phi::SqueezeInferMeta(phi::MetaTensor const&, std::vector<int, std::allocator<int> > const&, phi::MetaTensor*, phi::MetaTensor*)

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1706101784 (unix time) try "date -d @1706101784" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x0) received by PID 17572 (TID 0x7f4bbee78700) from PID 0 ***]

ppyoloe_plus_reader.yml

worker_num: 8
eval_height: &eval_height 1024
eval_width: &eval_width 1024
eval_size: &eval_size [*eval_height, *eval_width]

TrainReader:
  sample_transforms:
    - Decode: {}
    # - RandomDistort: {}
    # - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
    # - RandomCrop: {}
    - Mixup: {alpha: 1.5, beta: 1.5}
    - RandomFlip: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
  batch_transforms:
    # - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
    - Permute: {}
    - PadGT: {}
  batch_size: 1
  mixup_epoch: 15
  shuffle: true
  drop_last: true
  use_shared_memory: true
  collate_batch: true

EvalReader:
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
    - Permute: {}
  batch_size: 1

TestReader:
  inputs_def:
    image_shape: [3, *eval_height, *eval_width]
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
    - Permute: {}
  batch_size: 1
