### Search before asking

### Please ask your question

I trained ppyoloe+ on my own data; the images are roughly 1150x1150 to 1200x1200. Training and offline inference both work fine. I deployed ppyoloe+ with the pipeline approach (deployment completed without errors) and send prediction requests over HTTP. The prediction hangs immediately and then reports a segmentation fault.

### Test environment
### serving_server_conf.prototxt generated by model export

```
feed_var {
  name: "image"
  alias_name: "image"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 1024
  shape: 1024
}
feed_var {
  name: "scale_factor"
  alias_name: "scale_factor"
  is_lod_tensor: false
  feed_type: 1
  shape: 2
}
fetch_var {
  name: "multiclass_nms3_0.tmp_0"
  alias_name: "multiclass_nms3_0.tmp_0"
  is_lod_tensor: false
  fetch_type: 1
  shape: 6
}
fetch_var {
  name: "multiclass_nms3_0.tmp_2"
  alias_name: "multiclass_nms3_0.tmp_2"
  is_lod_tensor: false
  fetch_type: 2
}
```
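For reference, a feed that matches this configuration would look roughly like the sketch below. This is a minimal illustration, not part of the original report; the leading batch dimension and the float32 dtype are assumptions based on the `shape` and `feed_type` fields above.

```python
import numpy as np

# Hypothetical feed tensors matching serving_server_conf.prototxt
# (assumption: feed_type 1 corresponds to float32, and each feed_var
#  gets a leading batch dimension when sent to the predictor).
batch_size = 1
feed_dict = {
    "image": np.zeros((batch_size, 3, 1024, 1024), dtype=np.float32),
    "scale_factor": np.ones((batch_size, 2), dtype=np.float32),
}
for name, arr in feed_dict.items():
    print(name, arr.shape, arr.dtype)
```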
### web_service.py

```python
import base64
import logging
import sys

import cv2
import numpy as np
from paddle_serving_app.reader import *
from paddle_serving_server.web_service import WebService, Op


class Ppyoloe(Op):
    def init_op(self):
        # Preprocessing pipeline: BGR->RGB, scale to [0, 1], resize to the
        # fixed 1024x1024 input size, and transpose HWC -> CHW.
        self.img_preprocess = Sequential([
            BGR2RGB(),
            Div(255.0),
            Normalize([0., 0., 0.], [1., 1., 1.], False),
            Resize((1024, 1024)),
            Transpose((2, 0, 1)),
        ])
        self.img_postprocess = RCNNPostprocess("label_list.txt", "output")

    def preprocess(self, input_dicts, data_id, log_id):
        (_, input_dict), = input_dicts.items()
        imgs = []
        for key in input_dict.keys():
            # Decode the base64-encoded image bytes into a BGR ndarray.
            data = base64.b64decode(input_dict[key].encode('utf8'))
            data = np.frombuffer(data, np.uint8)  # np.fromstring is deprecated
            im = cv2.imdecode(data, cv2.IMREAD_COLOR)
            print(im.shape)
            im = self.img_preprocess(im)
            imgs.append({
                "image": im[np.newaxis, :],
                # "im_shape": np.array(list(im.shape[1:])).reshape(-1)[np.newaxis, :],
                "scale_factor": np.array([1., 1.]).reshape(-1)[np.newaxis, :],
            })
        feed_dict = {
            "image": np.concatenate([x["image"] for x in imgs], axis=0),
            # "im_shape": np.concatenate([x["im_shape"] for x in imgs], axis=0),
            "scale_factor": np.concatenate(
                [x["scale_factor"] for x in imgs], axis=0),
        }
        for key in feed_dict.keys():
            print(key, feed_dict[key].shape)
        return feed_dict, False, None, ""

    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        print("fetch_dict = ", fetch_dict)
        res_dict = {
            "bbox_result": str(self.img_postprocess(fetch_dict, visualize=False))
        }
        print(res_dict)
        return res_dict, None, ""


class PpyoloeService(WebService):
    def get_pipeline_response(self, read_op):
        ppyoloe_op = Ppyoloe(name="ppyoloe", input_ops=[read_op])
        return ppyoloe_op


ppyoloe_service = PpyoloeService(name="ppyoloe")
ppyoloe_service.prepare_pipeline_config("config.yml")
ppyoloe_service.run_service()
```
### config.yml

```yaml
dag:
  #Op resource type: True for the thread model, False for the process model
  is_thread_op: false
  #Profiling: True generates Timeline performance data (with some overhead); False disables it
  tracer:
    interval_s: 30
#HTTP port. rpc_port and http_port must not both be empty; when rpc_port is valid and http_port is empty, no http_port is generated automatically
http_port: 9292
op:
  ppyoloe:
    #Concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
    concurrency: 10
    local_service_conf:
      #Client type: brpc, grpc or local_predictor; local_predictor does not start a Serving service and predicts in-process
      client_type: local_predictor
      #device_type: 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
      device_type: 1
      #Compute device IDs: "" or unset means CPU prediction; "0" or "0,1,2" selects the GPU cards to use
      devices: '0'
      #Fetch list, using the alias_name of fetch_var in the model config; if unset, all outputs are returned
      fetch_list: ['multiclass_nms3_0.tmp_0']
      #Model path
      model_config: serving_server/
#RPC port. rpc_port and http_port must not both be empty; when rpc_port is empty and http_port is not, rpc_port is set to http_port+1 automatically
rpc_port: 9998
#worker_num, maximum concurrency. When build_dag_each_worker=True the framework creates worker_num processes, each building a gRPC server and DAG;
#when build_dag_each_worker=False the framework sets max_workers of the main thread's gRPC thread pool to worker_num
worker_num: 20
```
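For reference, the HTTP prediction request is sent in the usual pipeline-client style shown below. This sketch is not part of the original report: the endpoint path `/ppyoloe/prediction`, the port 9292 (from `http_port` above), and the file name `test.jpg` are assumptions following the standard Paddle Serving pipeline examples.

```python
import base64
import json

import requests

# Assumed endpoint: http_port 9292 from config.yml plus the op name "ppyoloe".
url = "http://127.0.0.1:9292/ppyoloe/prediction"

# "test.jpg" is a placeholder test image.
with open("test.jpg", "rb") as f:
    image = base64.b64encode(f.read()).decode("utf8")

# Pipeline web services expect a {"key": [...], "value": [...]} JSON payload.
data = {"key": ["image"], "value": [image]}
resp = requests.post(url=url, data=json.dumps(data))
print(resp.json())
```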
### Error message

```
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::AnalysisPredictor::ZeroCopyRun()
1   paddle::framework::NaiveExecutor::Run()
2   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, phi::Place const&)
3   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, phi::Place const&) const
4   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, phi::Place const&, paddle::framework::RuntimeContext*) const
5   std::_Function_handler<void (paddle::framework::InferShapeContext*), paddle::framework::details::OpInfoFiller<SqueezeInferShapeFunctor, (paddle::framework::details::OpInfoFillType)4>::operator()(char const*, paddle::framework::OpInfo*) const::{lambda(paddle::framework::InferShapeContext*)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::InferShapeContext*&&)
6   SqueezeInferShapeFunctor::operator()(paddle::framework::InferShapeContext*) const
7   phi::SqueezeInferMeta(phi::MetaTensor const&, std::vector<int, std::allocator<int> > const&, phi::MetaTensor*, phi::MetaTensor*)

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1706101784 (unix time) try "date -d @1706101784" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x0) received by PID 17572 (TID 0x7f4bbee78700) from PID 0 ***]
```
### ppyoloe_plus_reader.yml

```yaml
worker_num: 8
eval_height: &eval_height 1024
eval_width: &eval_width 1024
eval_size: &eval_size [*eval_height, *eval_width]

TrainReader:
  sample_transforms:
    - Decode: {}
    # - RandomDistort: {}
    # - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
    # - RandomCrop: {}
    - Mixup: {alpha: 1.5, beta: 1.5}
    - RandomFlip: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
  batch_transforms:
    # - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
    - Permute: {}
    - PadGT: {}
  batch_size: 1
  mixup_epoch: 15
  shuffle: true
  drop_last: true
  use_shared_memory: true
  collate_batch: true

EvalReader:
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
    - Permute: {}
  batch_size: 1

TestReader:
  inputs_def:
    image_shape: [3, *eval_height, *eval_width]
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
    - Permute: {}
  batch_size: 1
```
Hi, please take this question to https://github.com/PaddlePaddle/FastDeploy. Thanks.