PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

Failed to deploy to serving #774

Closed · ash12358 closed this issue 4 years ago

ash12358 commented 4 years ago

I trained on the Objects365 dataset with the config cascade_rcnn_cls_aware_r200_vd_fpn_dcnv2_nonlocal_softnms.yml, then exported the model with python tools/export_serving_model.py -c configs/obj365/cascade_rcnn_cls_aware_r200_vd_fpn_dcnv2_nonlocal_softnms.yml --output_dir=serving -o weights=output/cascade_rcnn_cls_aware_r200_vd_fpn_dcnv2_nonlocal_softnms/1300000. When I start the server with python -m paddle_serving_server.serve --model serving/cascade_rcnn_cls_aware_r200_vd_fpn_dcnv2_nonlocal_softnms/serving_server --thread 10 --port 9292, it fails with the error below. However, exporting and serving with the officially provided weights and the corresponding config works fine. Where could the problem be?

/opt/Anaconda3/envs/paddle/lib/python3.7/site-packages/paddle_serving_server/serving-cpu-avx-openblas-0.2.1/serving -enable_model_toolkit -inferservice_path workdir -inferservice_file infer_service.prototxt -max_concurrency 0 -num_threads 10 -port 9292 -reload_interval_s 10 -resource_path workdir -resource_file resource.prototxt -workflow_path workdir -workflow_file workflow.prototxt -bthread_concurrency 10 -max_body_size 536870912
I0100 00:00:00.000000 33607 op_repository.h:65] RAW: Succ regist op: GeneralTextResponseOp
I0100 00:00:00.000000 33607 op_repository.h:65] RAW: Succ regist op: GeneralTextReaderOp
I0100 00:00:00.000000 33607 op_repository.h:65] RAW: Succ regist op: GeneralInferOp
I0100 00:00:00.000000 33607 op_repository.h:65] RAW: Succ regist op: GeneralDistKVQuantInferOp
I0100 00:00:00.000000 33607 op_repository.h:65] RAW: Succ regist op: GeneralDistKVInferOp
I0100 00:00:00.000000 33607 op_repository.h:65] RAW: Succ regist op: GeneralReaderOp
I0100 00:00:00.000000 33607 op_repository.h:65] RAW: Succ regist op: GeneralCopyOp
I0100 00:00:00.000000 33607 op_repository.h:65] RAW: Succ regist op: GeneralResponseOp
I0100 00:00:00.000000 33607 service_manager.h:61] RAW: Service[LoadGeneralModelService] insert successfully!
I0100 00:00:00.000000 33607 load_general_model_service.pb.h:299] RAW: Success regist service[LoadGeneralModelService][PN5baidu14paddle_serving9predictor26load_general_model_service27LoadGeneralModelServiceImplE]
I0100 00:00:00.000000 33607 service_manager.h:61] RAW: Service[GeneralModelService] insert successfully!
I0100 00:00:00.000000 33607 general_model_service.pb.h:1473] RAW: Success regist service[GeneralModelService][PN5baidu14paddle_serving9predictor13general_model23GeneralModelServiceImplE]
I0100 00:00:00.000000 33607 factory.h:121] RAW: Succ insert one factory, tag: FLUID_CPU_ANALYSIS, base type N5baidu14paddle_serving9predictor11InferEngineE
W0100 00:00:00.000000 33607 fluid_cpu_engine.cpp:25] RAW: Succ regist factory: ::baidu::paddle_serving::predictor::FluidInferEngine<FluidCpuAnalysisCore>->::baidu::paddle_serving::predictor::InferEngine, tag: FLUID_CPU_ANALYSIS in macro!
I0100 00:00:00.000000 33607 factory.h:121] RAW: Succ insert one factory, tag: FLUID_CPU_ANALYSIS_DIR, base type N5baidu14paddle_serving9predictor11InferEngineE
W0100 00:00:00.000000 33607 fluid_cpu_engine.cpp:31] RAW: Succ regist factory: ::baidu::paddle_serving::predictor::FluidInferEngine< FluidCpuAnalysisDirCore>->::baidu::paddle_serving::predictor::InferEngine, tag: FLUID_CPU_ANALYSIS_DIR in macro!
I0100 00:00:00.000000 33607 factory.h:121] RAW: Succ insert one factory, tag: FLUID_CPU_ANALYSIS_DIR_SIGMOID, base type N5baidu14paddle_serving9predictor11InferEngineE
W0100 00:00:00.000000 33607 fluid_cpu_engine.cpp:37] RAW: Succ regist factory: ::baidu::paddle_serving::predictor::FluidInferEngine< FluidCpuAnalysisDirWithSigmoidCore>->::baidu::paddle_serving::predictor::InferEngine, tag: FLUID_CPU_ANALYSIS_DIR_SIGMOID in macro!
I0100 00:00:00.000000 33607 factory.h:121] RAW: Succ insert one factory, tag: FLUID_CPU_NATIVE, base type N5baidu14paddle_serving9predictor11InferEngineE
W0100 00:00:00.000000 33607 fluid_cpu_engine.cpp:42] RAW: Succ regist factory: ::baidu::paddle_serving::predictor::FluidInferEngine<FluidCpuNativeCore>->::baidu::paddle_serving::predictor::InferEngine, tag: FLUID_CPU_NATIVE in macro!
I0100 00:00:00.000000 33607 factory.h:121] RAW: Succ insert one factory, tag: FLUID_CPU_NATIVE_DIR, base type N5baidu14paddle_serving9predictor11InferEngineE
W0100 00:00:00.000000 33607 fluid_cpu_engine.cpp:47] RAW: Succ regist factory: ::baidu::paddle_serving::predictor::FluidInferEngine<FluidCpuNativeDirCore>->::baidu::paddle_serving::predictor::InferEngine, tag: FLUID_CPU_NATIVE_DIR in macro!
I0100 00:00:00.000000 33607 factory.h:121] RAW: Succ insert one factory, tag: FLUID_CPU_NATIVE_DIR_SIGMOID, base type N5baidu14paddle_serving9predictor11InferEngineE
W0100 00:00:00.000000 33607 fluid_cpu_engine.cpp:53] RAW: Succ regist factory: ::baidu::paddle_serving::predictor::FluidInferEngine< FluidCpuNativeDirWithSigmoidCore>->::baidu::paddle_serving::predictor::InferEngine, tag: FLUID_CPU_NATIVE_DIR_SIGMOID in macro!
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [attention_lstm_fuse_pass]
--- Running IR pass [seqconv_eltadd_relu_fuse_pass]
--- Running IR pass [seqpool_cvm_concat_fuse_pass]
--- Running IR pass [fc_lstm_fuse_pass]
--- Running IR pass [mul_lstm_fuse_pass]
--- Running IR pass [fc_gru_fuse_pass]
--- Running IR pass [mul_gru_fuse_pass]
--- Running IR pass [seq_concat_fc_fuse_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [repeated_fc_relu_fuse_pass]
--- Running IR pass [squared_mat_sub_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [conv_transpose_bn_fuse_pass]
--- Running IR pass [conv_transpose_eltwiseadd_bn_fuse_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [runtime_context_cache_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [ir_graph_to_program_pass]
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
  what():

--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------

----------------------
Error Message Summary:
----------------------
Error: Operator py_func has not been registered
  [Hint: op_info_ptr should not be null.] at (/paddle/paddle/fluid/framework/op_info.h:140)
qingqing01 commented 4 years ago

Error: Operator py_func has not been registered [Hint: op_info_ptr should not be null.] at (/paddle/paddle/fluid/framework/op_info.h:140)

@ash12358 It looks like the exported model contains a py_func op. Try replacing soft-NMS with plain NMS.
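py_func wraps a Python callable that exists only in the exporting Python process, so the standalone serving binary cannot execute it. As a quick check, here is a minimal sketch that lists the ops of the exported program, assuming Paddle 1.x fluid APIs and a standard __model__ file in the serving_server directory:

import paddle.fluid as fluid

# Load the exported inference program and scan it for py_func ops.
exe = fluid.Executor(fluid.CPUPlace())
program, feed_names, fetch_targets = fluid.io.load_inference_model(
    "serving/cascade_rcnn_cls_aware_r200_vd_fpn_dcnv2_nonlocal_softnms/serving_server",
    exe)
op_types = {op.type for op in program.global_block().ops}
print("contains py_func:", "py_func" in op_types)  # True means it cannot be served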

ash12358 commented 4 years ago

@qingqing01 Thanks a lot. After changing MultiClassSoftNMS to plain NMS, the model now runs successfully under serving. But when debugging the client following https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/faster_rcnn_model, I got the following error:

/opt/Anaconda3/envs/paddle/lib/python3.7/site-packages/paddle_serving_server/serving-cpu-avx-openblas-0.2.1/serving -enable_model_toolkit -inferservice_path workdir -inferservice_file infer_service.prototxt -max_concurrency 0 -num_threads 10 -port 9292 -reload_interval_s 10 -resource_path workdir -resource_file resource.prototxt -workflow_path workdir -workflow_file workflow.prototxt -bthread_concurrency 10 -max_body_size 536870912
[op/service/factory registration and IR-pass log identical to the run above]
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
  what():


--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------

------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------

File "/opt/Anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
  attrs=kwargs.get("attrs", None))
File "/opt/Anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py", line 243, in impl
  attrs={'axis': axis})
File "/data/ssh/PaddleDetection/ppdet/modeling/backbones/fpn.py", line 93, in _add_topdown_lateral
  return lateral + topdown
File "/data/ssh/PaddleDetection/ppdet/modeling/backbones/fpn.py", line 144, in get_output
  top_output)
File "/data/ssh/PaddleDetection/ppdet/modeling/architectures/cascade_rcnn_cls_aware.py", line 95, in build
  body_feats, spatial_scale = self.fpn.get_output(body_feats)
File "/data/ssh/PaddleDetection/ppdet/modeling/architectures/cascade_rcnn_cls_aware.py", line 217, in test
  return self.build(feed_vars, 'test')
File "tools/export_serving_model.py", line 198, in main
  test_fetches = model.test(feed_vars)
File "tools/export_serving_model.py", line 217, in <module>
  main()

----------------------
Error Message Summary:
----------------------

Error: ShapeError: broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 256, 27, 40] and the shape of Y = [1, 256, 28, 40]. Received [27] in X is not equal to [28] in Y at (/paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:145) [operator < elementwise_add > error]

qingqing01 commented 4 years ago

@ash12358 When exporting a two-stage FPN model, the input image size needs to be a multiple of 32. Also, please check whether the serving preprocessing applies padding to the input. @wangjiawei04 Could you help confirm?
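To see why the shapes disagree, here is a minimal worked example, assuming each backbone stage roughly halves the spatial size with ceil rounding; the input size 427x640 is hypothetical (the log does not show it) but reproduces the [27, 40] vs [28, 40] mismatch above:

import math

h, w = 427, 640  # hypothetical input; 427 is not a multiple of 32
# FPN adds a lateral feature map to a 2x-upsampled top-down map:
lateral = (math.ceil(h / 16), math.ceil(w / 16))          # (27, 40) at stride 16
topdown = (2 * math.ceil(h / 32), 2 * math.ceil(w / 32))  # (28, 40) upsampled from stride 32
print(lateral, topdown)  # 27 != 28 -> ShapeError in elementwise_add
# Padding h up to 448 (the next multiple of 32) makes both branches (28, 40).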

bjjwwang commented 4 years ago


Could you share the client code?

You can refer to this example: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/cascade_rcnn

wget --no-check-certificate https://paddle-serving.bj.bcebos.com/pddet_demo/cascade_rcnn_r50_fpx_1x_serving.tar.gz
tar xf cascade_rcnn_r50_fpx_1x_serving.tar.gz
python -m paddle_serving_server_gpu.serve --model serving_server --port 9292 --gpu_id 0
# start the client in another terminal
python test_client.py 
ash12358 commented 4 years ago

@wangjiawei04

Client code:

# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from paddle_serving_client import Client
from paddle_serving_app.reader import Sequential
from paddle_serving_app.reader import *
import sys
import numpy as np

# Preprocessing: read image file -> RGB -> scale to [0, 1] -> normalize -> resize -> CHW
preprocess = Sequential([
    File2Image(), BGR2RGB(), Div(255.0),
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], False),
    Resize(640, 640), Transpose((2, 0, 1))
])

postprocess = RCNNPostprocess("label_list.txt", "output")
client = Client()

client.load_client_config(
    "cascade_serving/serving_client/serving_client_conf.prototxt")
client.connect(['127.0.0.1:9292'])

im = preprocess(sys.argv[3])  # sys.argv[3] is the image path
fetch_map = client.predict(
    feed={
        "image": im,
        "im_info": np.array(list(im.shape[1:]) + [1.0]),  # [h, w, scale]
        "im_shape": np.array(list(im.shape[1:]) + [1.0])
    },
    fetch=["multiclass_nms_0.tmp_0"])
#fetch_map["image"] = sys.argv[3]
#postprocess(fetch_map)
print(fetch_map)

Command: python tools/new_test_client.py cascade_serving/serving_server/serving_client_conf.prototxt cascade_serving/infer_cfg.yml obj365_val_000000505576.jpg

bjjwwang commented 4 years ago


Change the preprocessing to add PadStride(32), so the image is padded to a multiple of 32:

preprocess = Sequential([
    File2Image(), BGR2RGB(), Div(255.0),
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], False),
    Resize(640, 640), Transpose((2, 0, 1)), PadStride(32)
])
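For reference, here is a minimal sketch of what a stride-pad step like PadStride(32) does with a CHW array (illustrative only, not the paddle_serving_app implementation):

import numpy as np

def pad_stride(img, stride=32):
    # Zero-pad bottom/right so height and width become multiples of `stride`.
    c, h, w = img.shape
    ph = int(np.ceil(h / stride)) * stride
    pw = int(np.ceil(w / stride)) * stride
    out = np.zeros((c, ph, pw), dtype=np.float32)
    out[:, :h, :w] = img
    return out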

bjjwwang commented 4 years ago

By the way, did you install the paddle-serving-app package from pip, or build it from the Serving source?

ash12358 commented 4 years ago

After this change, the server no longer reports errors during debugging, but the fetch_map returned to the client is None.

WARNING: Logging before InitGoogleLogging() is written to STDERR
E0526 14:23:49.728278 12659 config_manager.cpp:217] Not found key in configue: cluster
E0526 14:23:49.728319 12659 config_manager.cpp:234] Not found key in configue: split_tag_name
E0526 14:23:49.728327 12659 config_manager.cpp:235] Not found key in configue: tag_candidates
E0526 14:23:49.728343 12659 config_manager.cpp:202] Not found key in configue: connect_timeout_ms
E0526 14:23:49.728350 12659 config_manager.cpp:203] Not found key in configue: rpc_timeout_ms
E0526 14:23:49.728355 12659 config_manager.cpp:205] Not found key in configue: hedge_request_timeout_ms
E0526 14:23:49.728361 12659 config_manager.cpp:207] Not found key in configue: connect_retry_count
E0526 14:23:49.728368 12659 config_manager.cpp:209] Not found key in configue: hedge_fetch_retry_count
E0526 14:23:49.728374 12659 config_manager.cpp:211] Not found key in configue: max_connection_per_host
E0526 14:23:49.728379 12659 config_manager.cpp:212] Not found key in configue: connection_type
E0526 14:23:49.728385 12659 config_manager.cpp:219] Not found key in configue: load_balance_strategy
E0526 14:23:49.728391 12659 config_manager.cpp:221] Not found key in configue: cluster_filter_strategy
E0526 14:23:49.728397 12659 config_manager.cpp:226] Not found key in configue: protocol
E0526 14:23:49.728404 12659 config_manager.cpp:227] Not found key in configue: compress_type
E0526 14:23:49.728410 12659 config_manager.cpp:228] Not found key in configue: package_size
E0526 14:23:49.728415 12659 config_manager.cpp:230] Not found key in configue: max_channel_per_request
E0526 14:23:49.728421 12659 config_manager.cpp:234] Not found key in configue: split_tag_name
E0526 14:23:49.728427 12659 config_manager.cpp:235] Not found key in configue: tag_candidates
I0526 14:23:49.752717 12659 naming_service_thread.cpp:209] brpc::policy::ListNamingService("127.0.0.1:9292"): added 1
W0526 14:24:09.859925 12659 predictor.hpp:129] inference call failed, message: [E1008]Reached timeout=20000ms @0.0.0.0:0
E0526 14:24:10.665114 12659 general_model.cpp:245] failed call predictor with req: insts { tensor_array { float_data: -0.45680279 float_data: -0.45166537 float_data: -0.4302634 float_data: -0.3980673 float_data: -0.37289152 float_data: -0.35405427 float_data: -0.338642 float_data: -0.30268 float_data: -0.30268 float_data: -0.30268 float_data: -0.30268 float_data: -0.30268 float_data: -0.30268 float_data: -0.27870536 float_data: -0.2684305 float_data: -0.24959326 float_data: -0.23418099 float_data: -0.2256186 float_data: -0.2050689 (remainder of the tensor data omitted)

I installed paddle-serving-app via pip.

bjjwwang commented 4 years ago


This is a timeout; the model needs a GPU machine to run. Do you have one? If not, you need to increase the timeout limit in your Python lib directory: for python2.7, for example, it is at line 89 of $PYTHONROOT/lib/python2.7/site-packages/paddle_serving_client/__init__.py.

https://github.com/PaddlePaddle/Serving/blob/develop/python/paddle_serving_client/__init__.py#L89
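If your installed client is recent enough, there may be a setter so the file does not need to be patched by hand; this is a hedged sketch, assuming your paddle_serving_client version provides set_rpc_timeout_ms (call it before connect):

from paddle_serving_client import Client

client = Client()
client.load_client_config(
    "cascade_serving/serving_client/serving_client_conf.prototxt")
client.set_rpc_timeout_ms(600000)  # raise the 20000 ms default for slow CPU inference
client.connect(['127.0.0.1:9292'])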

qingqing01 commented 4 years ago

@ash12358 The latest code now ships serving deployment support. If you run into problems, please open a new issue; closing this one for now.