PaddleDetection with TensorRT: error when running inference on yolov3_mobilenet_v1_qat

Search before asking

[X] I have searched the issues and found no similar bug report.

Describe the Bug

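This is a follow-up to an earlier issue. Since then, because my environment and system configuration had become messy, I set up a fresh system environment (listed below). Following the official tutorial, I trained a yolov3_mobilenet_v1_270e_voc model on a VOC dataset, using the default yolov3_mobilenet_v1_qat.yml as the quantization config. After updating to PaddlePaddle 2.3.0.rc0 as advised in the original issue, TRT_INT8 no longer reports the tensorrt_subgraph_pass-related error, but it now fails with a different pass error.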

The Paddle Inference command and error output are as follows:

(pdconfig) ubuntu@sunyuke:~/lxd-storage/xzy/PaddleCV/PaddleDetection$ python deploy/python/infer.py \
>   --model_dir=./inference_model/yolov3_mobilenet_v1_270e_qat/yolov3_mobilenet_v1_qat \
>   --image_file=./dataset/fire_smoke_voc/images/00001.jpg \
>   --device=GPU \
>   --run_mode=trt_int8
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
-----------  Running Arguments -----------
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
enable_mkldnn: False
image_dir: None
image_file: ./dataset/fire_smoke_voc/images/00001.jpg
model_dir: ./inference_model/yolov3_mobilenet_v1_270e_qat/yolov3_mobilenet_v1_qat
output_dir: output
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: trt_int8
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
scaled: False
threshold: 0.5
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: False
video_file: None
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Transform Order:
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
Traceback (most recent call last):
  File "deploy/python/infer.py", line 773, in <module>
    main()
  File "deploy/python/infer.py", line 726, in main
    enable_mkldnn=FLAGS.enable_mkldnn)
  File "deploy/python/infer.py", line 94, in __init__
    enable_mkldnn=enable_mkldnn)
  File "deploy/python/infer.py", line 563, in load_predictor
    predictor = create_predictor(config)
ValueError: (InvalidArgument) Pass preln_embedding_eltwise_layernorm_fuse_pass has not been registered.
  [Hint: Expected Has(pass_type) == true, but received Has(pass_type):0 != true:1.] (at /paddle/paddle/fluid/framework/ir/pass.h:242)

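The Paddle Serving command and error output are as follows (one of the two different errors below appears at random; what triggers each is unclear):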

(pdconfig) ubuntu@sunyuke:~/lxd-storage/xzy/PaddleCV/PaddleDetection/inference_model/yolov3_mobilenet_v1_270e_qat_pdserving/yolov3_mobilenet_v1_qat$ python -m paddle_serving_server.serve --model serving_server --port 9393 --gpu_ids 1 --precision int8 --use_trt
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/runpy.py:125: RuntimeWarning: 'paddle_serving_server.serve' found in sys.modules after import of package 'paddle_serving_server', but prior to execution of 'paddle_serving_server.serve'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
Going to Run Comand
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle_serving_server/serving-gpu-101-0.8.3/serving -enable_model_toolkit -inferservice_path workdir_9393 -inferservice_file infer_service.prototxt -max_concurrency 0 -num_threads 4 -port 9393 -precision int8 -use_calib=False -reload_interval_s 10 -resource_path workdir_9393 -resource_file resource.prototxt -workflow_path workdir_9393 -workflow_file workflow.prototxt -bthread_concurrency 4 -max_body_size 536870912
I0100 00:00:00.000000 31581 op_repository.h:68] RAW: Succ regist op: GeneralDistKVInferOp
I0100 00:00:00.000000 31581 op_repository.h:68] RAW: Succ regist op: GeneralDistKVQuantInferOp
I0100 00:00:00.000000 31581 op_repository.h:68] RAW: Succ regist op: GeneralInferOp
I0100 00:00:00.000000 31581 op_repository.h:68] RAW: Succ regist op: GeneralReaderOp
I0100 00:00:00.000000 31581 op_repository.h:68] RAW: Succ regist op: GeneralRecOp
I0100 00:00:00.000000 31581 op_repository.h:68] RAW: Succ regist op: GeneralResponseOp
I0100 00:00:00.000000 31581 service_manager.h:79] RAW: Service[LoadGeneralModelService] insert successfully!
I0100 00:00:00.000000 31581 load_general_model_service.pb.h:333] RAW: Success regist service[LoadGeneralModelService][PN5baidu14paddle_serving9predictor26load_general_model_service27LoadGeneralModelServiceImplE]
I0100 00:00:00.000000 31581 service_manager.h:79] RAW: Service[GeneralModelService] insert successfully!
I0100 00:00:00.000000 31581 general_model_service.pb.h:1608] RAW: Success regist service[GeneralModelService][PN5baidu14paddle_serving9predictor13general_model23GeneralModelServiceImplE]
I0100 00:00:00.000000 31581 factory.h:155] RAW: Succ insert one factory, tag: PADDLE_INFER, base type N5baidu14paddle_serving9predictor11InferEngineE
W0100 00:00:00.000000 31581 paddle_engine.cpp:34] RAW: Succ regist factory: ::baidu::paddle_serving::predictor::FluidInferEngine<PaddleInferenceEngine>->::baidu::paddle_serving::predictor::InferEngine, tag: PADDLE_INFER in macro!
I0513 15:06:24.095415 31585 analysis_predictor.cc:576] TensorRT subgraph engine is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [conv_affine_channel_fuse_pass]
--- Running IR pass [adaptive_pool2d_convert_global_pass]
--- Running IR pass [conv_eltwiseadd_affine_channel_fuse_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
I0513 15:06:24.263115 31585 fuse_pass_base.cc:57] ---  detected 47 subgraphs
--- Running IR pass [delete_quant_dequant_filter_op_pass]
I0513 15:06:24.300499 31585 fuse_pass_base.cc:57] ---  detected 47 subgraphs
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [unsqueeze2_eltwise_fuse_pass]
--- Running IR pass [squeeze2_matmul_fuse_pass]
--- Running IR pass [reshape2_matmul_fuse_pass]
--- Running IR pass [flatten2_matmul_fuse_pass]
--- Running IR pass [map_matmul_v2_to_mul_pass]
--- Running IR pass [map_matmul_v2_to_matmul_pass]
--- Running IR pass [map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
--- Running IR pass [tensorrt_subgraph_pass]
I0513 15:06:24.358453 31585 tensorrt_subgraph_pass.cc:138] ---  detect a sub-graph with 145 nodes
I0513 15:06:24.391294 31585 tensorrt_subgraph_pass.cc:395] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
  what():

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------

----------------------
Error Message Summary:
----------------------
UnimplementedError: no OpConverter for optype [nearest_interp_v2]
  [Hint: it should not be null.] (at /paddle/paddle/fluid/inference/tensorrt/convert/op_converter.h:142)

Aborted (core dumped)

(pdconfig) ubuntu@sunyuke:~/lxd-storage/xzy/PaddleCV/PaddleDetection/inference_model/yolov3_mobilenet_v1_270e_qat_pdserving/yolov3_mobilenet_v1_qat$ python -m paddle_serving_server.serve --model serving_server --port 9393 --gpu_ids 1 --precision int8 --use_trt
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/runpy.py:125: RuntimeWarning: 'paddle_serving_server.serve' found in sys.modules after import of package 'paddle_serving_server', but prior to execution of 'paddle_serving_server.serve'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
Going to Run Comand
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle_serving_server/serving-gpu-101-0.8.3/serving -enable_model_toolkit -inferservice_path workdir_9393 -inferservice_file infer_service.prototxt -max_concurrency 0 -num_threads 4 -port 9393 -precision int8 -use_calib=False -reload_interval_s 10 -resource_path workdir_9393 -resource_file resource.prototxt -workflow_path workdir_9393 -workflow_file workflow.prototxt -bthread_concurrency 4 -max_body_size 536870912
I0100 00:00:00.000000 31237 op_repository.h:68] RAW: Succ regist op: GeneralDistKVInferOp
I0100 00:00:00.000000 31237 op_repository.h:68] RAW: Succ regist op: GeneralDistKVQuantInferOp
I0100 00:00:00.000000 31237 op_repository.h:68] RAW: Succ regist op: GeneralInferOp
I0100 00:00:00.000000 31237 op_repository.h:68] RAW: Succ regist op: GeneralReaderOp
I0100 00:00:00.000000 31237 op_repository.h:68] RAW: Succ regist op: GeneralRecOp
I0100 00:00:00.000000 31237 op_repository.h:68] RAW: Succ regist op: GeneralResponseOp
I0100 00:00:00.000000 31237 service_manager.h:79] RAW: Service[LoadGeneralModelService] insert successfully!
I0100 00:00:00.000000 31237 load_general_model_service.pb.h:333] RAW: Success regist service[LoadGeneralModelService][PN5baidu14paddle_serving9predictor26load_general_model_service27LoadGeneralModelServiceImplE]
I0100 00:00:00.000000 31237 service_manager.h:79] RAW: Service[GeneralModelService] insert successfully!
I0100 00:00:00.000000 31237 general_model_service.pb.h:1608] RAW: Success regist service[GeneralModelService][PN5baidu14paddle_serving9predictor13general_model23GeneralModelServiceImplE]
I0100 00:00:00.000000 31237 factory.h:155] RAW: Succ insert one factory, tag: PADDLE_INFER, base type N5baidu14paddle_serving9predictor11InferEngineE
W0100 00:00:00.000000 31237 paddle_engine.cpp:34] RAW: Succ regist factory: ::baidu::paddle_serving::predictor::FluidInferEngine<PaddleInferenceEngine>->::baidu::paddle_serving::predictor::InferEngine, tag: PADDLE_INFER in macro!
I0513 12:57:24.111806 31240 analysis_predictor.cc:576] TensorRT subgraph engine is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [conv_affine_channel_fuse_pass]
--- Running IR pass [adaptive_pool2d_convert_global_pass]
--- Running IR pass [conv_eltwiseadd_affine_channel_fuse_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
I0513 12:57:24.282593 31240 fuse_pass_base.cc:57] ---  detected 47 subgraphs
--- Running IR pass [delete_quant_dequant_filter_op_pass]
I0513 12:57:24.320891 31240 fuse_pass_base.cc:57] ---  detected 47 subgraphs
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [unsqueeze2_eltwise_fuse_pass]
--- Running IR pass [squeeze2_matmul_fuse_pass]
--- Running IR pass [reshape2_matmul_fuse_pass]
--- Running IR pass [flatten2_matmul_fuse_pass]
--- Running IR pass [map_matmul_v2_to_mul_pass]
--- Running IR pass [map_matmul_v2_to_matmul_pass]
--- Running IR pass [map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
--- Running IR pass [tensorrt_subgraph_pass]
I0513 12:57:24.429746 31240 tensorrt_subgraph_pass.cc:138] ---  detect a sub-graph with 8 nodes
I0513 12:57:24.433738 31240 tensorrt_subgraph_pass.cc:395] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
W0513 12:57:24.989765 31240 helper.h:107] Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32.
E0513 12:57:24.990360 31240 helper.h:111] Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.
E0513 12:57:24.990375 31240 helper.h:111] Builder failed while configuring INT8 mode.
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
  what():

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------

----------------------
Error Message Summary:
----------------------
FatalError: Build TensorRT cuda engine failed! Please recheck you configurations related to paddle-TensorRT.
  [Hint: infer_engine_ should not be null.] (at /paddle/paddle/fluid/inference/tensorrt/engine.cc:252)

Aborted (core dumped)
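The two Paddle Serving runs above differ in more than the final error: the first detects a TensorRT sub-graph with 145 nodes, the second one with 8 nodes. A minimal sketch for pulling those facts out of a saved log, so that repeated runs can be compared side by side, might look like the following. The helper name `summarize_log` is hypothetical, and the regular expressions only assume the line formats visible in the logs above.

```python
import re
from typing import Dict, List, Optional

# Patterns assume the log line formats shown in this issue.
SUBGRAPH_RE = re.compile(r"detect a sub-graph with (\d+) nodes")
PASS_RE = re.compile(r"--- Running IR pass \[(\w+)\]")
ERROR_RE = re.compile(r"^(\w+Error): (.*)$")


def summarize_log(text: str) -> Dict[str, object]:
    """Collect the last IR pass, the TRT sub-graph sizes, and the error summary line."""
    subgraph_nodes: List[int] = []
    last_pass: Optional[str] = None
    error: Optional[str] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = SUBGRAPH_RE.search(line)
        if match:
            subgraph_nodes.append(int(match.group(1)))
        match = PASS_RE.search(line)
        if match:
            last_pass = match.group(1)
        match = ERROR_RE.match(line)
        if match:
            # Keep the most recent "<Something>Error: ..." line.
            error = f"{match.group(1)}: {match.group(2)}"
    return {"last_pass": last_pass, "subgraph_nodes": subgraph_nodes, "error": error}


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py serving_run.log
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(summarize_log(handle.read()))
```

Running it over the log of each attempt gives a compact record of which sub-graph size goes with which error, which may help narrow down when each failure occurs.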

Environment

paddlepaddle-gpu=2.3.0.rc0.post101 paddledet=2.3.0 paddleslim=2.2.2 paddle-serving-server-gpu=0.8.3.post101

Ubuntu 18.04 Python 3.7

Nvidia Driver 430.64 CUDA 10.1 cudnn 7.6.5 TensorRT=6.0.1.5
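For anyone trying to reproduce this, the package versions listed above can be read back from the active environment with a short script. This is only a sketch: the helper name `installed_versions` is hypothetical, the distribution names are copied from the list above, and on Python 3.7 it assumes the `importlib_metadata` backport is installed.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Assumption: on Python 3.7 the importlib_metadata backport is available.
    import importlib_metadata as metadata

# Distribution names as they appear in the environment list of this issue.
PACKAGES = [
    "paddlepaddle-gpu",
    "paddledet",
    "paddleslim",
    "paddle-serving-server-gpu",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}={version or 'not installed'}")
```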

Are you willing to submit a PR?

[x] Yes I'd like to help by submitting a PR!

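Update: the results above are for the QAT-quantized yolov3_mobilenet_v1_qat, exported both as an inference model and as a serving model. Without TensorRT (that is, without --run_mode=trt_int8 or --use_trt) both run normally. I then tested the unquantized yolov3_mobilenet_v1_270e_voc model and the yolov3_darknet53_270e_voc model under Paddle Inference with --run_mode=trt_int8, and both report the same error: Pass preln_embedding_eltwise_layernorm_fuse_pass has not been registered.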

The error output is as follows:

(pdconfig) ubuntu@sunyuke:~/lxd-storage/xzy/PaddleCV/PaddleDetection$ python deploy/python/infer.py --model_dir=./inference_model/yolov3_darknet53_270e_voc_origin --image_file=./dataset/fire_smoke_voc/images/00001.jpg --device=GPU --run_mode=trt_int8
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
-----------  Running Arguments -----------
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
enable_mkldnn: False
image_dir: None
image_file: ./dataset/fire_smoke_voc/images/00001.jpg
model_dir: ./inference_model/yolov3_darknet53_270e_voc_origin
output_dir: output
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: trt_int8
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
scaled: False
threshold: 0.5
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: False
video_file: None
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Transform Order:
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
Traceback (most recent call last):
  File "deploy/python/infer.py", line 773, in <module>
    main()
  File "deploy/python/infer.py", line 726, in main
    enable_mkldnn=FLAGS.enable_mkldnn)
  File "deploy/python/infer.py", line 94, in __init__
    enable_mkldnn=enable_mkldnn)
  File "deploy/python/infer.py", line 563, in load_predictor
    predictor = create_predictor(config)
ValueError: (InvalidArgument) Pass preln_embedding_eltwise_layernorm_fuse_pass has not been registered.
  [Hint: Expected Has(pass_type) == true, but received Has(pass_type):0 != true:1.] (at /paddle/paddle/fluid/framework/ir/pass.h:242)

(pdconfig) ubuntu@sunyuke:~/lxd-storage/xzy/PaddleCV/PaddleDetection$ python deploy/python/infer.py --model_dir=./inference_model/yolov3_mobilenet_v1_270e_voc_origin --image_file=./dataset/fire_smoke_voc/images/00001.jpg --device=GPU --run_mode=trt_int8
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/home/ubuntu/anaconda3/envs/pdconfig/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
-----------  Running Arguments -----------
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
enable_mkldnn: False
image_dir: None
image_file: ./dataset/fire_smoke_voc/images/00001.jpg
model_dir: ./inference_model/yolov3_mobilenet_v1_270e_voc_origin
output_dir: output
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: trt_int8
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
scaled: False
threshold: 0.5
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: False
video_file: None
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Transform Order:
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
Traceback (most recent call last):
  File "deploy/python/infer.py", line 773, in <module>
    main()
  File "deploy/python/infer.py", line 726, in main
    enable_mkldnn=FLAGS.enable_mkldnn)
  File "deploy/python/infer.py", line 94, in __init__
    enable_mkldnn=enable_mkldnn)
  File "deploy/python/infer.py", line 563, in load_predictor
    predictor = create_predictor(config)
ValueError: (InvalidArgument) Pass preln_embedding_eltwise_layernorm_fuse_pass has not been registered.
  [Hint: Expected Has(pass_type) == true, but received Has(pass_type):0 != true:1.] (at /paddle/paddle/fluid/framework/ir/pass.h:242)
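Because the unquantized models fail in the same way, the problem does not appear to be specific to quantization. One thing that may be worth recording alongside the environment is which TensorRT version the installed Paddle wheel was compiled against and which one it finds at runtime. The sketch below uses `paddle.inference.get_trt_compile_version()` and `paddle.inference.get_trt_runtime_version()`; the wrapper name `trt_versions` is hypothetical. It does not diagnose the unregistered-pass error by itself; it only helps rule a build/runtime TensorRT mismatch in or out.

```python
def trt_versions():
    """Return the TensorRT versions Paddle reports, or None if Paddle is not importable."""
    try:
        import paddle.inference as paddle_infer
    except ImportError:
        return None
    return {
        # Version of TensorRT the wheel was compiled against.
        "compile": tuple(paddle_infer.get_trt_compile_version()),
        # Version of the TensorRT library found at runtime.
        "runtime": tuple(paddle_infer.get_trt_runtime_version()),
    }


if __name__ == "__main__":
    versions = trt_versions()
    if versions is None:
        print("paddle is not installed in this environment")
    else:
        print("compiled against TensorRT:", versions["compile"])
        print("runtime TensorRT:", versions["runtime"])
```

The output can be compared with the TensorRT 6.0.1.5 installation listed in the environment section.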
