PaddlePaddle / Paddle-Lite

PaddlePaddle High Performance Deep Learning Inference Engine for Mobile and Edge
https://www.paddlepaddle.org.cn/lite
Apache License 2.0

Wrong inference results when running on Kirin 990 #6094

Closed ZEROICEWANG closed 7 months ago

ZEROICEWANG commented 3 years ago

I built MobileNetV3 as a dynamic-graph (dygraph) model, converted it to a static graph with ProgramTranslator, and then converted the model with opt. When it runs on a Huawei phone using the NPU, the output is identical regardless of the input data (the static-graph model that was converted to the .nb file infers correctly on the PC). Where might the problem be? Thanks! In addition, the .nb file obtained by converting the officially provided MobileNetV1 static-graph model runs correctly on the phone. Versions: paddlepaddle==2.0, paddlelite==2.8

Code for converting the dynamic graph to a static graph:

import numpy as np
import paddle
import paddle.fluid as fluid

model = MobileNetV3()  # defined in the "MobileNetV3 code" section below
# Load the dygraph checkpoint and restore the parameters
stage, _ = fluid.load_dygraph('./models/' + 'CPD_MNV3')
model.set_state_dict(stage)
# Run one forward pass so the to_static program is traced
in_np = np.ones([1, 3, 224, 224]).astype('float32')
input_var = paddle.to_tensor(in_np)
out = model(input_var)
# Save the static-graph model (model.pdmodel / model.pdiparams)
save_dirname = './static_model_pt/model'
paddle.jit.save(model, save_dirname, input_spec=[input_var])
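To sanity-check the export before running opt, the saved static-graph model can be reloaded on the desktop with paddle.jit.load and compared against the dygraph model. This is a minimal sketch; the random input and the comparison step are illustrative only:

import numpy as np
import paddle

# Reload the static-graph model saved by paddle.jit.save above
loaded = paddle.jit.load('./static_model_pt/model')
loaded.eval()
model.eval()  # the dygraph MobileNetV3 instance from the export script above

x = paddle.to_tensor(np.random.rand(1, 3, 224, 224).astype('float32'))
static_out = loaded(x).numpy()
dygraph_out = model(x).numpy()
# The two outputs should agree closely if the export is correct
print('max abs diff:', np.abs(static_out - dygraph_out).max())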

opt conversion code:

from paddlelite.lite import Opt

opt = Opt()
# 1. Set the input model files
opt.set_model_file("./static_model_pt/model.pdmodel")
opt.set_param_file("./static_model_pt/model.pdiparams")
# 2. Set the valid target places: arm, x86, opencl, xpu, npu
opt.set_valid_places("arm,npu")
# 3. Set the output model format: naive_buffer or protobuf
opt.set_model_type("naive_buffer")
# 4. Set the output model path
opt.set_optimize_out("model_opt_pt")
# 5. Run the optimization
opt.run()
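As a debugging aid, the same Opt API can be used to produce both an ARM-only model and an ARM+NPU model from one static graph, so the two .nb files can be compared on the device. This is a sketch reusing only the calls shown above; the output file names are arbitrary:

from paddlelite.lite import Opt

def convert(valid_places, out_name):
    # Convert the same static-graph model for the given target places
    opt = Opt()
    opt.set_model_file("./static_model_pt/model.pdmodel")
    opt.set_param_file("./static_model_pt/model.pdiparams")
    opt.set_valid_places(valid_places)
    opt.set_model_type("naive_buffer")
    opt.set_optimize_out(out_name)
    opt.run()

convert("arm", "model_opt_arm")      # CPU-only reference model
convert("arm,npu", "model_opt_npu")  # model under test with the NPU enabled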

Part of the output produced during the opt conversion:

WARNING: Logging before InitGoogleLogging() is written to STDERR
I0516 19:51:09.079234  4772 cxx_api.cc:295] Load model from file.
I0516 19:51:09.188239  4772 optimizer.h:285] == Running pass: lite_quant_dequant_fuse_pass
I0516 19:51:09.222232  4772 optimizer.h:309] == Finished running: lite_quant_dequant_fuse_pass
I0516 19:51:09.222232  4772 optimizer.h:285] == Running pass: weight_quantization_preprocess_pass
I0516 19:51:09.223233  4772 optimizer.h:309] == Finished running: weight_quantization_preprocess_pass
I0516 19:51:09.223233  4772 optimizer.h:285] == Running pass: adaptive_1x1_pool2d_convert_global_pass
I0516 19:51:09.224234  4772 optimizer.h:309] == Finished running: adaptive_1x1_pool2d_convert_global_pass
I0516 19:51:09.224234  4772 optimizer.h:285] == Running pass: lite_conv_elementwise_fuse_pass
I0516 19:51:09.263233  4772 pattern_matcher.cc:108] detected 67 subgraph
I0516 19:51:09.267233  4772 optimizer.h:309] == Finished running: lite_conv_elementwise_fuse_pass
I0516 19:51:09.267233  4772 optimizer.h:285] == Running pass: lite_conv_bn_fuse_pass
I0516 19:51:09.376235  4772 pattern_matcher.cc:108] detected 65 subgraph
I0516 19:51:09.410238  4772 optimizer.h:309] == Finished running: lite_conv_bn_fuse_pass
I0516 19:51:09.410238  4772 optimizer.h:285] == Running pass: lite_conv_elementwise_fuse_pass
I0516 19:51:09.431236  4772 optimizer.h:309] == Finished running: lite_conv_elementwise_fuse_pass
I0516 19:51:09.431236  4772 optimizer.h:285] == Running pass: lite_conv_conv_fuse_pass
I0516 19:51:09.431236  4772 optimizer.h:309] == Finished running: lite_conv_conv_fuse_pass
I0516 19:51:09.431236  4772 optimizer.h:285] == Running pass: lite_conv_activation_fuse_pass
I0516 19:51:09.459235  4772 pattern_matcher.cc:108] detected 18 subgraph
I0516 19:51:09.487236  4772 optimizer.h:309] == Finished running: lite_conv_activation_fuse_pass
I0516 19:51:09.487236  4772 optimizer.h:285] == Running pass: lite_var_conv_2d_activation_fuse_pass
I0516 19:51:09.487236  4772 optimizer.h:298]    - Skip lite_var_conv_2d_activation_fuse_pass because the target or kernel does not match.
I0516 19:51:09.487236  4772 optimizer.h:285] == Running pass: lite_match_matrix_activation_fuse_pass
I0516 19:51:09.487236  4772 optimizer.h:298]    - Skip lite_match_matrix_activation_fuse_pass because the target or kernel does not match.
I0516 19:51:09.487236  4772 optimizer.h:285] == Running pass: lite_squeeze2_matmul_fuse_pass
I0516 19:51:09.499238  4772 pattern_matcher.cc:108] detected 14 subgraph
I0516 19:51:09.500238  4772 optimizer.h:309] == Finished running: lite_squeeze2_matmul_fuse_pass
I0516 19:51:09.500238  4772 optimizer.h:285] == Running pass: lite_reshape2_matmul_fuse_pass
I0516 19:51:09.501240  4772 optimizer.h:309] == Finished running: lite_reshape2_matmul_fuse_pass
I0516 19:51:09.502235  4772 optimizer.h:285] == Running pass: lite_matmul_fuse_pass
I0516 19:51:09.505239  4772 pattern_matcher.cc:108] detected 14 subgraph
I0516 19:51:09.506235  4772 optimizer.h:309] == Finished running: lite_matmul_fuse_pass
I0516 19:51:09.506235  4772 optimizer.h:285] == Running pass: lite_fc_fuse_pass
I0516 19:51:09.513239  4772 pattern_matcher.cc:108] detected 14 subgraph
I0516 19:51:09.518236  4772 pattern_matcher.cc:108] detected 14 subgraph
I0516 19:51:09.518236  4772 optimizer.h:309] == Finished running: lite_fc_fuse_pass
I0516 19:51:09.518236  4772 optimizer.h:285] == Running pass: lite_shuffle_channel_fuse_pass
I0516 19:51:09.519238  4772 optimizer.h:309] == Finished running: lite_shuffle_channel_fuse_pass
I0516 19:51:09.519238  4772 optimizer.h:285] == Running pass: lite_transpose_softmax_transpose_fuse_pass
I0516 19:51:09.520239  4772 optimizer.h:309] == Finished running: lite_transpose_softmax_transpose_fuse_pass
I0516 19:51:09.520239  4772 optimizer.h:285] == Running pass: lite_interpolate_fuse_pass
I0516 19:51:09.523238  4772 optimizer.h:309] == Finished running: lite_interpolate_fuse_pass
I0516 19:51:09.523238  4772 optimizer.h:285] == Running pass: identity_scale_eliminate_pass
I0516 19:51:09.541239  4772 pattern_matcher.cc:108] detected 1 subgraph
I0516 19:51:09.542235  4772 optimizer.h:309] == Finished running: identity_scale_eliminate_pass
I0516 19:51:09.542235  4772 optimizer.h:285] == Running pass: lite_scales_fuse_pass
I0516 19:51:09.542235  4772 optimizer.h:298]    - Skip lite_scales_fuse_pass because the target or kernel does not match.
I0516 19:51:09.542235  4772 optimizer.h:285] == Running pass: lite_sequence_reverse_embedding_fuse_pass
I0516 19:51:09.542235  4772 optimizer.h:298]    - Skip lite_sequence_reverse_embedding_fuse_pass because the target or kernel does not match.
I0516 19:51:09.542235  4772 optimizer.h:285] == Running pass: elementwise_mul_constant_eliminate_pass
I0516 19:51:09.566237  4772 optimizer.h:309] == Finished running: elementwise_mul_constant_eliminate_pass
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: lite_sequence_pool_concat_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip lite_sequence_pool_concat_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: lite_scale_activation_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip lite_scale_activation_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: lite_instance_norm_activation_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip lite_instance_norm_activation_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: lite_fc_prelu_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip lite_fc_prelu_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: identity_dropout_eliminate_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip identity_dropout_eliminate_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: __xpu__resnet_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip __xpu__resnet_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: __xpu__resnet_d_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip __xpu__resnet_d_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: __xpu__resnet_cbam_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip __xpu__resnet_cbam_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: __xpu__conv2d_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip __xpu__conv2d_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: __xpu__resblock_reduction_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip __xpu__resblock_reduction_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: __xpu__resblock_normal_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip __xpu__resblock_normal_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: __xpu__conv2d_link_previous_out_max_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip __xpu__conv2d_link_previous_out_max_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: __xpu__sfa_head_meanstd_fuse_pass
I0516 19:51:09.566237  4772 optimizer.h:298]    - Skip __xpu__sfa_head_meanstd_fuse_pass because the target or kernel does not match.
I0516 19:51:09.566237  4772 optimizer.h:285] == Running pass: __xpu__sfa_head_moment_fuse_pass
I0516 19:51:09.567234  4772 optimizer.h:298]    - Skip __xpu__sfa_head_moment_fuse_pass because the target or kernel does not match.
I0516 19:51:09.567234  4772 optimizer.h:285] == Running pass: __xpu__mmdnn_fuse_pass
I0516 19:51:09.567234  4772 optimizer.h:298]    - Skip __xpu__mmdnn_fuse_pass because the target or kernel does not match.
I0516 19:51:09.571233  4772 optimizer.h:285] == Running pass: __xpu__multi_encoder_fuse_pass
I0516 19:51:09.571233  4772 optimizer.h:298]    - Skip __xpu__multi_encoder_fuse_pass because the target or kernel does not match.
I0516 19:51:09.571233  4772 optimizer.h:285] == Running pass: __xpu__embedding_with_eltwise_add_fuse_pass
I0516 19:51:09.571233  4772 optimizer.h:298]    - Skip __xpu__embedding_with_eltwise_add_fuse_pass because the target or kernel does not match.
I0516 19:51:09.571233  4772 optimizer.h:285] == Running pass: __xpu__fc_fuse_pass
I0516 19:51:09.571233  4772 optimizer.h:298]    - Skip __xpu__fc_fuse_pass because the target or kernel does not match.
I0516 19:51:09.571233  4772 optimizer.h:285] == Running pass: __xpu__softmax_topk_fuse_pass
I0516 19:51:09.571233  4772 optimizer.h:298]    - Skip __xpu__softmax_topk_fuse_pass because the target or kernel does not match.
I0516 19:51:09.571233  4772 optimizer.h:285] == Running pass: __xpu__multi_encoder_slice_link_fuse_pass
I0516 19:51:09.571233  4772 optimizer.h:298]    - Skip __xpu__multi_encoder_slice_link_fuse_pass because the target or kernel does not match.
I0516 19:51:09.571233  4772 optimizer.h:285] == Running pass: __xpu__generate_sequence_fuse_pass
I0516 19:51:09.571233  4772 optimizer.h:298]    - Skip __xpu__generate_sequence_fuse_pass because the target or kernel does not match.
I0516 19:51:09.571233  4772 optimizer.h:285] == Running pass: quantized_op_attributes_inference_pass
I0516 19:51:09.571233  4772 optimizer.h:298]    - Skip quantized_op_attributes_inference_pass because the target or kernel does not match.
I0516 19:51:09.571233  4772 optimizer.h:285] == Running pass: restrict_quantized_op_with_same_input_output_scale_pass
I0516 19:51:09.571233  4772 optimizer.h:298]    - Skip restrict_quantized_op_with_same_input_output_scale_pass because the target or kernel does not match.
I0516 19:51:09.571233  4772 optimizer.h:285] == Running pass: npu_subgraph_pass
I0516 19:51:09.628247  4772 optimizer.h:309] == Finished running: npu_subgraph_pass
I0516 19:51:09.628247  4772 optimizer.h:285] == Running pass: huawei_ascend_npu_subgraph_pass
I0516 19:51:09.628247  4772 optimizer.h:298]    - Skip huawei_ascend_npu_subgraph_pass because the target or kernel does not match.
I0516 19:51:09.628247  4772 optimizer.h:285] == Running pass: imagination_nna_subgraph_pass
I0516 19:51:09.628247  4772 optimizer.h:298]    - Skip imagination_nna_subgraph_pass because the target or kernel does not match.
I0516 19:51:09.628247  4772 optimizer.h:285] == Running pass: xpu_subgraph_pass
I0516 19:51:09.628247  4772 optimizer.h:298]    - Skip xpu_subgraph_pass because the target or kernel does not match.
I0516 19:51:09.628247  4772 optimizer.h:285] == Running pass: bm_subgraph_pass
I0516 19:51:09.628247  4772 optimizer.h:298]    - Skip bm_subgraph_pass because the target or kernel does not match.
I0516 19:51:09.628247  4772 optimizer.h:285] == Running pass: apu_subgraph_pass
I0516 19:51:09.628247  4772 optimizer.h:298]    - Skip apu_subgraph_pass because the target or kernel does not match.
I0516 19:51:09.628247  4772 optimizer.h:285] == Running pass: rknpu_subgraph_pass
I0516 19:51:09.628247  4772 optimizer.h:298]    - Skip rknpu_subgraph_pass because the target or kernel does not match.
I0516 19:51:09.628247  4772 optimizer.h:285] == Running pass: mlu_subgraph_pass
I0516 19:51:09.628247  4772 optimizer.h:298]    - Skip mlu_subgraph_pass because the target or kernel does not match.
I0516 19:51:09.628247  4772 optimizer.h:285] == Running pass: control_flow_op_unused_inputs_and_outputs_eliminate_pass
I0516 19:51:09.629238  4772 optimizer.h:309] == Finished running: control_flow_op_unused_inputs_and_outputs_eliminate_pass
I0516 19:51:09.629238  4772 optimizer.h:285] == Running pass: static_kernel_pick_pass
I0516 19:51:09.633236  4772 optimizer.h:309] == Finished running: static_kernel_pick_pass
I0516 19:51:09.633236  4772 optimizer.h:285] == Running pass: remove_tf_redundant_ops_pass
I0516 19:51:09.634240  4772 optimizer.h:309] == Finished running: remove_tf_redundant_ops_pass
I0516 19:51:09.640236  4772 optimizer.h:285] == Running pass: variable_place_inference_pass
I0516 19:51:09.643235  4772 optimizer.h:309] == Finished running: variable_place_inference_pass
I0516 19:51:09.643235  4772 optimizer.h:285] == Running pass: mlu_postprocess_pass
I0516 19:51:09.643235  4772 optimizer.h:298]    - Skip mlu_postprocess_pass because the target or kernel does not match.
I0516 19:51:09.643235  4772 optimizer.h:285] == Running pass: argument_type_display_pass
I0516 19:51:09.643235  4772 optimizer.h:309] == Finished running: argument_type_display_pass
I0516 19:51:09.643235  4772 optimizer.h:285] == Running pass: type_target_cast_pass
I0516 19:51:09.644237  4772 optimizer.h:309] == Finished running: type_target_cast_pass
I0516 19:51:09.644237  4772 optimizer.h:285] == Running pass: variable_place_inference_pass
I0516 19:51:09.645236  4772 optimizer.h:309] == Finished running: variable_place_inference_pass
I0516 19:51:09.645236  4772 optimizer.h:285] == Running pass: argument_type_display_pass
I0516 19:51:09.645236  4772 optimizer.h:309] == Finished running: argument_type_display_pass
I0516 19:51:09.645236  4772 optimizer.h:285] == Running pass: io_copy_kernel_pick_pass
I0516 19:51:09.645236  4772 optimizer.h:309] == Finished running: io_copy_kernel_pick_pass
I0516 19:51:09.645236  4772 optimizer.h:285] == Running pass: argument_type_display_pass
I0516 19:51:09.645236  4772 optimizer.h:309] == Finished running: argument_type_display_pass
I0516 19:51:09.645236  4772 optimizer.h:285] == Running pass: variable_place_inference_pass
I0516 19:51:09.646234  4772 optimizer.h:309] == Finished running: variable_place_inference_pass
I0516 19:51:09.646234  4772 optimizer.h:285] == Running pass: argument_type_display_pass
I0516 19:51:09.646234  4772 optimizer.h:309] == Finished running: argument_type_display_pass
I0516 19:51:09.646234  4772 optimizer.h:285] == Running pass: type_precision_cast_pass
I0516 19:51:09.647238  4772 optimizer.h:309] == Finished running: type_precision_cast_pass
I0516 19:51:09.647238  4772 optimizer.h:285] == Running pass: variable_place_inference_pass
I0516 19:51:09.648236  4772 optimizer.h:309] == Finished running: variable_place_inference_pass
I0516 19:51:09.648236  4772 optimizer.h:285] == Running pass: argument_type_display_pass
I0516 19:51:09.648236  4772 optimizer.h:309] == Finished running: argument_type_display_pass
I0516 19:51:09.648236  4772 optimizer.h:285] == Running pass: type_layout_cast_pass
I0516 19:51:09.649240  4772 optimizer.h:309] == Finished running: type_layout_cast_pass
I0516 19:51:09.649240  4772 optimizer.h:285] == Running pass: argument_type_display_pass
I0516 19:51:09.649240  4772 optimizer.h:309] == Finished running: argument_type_display_pass
I0516 19:51:09.649240  4772 optimizer.h:285] == Running pass: variable_place_inference_pass
I0516 19:51:09.650236  4772 optimizer.h:309] == Finished running: variable_place_inference_pass
I0516 19:51:09.650236  4772 optimizer.h:285] == Running pass: argument_type_display_pass
I0516 19:51:09.651235  4772 optimizer.h:309] == Finished running: argument_type_display_pass
I0516 19:51:09.651235  4772 optimizer.h:285] == Running pass: runtime_context_assign_pass
I0516 19:51:09.651235  4772 optimizer.h:309] == Finished running: runtime_context_assign_pass
I0516 19:51:09.651235  4772 optimizer.h:285] == Running pass: argument_type_display_pass
I0516 19:51:09.651235  4772 optimizer.h:309] == Finished running: argument_type_display_pass
I0516 19:51:09.651235  4772 optimizer.h:285] == Running pass: lite_inplace_fuse_pass
I0516 19:51:09.651235  4772 pattern_matcher.cc:108] detected 1 subgraph
I0516 19:51:09.652235  4772 optimizer.h:309] == Finished running: lite_inplace_fuse_pass
I0516 19:51:09.652235  4772 optimizer.h:285] == Running pass: memory_optimize_pass
I0516 19:51:09.652235  4772 optimizer.h:298]    - Skip memory_optimize_pass because the target or kernel does not match.
I0516 19:51:09.653234  4772 generate_program_pass.h:37] insts.size 1
I0516 19:51:09.711238  4772 model_parser.cc:481] Save naive buffer model in model_opt_pt.nb successfully

MobileNetV3 code:

import paddle
import paddle.fluid as fluid
import numpy
import sys
from paddle.fluid.dygraph.jit import declarative
import time
import numpy as np

class h_swish(fluid.dygraph.Layer):
    def __init__(self):
        super(h_swish, self).__init__()

    def forward(self, x):
        out = fluid.layers.relu6(x + 3.) / 6.
        return out * x

class h_sigmoid(fluid.dygraph.Layer):
    def __init__(self):
        super(h_sigmoid, self).__init__()

    def forward(self, x):
        out = fluid.layers.relu6(x + 3.) / 6.
        return out

class SqueezeBlock(fluid.dygraph.Layer):
    def __init__(self, exp_size, divide=4):
        super(SqueezeBlock, self).__init__()
        self.dense = fluid.dygraph.Sequential(
            fluid.dygraph.Linear(exp_size, exp_size // divide, act='relu'),
            fluid.dygraph.Linear(exp_size // divide, exp_size),
            h_sigmoid()
        )

    def forward(self, x):
        out = fluid.layers.adaptive_pool2d(x, pool_size=[1, 1], pool_type='avg')
        out = fluid.layers.squeeze(out, axes=[2, 3])
        out = self.dense(out)
        out = fluid.layers.unsqueeze(out, axes=[2, 3])
        return out * x

class Relu6_(fluid.dygraph.Layer):
    def __init__(self):
        super(Relu6_, self).__init__()

    def forward(self, x):
        return fluid.layers.relu6(x)

class MobileBlock(fluid.dygraph.Layer):
    def __init__(self, in_channels, out_channels, kernal_size, stride, activation, SE, exp_size):
        super(MobileBlock, self).__init__()
        self.out_channels = out_channels
        self.SE = SE
        padding = (kernal_size - 1) // 2

        self.use_connect = stride == 1 and in_channels == out_channels

        self.conv = fluid.dygraph.Sequential(
            fluid.dygraph.Conv2D(in_channels, exp_size, filter_size=1, stride=1, padding=0),
            fluid.dygraph.BatchNorm(exp_size),
            activation()
        )
        self.depth_conv = fluid.dygraph.Sequential(
            fluid.dygraph.Conv2D(exp_size, exp_size, filter_size=kernal_size, stride=stride, padding=padding,
                                 groups=exp_size),
            fluid.dygraph.BatchNorm(exp_size),
        )

        if self.SE:
            self.squeeze_block = SqueezeBlock(exp_size)

        self.point_conv = fluid.dygraph.Sequential(
            fluid.dygraph.Conv2D(exp_size, out_channels, filter_size=1, stride=1, padding=0),
            fluid.dygraph.BatchNorm(out_channels),
            activation()
        )

    def forward(self, x):
        out = self.conv(x)
        out = self.depth_conv(out)

        # Squeeze and Excite
        if self.SE:
            out = self.squeeze_block(out)

        # point-wise conv
        out = self.point_conv(out)

        # connection
        if self.use_connect:
            return x + out
        else:
            return out

class LayerBlock(fluid.dygraph.Layer):
    def __init__(self, layer_list=None):
        super(LayerBlock, self).__init__()
        self.layer_list = layer_list
        self.MobileBlock = []
        for i in range(len(layer_list)):
            in_channels, out_channels, kernal_size, stride, nonlinear, se, exp_size = layer_list[i]
            _mobileblock = self.add_sublayer('%d' % i,
                                             MobileBlock(in_channels, out_channels, kernal_size, stride, nonlinear, se,
                                                         exp_size))
            self.MobileBlock.append(_mobileblock)

    def forward(self, x):
        for _mobileblock in self.MobileBlock:
            x = _mobileblock(x)
        return x

class MobileNetV3(fluid.dygraph.Layer):
    def __init__(self, dropout_rate=0.0):
        super(MobileNetV3, self).__init__()
        layers1 = [
            [16, 16, 3, 1, Relu6_, False, 16],
            [16, 24, 3, 2, Relu6_, False, 64],
            [24, 24, 3, 1, Relu6_, False, 72],
            [24, 40, 3, 2, Relu6_, True, 72],
            [40, 40, 3, 1, Relu6_, True, 120],
            [40, 40, 3, 1, Relu6_, True, 120],
            [40, 40, 3, 1, Relu6_, True, 120],

            [40, 40, 3, 1, Relu6_, True, 120],
            [40, 40, 3, 1, Relu6_, True, 120],

            [40, 80, 3, 2, h_swish, False, 240],
            [80, 80, 3, 1, h_swish, False, 200],
            [80, 80, 3, 1, h_swish, False, 184],
            [80, 80, 3, 1, h_swish, False, 184],

            [80, 112, 3, 1, h_swish, True, 480],
            [112, 112, 3, 1, h_swish, True, 672],
            [112, 160, 3, 1, h_swish, True, 672],
            [160, 160, 3, 1, h_swish, True, 672],
            [160, 160, 3, 2, h_swish, True, 672],
            [160, 160, 3, 1, h_swish, True, 672],
            [160, 160, 3, 1, h_swish, True, 960],
            [160, 160, 3, 1, h_swish, True, 960],
        ]

        init_conv_out = 16
        self.init_conv = fluid.dygraph.Sequential(
            fluid.dygraph.Conv2D(num_channels=3, num_filters=init_conv_out, filter_size=3, stride=2, padding=1),
            fluid.dygraph.BatchNorm(init_conv_out),
            h_swish(),
        )

        self.block = LayerBlock(layers1)

        out_conv1_in = 160
        out_conv1_out = 960
        self.out_conv1 = fluid.dygraph.Sequential(
            fluid.dygraph.Conv2D(out_conv1_in, out_conv1_out, filter_size=1, stride=1),
            fluid.dygraph.BatchNorm(out_conv1_out),
            h_swish(),
        )

        out_conv2_in = 960
        out_conv2_out = 1280
        self.out_conv2 = fluid.dygraph.Sequential(
            fluid.dygraph.Conv2D(out_conv2_in, out_conv2_out, filter_size=1, stride=1),
            h_swish(),
            fluid.dygraph.Dropout(dropout_rate),
            fluid.dygraph.Conv2D(out_conv2_out, 100, filter_size=1, stride=1),
        )

    @paddle.jit.to_static
    def forward(self, x):
        x = self.init_conv(x)
        x = self.block(x)
        x = self.out_conv1(x)
        out = fluid.layers.adaptive_pool2d(x, pool_size=[1, 1], pool_type='avg')
        out = self.out_conv2(out)
        out = fluid.layers.squeeze(out, axes=[2, 3])
        return out
paddle-bot-old[bot] commented 3 years ago


Hi! We've received your issue; please be patient while we respond. We will arrange for technicians to answer your questions as soon as possible. Please make sure that you have posted enough information to describe your request. You may also check the API docs, FAQ, and GitHub issue history for an answer. Have a nice day!

ZEROICEWANG commented 3 years ago

In further testing, the non-NPU build runs inference correctly on the phone, but the problem described above appears as soon as the NPU is used.

hong19860320 commented 3 years ago

The NPU does support MobileNetV3 models. Could you share your inference model so we can try it here?

hong19860320 commented 3 years ago

https://github.com/PaddlePaddle/PaddleClas/blob/release/2.1/README_cn.md#%E7%A7%BB%E5%8A%A8%E7%AB%AF%E7%B3%BB%E5%88%97 These MobileNetV3 models are supported. Looking at your network definition, it includes ops such as hard_swish and squeeze, which are not yet supported on the NPU. The currently supported NPU operators are listed here: https://github.com/PaddlePaddle/Paddle-Lite/blob/develop/lite/kernels/npu/bridges/paddle_use_bridges.h

ZEROICEWANG commented 3 years ago

OK, I will try replacing hard_swish and squeeze. Thanks!
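For reference, one way to avoid the squeeze/unsqueeze ops in the SE block is to use reshape instead. This is only a sketch; whether it keeps the whole graph on the NPU is an assumption, not something verified here:

import paddle.fluid as fluid

class SqueezeBlockNoSqueeze(fluid.dygraph.Layer):
    # Same as SqueezeBlock above, but with reshape in place of squeeze/unsqueeze
    # (assumes h_sigmoid is defined as in the model code above)
    def __init__(self, exp_size, divide=4):
        super(SqueezeBlockNoSqueeze, self).__init__()
        self.exp_size = exp_size
        self.dense = fluid.dygraph.Sequential(
            fluid.dygraph.Linear(exp_size, exp_size // divide, act='relu'),
            fluid.dygraph.Linear(exp_size // divide, exp_size),
            h_sigmoid()
        )

    def forward(self, x):
        out = fluid.layers.adaptive_pool2d(x, pool_size=[1, 1], pool_type='avg')
        # [N, C, 1, 1] -> [N, C] without squeeze
        out = fluid.layers.reshape(out, shape=[-1, self.exp_size])
        out = self.dense(out)
        # [N, C] -> [N, C, 1, 1] without unsqueeze
        out = fluid.layers.reshape(out, shape=[-1, self.exp_size, 1, 1])
        return out * x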

ZEROICEWANG commented 3 years ago

I tried it, but the problem persists after replacing them. I have uploaded the inference model to Baidu Netdisk, thanks. Link: https://pan.baidu.com/s/1lrFax9Jx3QZXF0fYoyS3DA Extraction code: sxjt. Input size: 1x3x224x224, output size: 100.