PaddlePaddle / PaddleX

Low-code development tool based on PaddlePaddle
Apache License 2.0

PaddleX: error when calling paddleslim prune #644

Open HuangLonghao opened 3 years ago

HuangLonghao commented 3 years ago

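Issue type: model training
Issue description
I trained a segmentation task with PaddleX using the HRNet-18 network, then called pdx.slim.prune.analysis to run a sensitivity analysis. Training again after that fails with the error below.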

2021-03-25 16:01:33 [INFO] Finish prune program, before FLOPs:9601764.0, after prune FLOPs:1468713.0, remaining ratio:0.15296283057988094
Traceback (most recent call last):
File "/media/pc/d6768da5-56a5-4476-a0f5-b1ac5772dcf0/hlhtest/paddleX/PaddleX/tutorials/slim/prune/semantic_segmentation/unet_prune_train.py", line 46, in <module>
  use_vdl=True)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddlex/cv/models/hrnet.py", line 178, in train
  early_stop_patience, resume_checkpoint)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddlex/cv/models/deeplabv3p.py", line 358, in train
  early_stop_patience=early_stop_patience)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddlex/cv/models/base.py", line 501, in train_loop
  fetch_list=list(self.train_outputs.values()))
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1110, in run
  six.reraise(*sys.exc_info())
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/six.py", line 703, in reraise
  raise value
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1108, in run
  return_merged=return_merged)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1251, in _run_impl
  return_merged=return_merged)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 913, in _run_parallel
  tensors = exe.run(fetch_var_names, return_merged)._move_to_list()
ValueError: In user code:

File "/media/pc/d6768da5-56a5-4476-a0f5-b1ac5772dcf0/hlhtest/paddleX/PaddleX/tutorials/slim/prune/semantic_segmentation/unet_prune_train.py", line 46, in <module>
  use_vdl=True)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddlex/cv/models/hrnet.py", line 178, in train
  early_stop_patience, resume_checkpoint)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddlex/cv/models/deeplabv3p.py", line 338, in train
  self.build_program()
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddlex/cv/models/base.py", line 105, in build_program
  self.train_inputs, self.train_outputs = self.build_net(mode='train')
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddlex/cv/models/hrnet.py", line 98, in build_net
  model_out = model.build_net(inputs)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddlex/cv/nets/segmentation/hrnet.py", line 96, in build_net
  name='conv-2')
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddlex/cv/nets/segmentation/hrnet.py", line 197, in _conv_bn_layer
  bias_attr=False)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddle/fluid/layers/nn.py", line 1622, in conv2d
  "data_format": data_format,
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
  return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3018, in append_op
  attrs=kwargs.get("attrs", None))
File "/home/pc/anaconda3/envs/paddle_env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2102, in __init__
  for frame in traceback.extract_stack():

InvalidArgumentError: The number of input's channels should be equal to filter's channels * groups for Op(Conv). But received: the input's channels is 243, the input's shape is [4, 243, 128, 128]; the filter's channels is 244, the filter's shape is [70, 244, 1, 1]; the groups is 1, the data_format is NCHW. The error may come from wrong data_format setting.
  [Hint: Expected input_channels == filter_dims[1] * groups, but received input_channels:243 != filter_dims[1] * groups:244.] (at /paddle/paddle/fluid/operators/conv_op.cc:96)
  [operator < conv2d > error]
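
The hint in the error above comes down to a single shape condition: input_channels == filter_dims[1] * groups. As a minimal sketch, the same check can be written in plain Python and applied to the shapes copied from the log. The function name conv_channels_match is hypothetical and is not part of Paddle; it only restates the condition quoted in the message.

```python
def conv_channels_match(input_shape, filter_shape, groups=1):
    """Return True if an NCHW input is compatible with a conv filter.

    Restates the condition quoted in the error hint:
    input_channels == filter_dims[1] * groups.
    """
    input_channels = input_shape[1]       # NCHW layout: index 1 is the channel axis
    filter_in_channels = filter_shape[1]  # filter layout: [out, in / groups, kH, kW]
    return input_channels == filter_in_channels * groups


# Shapes copied from the error message in this thread.
print(conv_channels_match([4, 243, 128, 128], [70, 244, 1, 1], groups=1))
```

With these shapes the check fails because 243 != 244 * 1, so the tensor reaching this convolution and the filter built for it disagree on the channel count by one.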
FlyingQianMM commented 3 years ago

Pruning and training work correctly with paddleslim 1.2.0 + paddlepaddle_gpu 1.8.5. Switch to those versions and then continue training.
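
Before retraining, it may help to confirm that the environment really has the versions suggested above. The following is a rough sketch only: it assumes Python 3.8+ (for importlib.metadata, whereas the traceback in this thread shows Python 3.7) and assumes the PyPI distribution name paddlepaddle-gpu. The helper names installed_version and find_mismatches are hypothetical.

```python
from importlib import metadata  # standard library in Python 3.8+

# Versions reported as working in the comment above.
# "paddlepaddle-gpu" is assumed to be the installed distribution name.
EXPECTED = {"paddleslim": "1.2.0", "paddlepaddle-gpu": "1.8.5"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (found, wanted)."""
    return {
        name: (lookup(name), wanted)
        for name, wanted in expected.items()
        if lookup(name) != wanted
    }


if __name__ == "__main__":
    for name, (found, wanted) in find_mismatches(EXPECTED).items():
        print(f"{name}: found {found}, expected {wanted}")
```

If the script prints nothing, both packages match the suggested versions. Any printed line names a package to reinstall at the listed version.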

qing130 commented 2 years ago

Running the official example on the classic version of AI Studio produces a similar error.

Official example: https://github.com/PaddlePaddle/PaddleX/blob/develop/tutorials/slim/prune/image_classification/mobilenetv2_prune.py
Environment: PP 2.2.0 + PaddleX development version. The same error occurs in both CPU and GPU environments.