PaddlePaddle / Paddle

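PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training, plus cross-platform deployment, for deep learning and machine learning)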
http://www.paddlepaddle.org/
Apache License 2.0
21.94k stars 5.52k forks

Model export fails in PaddleSeg-release-2.9/contrib/MedicalSeg (synapse) #64251

Open kulongwei opened 2 months ago

kulongwei commented 2 months ago

Please ask your question

Following the official instructions, I prepared the data and checked the environment step by step, but the export step appears to fail. In PaddleSeg-release-2.9/contrib/MedicalSeg I ran:

python export.py --config configs/synapse/swinunet_abdomen_224_224_1_14k_5e-2.yml --model_path /data01/ww/synapse/models/SwinUNet/model.pdparams

The error message is:

2024-05-13 23:26:17 [INFO] Loading pretrained model from https://paddleseg.bj.bcebos.com/paddleseg3d/synapse/abdomen/swinunet_abdomen_224_224_1_14k_5e-2/swinunet_pretrained.zip
savepath = /root/.paddleseg/tmp/tmp7cnp2945/swinunet_pretrained.zip
Connecting to https://paddleseg.bj.bcebos.com/paddleseg3d/synapse/abdomen/swinunet_abdomen_224_224_1_14k_5e-2/swinunet_pretrained.zip
Downloading swinunet_pretrained.zip
[==================================================] 100.00%
Uncompress swinunet_pretrained.zip
[==================================================] 100.00%
2024-05-13 23:27:19 [WARNING] layers_up.0.expand.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] layers_up.0.norm.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] layers_up.0.norm.bias is not in pretrained model
2024-05-13 23:27:19 [WARNING] layers_up.1.upsample.expand.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] layers_up.1.upsample.norm.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] layers_up.1.upsample.norm.bias is not in pretrained model
2024-05-13 23:27:19 [WARNING] layers_up.2.upsample.expand.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] layers_up.2.upsample.norm.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] layers_up.2.upsample.norm.bias is not in pretrained model
2024-05-13 23:27:19 [WARNING] concat_back_dim.1.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] concat_back_dim.1.bias is not in pretrained model
2024-05-13 23:27:19 [WARNING] concat_back_dim.2.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] concat_back_dim.2.bias is not in pretrained model
2024-05-13 23:27:19 [WARNING] concat_back_dim.3.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] concat_back_dim.3.bias is not in pretrained model
2024-05-13 23:27:19 [WARNING] norm.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] norm.bias is not in pretrained model
2024-05-13 23:27:19 [WARNING] norm_up.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] norm_up.bias is not in pretrained model
2024-05-13 23:27:19 [WARNING] up.expand.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] up.norm.weight is not in pretrained model
2024-05-13 23:27:19 [WARNING] up.norm.bias is not in pretrained model
2024-05-13 23:27:19 [WARNING] output.weight is not in pretrained model
2024-05-13 23:27:19 [INFO] There are 217/240 variables loaded into SwinUNet.
which: no nvcc in (/opt/dtk-23.04/bin:/opt/dtk-23.04/llvm/bin:/opt/dtk-23.04/hip/bin:/opt/dtk-23.04/hip/bin/hipify:/data01/tools/miniconda3/envs/synapse/bin:/data01/tools/miniconda3/condabin:/usr/lib64/qt-3.3/bin:/root/perl5/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/rocm/bin:/opt/rocm/profiler/bin:/opt/rocm/opencl/bin:/opt/rocm/bin:/opt/rocm/profiler/bin:/opt/rocm/opencl/bin/x86_64:/root/bin)
2024-05-13 23:27:21 [INFO] Loaded trained params of model successfully.
Traceback (most recent call last):
  File "export.py", line 145, in <module>
    main(args)
  File "export.py", line 124, in main
    paddle.jit.save(new_net, save_path)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 26, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/jit.py", line 649, in wrapper
    func(layer, path, input_spec, **configs)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 26, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 67, in __impl__
    return func(*args, **kwargs)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/jit.py", line 928, in save
    inner_input_spec, with_hook=with_hook)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 580, in concrete_program_specify_input_spec
    is_train=self._is_train_mode())
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 485, in get_concrete_program
    concrete_program, partial_program_layer = self._program_cache[cache_key]
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 955, in __getitem__
    self._caches[item_id] = self._build_once(item)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 944, in _build_once
    **cache_key.kwargs)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 26, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 67, in __impl__
    return func(*args, **kwargs)
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 895, in from_func_spec
    error_data.raise_new_exception()
  File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/error.py", line 350, in raise_new_exception
    six.exec_("raise new_exception from None")
  File "<string>", line 1, in <module>
ValueError: In transformed code:

File "export.py", line 74, in forward
    outs = self.net(x)
File "/data01/ww/synapse/PaddleSeg-release-2.9/contrib/MedicalSeg/medicalseg/models/swinunet.py", line 200, in forward
    x, x_downsample = self.backbone(x)
File "/data01/ww/synapse/PaddleSeg-release-2.9/contrib/MedicalSeg/medicalseg/models/backbones/swin_transformer.py", line 670, in forward
    out = self.forward_features(x)
File "/data01/ww/synapse/PaddleSeg-release-2.9/contrib/MedicalSeg/medicalseg/models/backbones/swin_transformer.py", line 655, in forward_features
    x = self.patch_embed(x)
File "/data01/ww/synapse/PaddleSeg-release-2.9/contrib/MedicalSeg/medicalseg/models/backbones/swin_transformer.py", line 526, in forward
    def forward(self, x):
        B, C, H, W = x.shape
        x = self.proj(x)
        ~~~~~~~~~~~~~~~~ <--- HERE

        x = x.flatten(2).transpose([0, 2, 1])  # B Ph*Pw C

File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 1014, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 993, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/nn/layer/conv.py", line 724, in forward
    use_cudnn=self._use_cudnn,
File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/nn/functional/conv.py", line 277, in _conv_nd
    type=op_type, inputs=inputs, outputs=outputs, attrs=attrs
File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 45, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/framework.py", line 4046, in append_op
    attrs=kwargs.get("attrs", None),
File "/data01/tools/miniconda3/envs/synapse/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3037, in __init__
    self.desc.infer_shape(self.block.desc)

ValueError: (InvalidArgument) The number of input's channels should be equal to filter's channels * groups for Op(Conv). But received: the input's channels is -1, the input's shape is [1, -1, -1, -1]; the filter's channels is 3, the filter's shape is [96, 3, 4, 4]; the groups is 1, the data_format is NCHW. The error may come from wrong data_format setting.

[Hint: Expected input_channels == filter_dims[1] * groups, but received input_channels:-1 != filter_dims[1] * groups:3.] (at /home/paddle/Paddle-2.4.2/paddle/fluid/operators/conv_op.cc:131) [operator < conv2d > error]
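To make the hint concrete, here is a minimal pure-Python sketch of the consistency check the message describes. It is a simplified stand-in written for illustration, not Paddle's actual implementation; the function name `check_conv_channels` is hypothetical, and only the shapes are taken from the error message above.

```python
def check_conv_channels(input_shape, filter_shape, groups=1):
    """Simplified stand-in for the Conv channel check, assuming NCHW layout.

    input_shape  : [N, C, H, W], where -1 marks an unknown (dynamic) dimension
    filter_shape : [out_channels, in_channels_per_group, kH, kW]
    """
    input_channels = input_shape[1]
    expected = filter_shape[1] * groups
    if input_channels != expected:
        raise ValueError(
            f"input channels {input_channels} != filter channels * groups {expected}"
        )
    return True


# Shapes reported in the error message.
filter_shape = [96, 3, 4, 4]

# With a fully dynamic input spec the channel dimension is -1, so the check fails.
try:
    check_conv_channels([1, -1, -1, -1], filter_shape)
except ValueError as err:
    print("fails:", err)

# Once the channel dimension is fixed to 3, the same check passes.
print("passes:", check_conv_channels([1, 3, -1, -1], filter_shape))
```

Under this reading, the export fails because the channel dimension is unknown while the static graph is being built, not because the trained weights are wrong.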

kulongwei commented 2 months ago

Note: Paddle 2.4.2 itself runs normally on the DCU hardware!

Bobholamovic commented 2 months ago

This looks like a dynamic-to-static conversion error. You could try specifying the --input_shape parameter when exporting.

kulongwei commented 2 months ago

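"This looks like a dynamic-to-static conversion error. You could try specifying the --input_shape parameter when exporting."

How should the command be written?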

Bobholamovic commented 2 months ago

For example:

python export.py --config configs/synapse/swinunet_abdomen_224_224_1_14k_5e-2.yml --model_path /data01/ww/synapse/models/SwinUNet/model.pdparams --input_shape 1 3 -1 -1

I recommend fixing the batch size to 1 via --input_shape. The values for the dimensions other than the batch dimension depend on what the model actually requires.
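As a rough illustration of how a flag like the one in the command above can turn into a shape list, here is a small sketch using only `argparse`. It assumes the flag takes a list of integers; the helpers `parse_input_shape` and `dynamic_dims` are hypothetical and are not taken from export.py.

```python
import argparse


def parse_input_shape(argv):
    """Parse a --input_shape flag given as a list of integers (assumed form)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--input_shape", nargs="+", type=int, default=None)
    return parser.parse_args(argv).input_shape


def dynamic_dims(shape):
    """Return the indices of dimensions left dynamic (marked with -1 or None)."""
    return [i for i, d in enumerate(shape) if d is None or d < 0]


shape = parse_input_shape(["--input_shape", "1", "3", "-1", "-1"])
print(shape)                # [1, 3, -1, -1]
print(dynamic_dims(shape))  # [2, 3]

# Assumption: a list like this would then be wrapped in something such as
# paddle.static.InputSpec(shape=shape, dtype="float32") and passed to
# paddle.jit.save(..., input_spec=[...]); how export.py does this is not shown here.
```

With `1 3 -1 -1`, the batch and channel dimensions are fixed and only height and width stay dynamic. If the model needs a fixed spatial size, those two values would have to be concrete as well.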

kulongwei commented 2 months ago

W0527 19:02:13.346259 24548 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 90.0, Driver API Version: 50400.0, Runtime API Version: 50400.0
2024-05-27 19:02:14 [INFO] Loading pretrained model from https://paddleseg.bj.bcebos.com/paddleseg3d/synapse/abdomen/swinunet_abdomen_224_224_1_14k_5e-2/swinunet_pretrained.zip
savepath = /root/.paddleseg/tmp/tmpw33i0ru2/swinunet_pretrained.zip
Connecting to https://paddleseg.bj.bcebos.com/paddleseg3d/synapse/abdomen/swinunet_abdomen_224_224_1_14k_5e-2/swinunet_pretrained.zip
Downloading swinunet_pretrained.zip
[==================================================] 100.00%


C++ Traceback (most recent call last):

0 inflateReset2


Error Message Summary:

FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: Aborted at 1716807751 (unix time) try "date -d @1716807751" if you are using GNU date ]
[SignalInfo: SIGSEGV (@0xfffffffffffffff7) received by PID 24548 (TID 0x7fa9e9cea740) from PID 18446744073709551607 ]

Segmentation fault (core dumped)

Bobholamovic commented 2 months ago

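This looks like an error in a low-level dynamic library. To get more clues, I suggest trying the following:

  1. Use breakpoints, print statements, or similar means to find the last Python function the code called.
  2. Set the environment variable with export GLOG_v=2, then run the script again to get C++-level logs.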
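One possible way to combine the two suggestions above from inside Python is sketched below. It uses the standard-library `faulthandler` module, which dumps the Python stack when the process receives a fatal signal such as SIGSEGV, and sets `GLOG_v` before the framework is imported. The helper name `enable_crash_diagnostics` and the log file name are hypothetical, and placing this before `import paddle` in export.py assumes the variable is read when the library starts up.

```python
import faulthandler
import os


def enable_crash_diagnostics(log_path="crash_traceback.log", glog_level=2):
    """Turn on verbose C++ logging and a Python-level traceback on fatal signals.

    Returns the open log file; it must stay open for as long as the process runs.
    """
    # Equivalent to `export GLOG_v=2` in the shell, but scoped to this process.
    os.environ["GLOG_v"] = str(glog_level)

    # On SIGSEGV and similar signals, faulthandler writes the current Python
    # stack of every thread to this file, revealing the last Python call.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Call this at the very top of the script, before importing the framework.
crash_log = enable_crash_diagnostics()
print("faulthandler enabled:", faulthandler.is_enabled())
```

After a crash, the traceback file should show which Python frame was active, which narrows down where to place breakpoints or print statements.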