PaddlePaddle / PaddleSeg

Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting wide-range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
https://arxiv.org/abs/2101.06175
Apache License 2.0

Cityscapes SOTA model export fails with 'Config' object has no attribute 'model' #3358

Open Siiiiiigma opened 12 months ago

Siiiiiigma commented 12 months ago

Search before asking

Describe the Bug

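I ran the following command:

python export.py --config configs/mscale_ocr_cityscapes_autolabel_mapillary.yml --save_dir ./output --input_shape 1 3 2048 1024

Following the README, I downloaded the model weights and the pretrained weights to the specified locations. Exporting the pretrained model with the command above produces the error below. I tried several fixes mentioned in earlier issues, such as installing the development version of paddleseg from source, but the problem persists. The same problem also occurs in a PaddlePaddle AI Studio notebook, so it should not be related to the environment setup.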

Error output:

d:\deeplearning\paddleseg\paddleseg\cvlibs\manager.py:113: UserWarning: MscaleOCRNet exists already! It is now updated to <class 'models.mscale_ocrnet.MscaleOCRNet'> !!!
  warnings.warn("{} exists already! It is now updated to {} !!!".
Traceback (most recent call last):
  File "D:\DeepLearning\PaddleSeg\contrib\CityscapesSOTA\export.py", line 140, in <module>
    main(args)
  File "D:\DeepLearning\PaddleSeg\contrib\CityscapesSOTA\export.py", line 84, in main
    net = cfg.model
AttributeError: 'Config' object has no attribute 'model'

Environment

paddlepaddle-gpu 2.4.2.post117
paddleseg 2.8.0 d:\deeplearning\paddleseg

Bug description confirmation

Are you willing to submit a PR?

Asthestarsfalll commented 12 months ago

@Siiiiiigma Hi, this looks like a bug. CityscapesSOTA uses modules from paddleseg, and it was not updated when paddleseg later changed. You can try an earlier version for now; I will fix this shortly.
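The failure mode described here can be sketched without PaddleSeg installed. The classes below are hypothetical stand-ins, not PaddleSeg's real `Config`; they only illustrate how a script written against an older config object (one that exposes `.model`) breaks once model construction moves to a separate builder object, and how a small compatibility helper could tolerate both styles. That newer PaddleSeg releases build the model through a separate builder is an assumption, not something stated in this thread.

```python
class OldStyleConfig:
    """Hypothetical stand-in: the config object builds the model itself."""

    def __init__(self, model_name):
        self.model_name = model_name

    @property
    def model(self):
        return f"<{self.model_name} built by config>"


class NewStyleConfig:
    """Hypothetical stand-in: the config only holds settings, no .model."""

    def __init__(self, model_name):
        self.model_name = model_name


class Builder:
    """Hypothetical stand-in for a separate builder object."""

    def __init__(self, cfg):
        self.cfg = cfg

    @property
    def model(self):
        return f"<{self.cfg.model_name} built by builder>"


def get_model(cfg, builder_cls=Builder):
    """Return the model from either config style."""
    if hasattr(cfg, "model"):
        return cfg.model
    return builder_cls(cfg).model


# A script that does `cfg.model` directly fails on the new-style config:
try:
    NewStyleConfig("MscaleOCRNet").model
except AttributeError as err:
    print("AttributeError:", err)

# The helper works with both styles.
print(get_model(OldStyleConfig("MscaleOCRNet")))
print(get_model(NewStyleConfig("MscaleOCRNet")))
```

The `hasattr` check keeps one export script usable across both layouts, at the cost of hiding which layout is actually in use.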

Asthestarsfalll commented 12 months ago

@Siiiiiigma I have submitted a PR; you can try cloning my changes.

Siiiiiigma commented 12 months ago

@Asthestarsfalll Thanks for the fix. When I export the first config (mscale_ocr_cityscapes_autolabel_mapillary.yml), I get the warnings below. Is this normal?

(Paddle) D:\DeepLearning\PaddleSeg\contrib\CityscapesSOTA>python export.py --config configs/mscale_ocr_cityscapes_autolabel_mapillary.yml --save_dir ./output --input_shape 1 3 2048 1024
d:\deeplearning\paddleseg\paddleseg\cvlibs\manager.py:113: UserWarning: MscaleOCRNet exists already! It is now updated to <class 'models.mscale_ocrnet.MscaleOCRNet'> !!!
  warnings.warn("{} exists already! It is now updated to {} !!!".
2023-07-10 16:43:06 [WARNING] Add the in_channels in train_dataset class to model config. We suggest you manually set in_channels in model config.
2023-07-10 16:43:06 [INFO] Use the following config to build model
model:
  backbone:
    in_channels: 3
    type: HRNet_W48_NV
  backbone_indices:
  - 0
  n_scales:
  - 0.5
  - 1.0
  - 2.0
  num_classes: 19
  pretrained: pretrain/pretrained.pdparams
  type: MscaleOCRNet
W0710 16:43:06.020490 7732 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 12.1, Runtime API Version: 11.7
W0710 16:43:06.046422 7732 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
2023-07-10 16:43:10 [INFO] Loading pretrained model from pretrain/pretrained.pdparams
2023-07-10 16:43:13 [WARNING] [SKIP] Shape of pretrained params ocrnet.head.cls_head.weight doesn't match.(Pretrained: (65, 512, 1, 1), Actual: [19, 512, 1, 1])
2023-07-10 16:43:13 [WARNING] [SKIP] Shape of pretrained params ocrnet.head.cls_head.bias doesn't match.(Pretrained: (65,), Actual: [19])
2023-07-10 16:43:13 [WARNING] [SKIP] Shape of pretrained params ocrnet.head.aux_head.1.weight doesn't match.(Pretrained: (65, 720, 1, 1), Actual: [19, 720, 1, 1])
2023-07-10 16:43:13 [WARNING] [SKIP] Shape of pretrained params ocrnet.head.aux_head.1.bias doesn't match.(Pretrained: (65,), Actual: [19])
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._conv.weight is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._batch_norm.weight is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._batch_norm.bias is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._batch_norm._mean is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._batch_norm._variance is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._conv.weight is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._batch_norm.weight is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._batch_norm.bias is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._batch_norm._mean is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._batch_norm._variance is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.2.weight is not in pretrained model
2023-07-10 16:43:14 [INFO] There are 1572/1587 variables loaded into MscaleOCRNet.
2023-07-10 16:43:48 [INFO] The inference model is saved in ./output

Asthestarsfalll commented 12 months ago

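@Siiiiiigma The first warning appears because MscaleOCRNet is already registered in paddleseg.model and CityscapesSOTA registers it again; it is harmless. The second group, about pretrained params, appears because the weight shapes of the linear layers differ; it is normal for the head of a pretrained model to have a different number of channels from the fine-tuned one, so this is harmless too. The third group, the scale_attn warnings, appears because you loaded the pretrained weights, which do not contain the scale_attn module; for deployment you should load the weights trained on the downstream task.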
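The second and third kinds of warning can be reproduced with a minimal sketch of a shape-checked weight loader. This is not PaddleSeg's loading code: `load_matching` and the dictionaries of shapes are hypothetical, and only the `ocrnet.head.cls_head` names and shapes are taken from the log above.

```python
def load_matching(model_shapes, pretrained_shapes):
    """Classify each model parameter as loaded, shape-skipped, or missing.

    Both arguments map parameter names to shape tuples.
    """
    loaded, skipped, missing = [], [], []
    for name, shape in model_shapes.items():
        if name not in pretrained_shapes:
            # Corresponds to "... is not in pretrained model"
            missing.append(name)
        elif tuple(pretrained_shapes[name]) != tuple(shape):
            # Corresponds to "[SKIP] Shape of pretrained params ... doesn't match"
            skipped.append(name)
        else:
            loaded.append(name)
    return loaded, skipped, missing


model_shapes = {
    "backbone.conv1.weight": (64, 3, 3, 3),  # hypothetical name and shape
    "ocrnet.head.cls_head.weight": (19, 512, 1, 1),
    "ocrnet.head.cls_head.bias": (19,),
    "scale_attn.atten_head.2.weight": (1, 256, 1, 1),  # hypothetical shape
}
pretrained_shapes = {
    "backbone.conv1.weight": (64, 3, 3, 3),
    "ocrnet.head.cls_head.weight": (65, 512, 1, 1),  # 65-class head
    "ocrnet.head.cls_head.bias": (65,),
}

loaded, skipped, missing = load_matching(model_shapes, pretrained_shapes)
print(f"{len(loaded)}/{len(model_shapes)} variables loaded")
print("skipped:", skipped)
print("missing:", missing)
```

In the log above, 4 parameters were skipped for a shape mismatch and 11 scale_attn parameters were missing, which accounts for the 15 variables not loaded (1587 - 1572).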

Siiiiiigma commented 12 months ago

Thanks, understood. After switching to the previously downloaded saved_model/model.pdparams, the warnings are gone.

Siiiiiigma commented 12 months ago

@Asthestarsfalll Hi, I want to test the model on arbitrary street-scene images. I prepared a 2048*1024 JPG image and placed it in the image folder. I then ran the following command in a PaddlePaddle AI Studio notebook:

python deploy/python/infer.py \
    --config /home/aistudio/PaddleSeg-2.6.0/output/deploy.yaml \
    --image_path /home/aistudio/PaddleSeg-2.6.0/image \
    --save_dir /home/aistudio/PaddleSeg-2.6.0/result

It produced the following error:

2023-07-10 18:47:58 [INFO] Use GPU
--- Running analysis [ir_graph_build_pass]
I0710 18:48:00.875998 2513 executor.cc:187] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running IR pass [map_op_to_another_pass]
--- Running IR pass [identity_scale_op_clean_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [constant_folding_pass]
--- Running IR pass [silu_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0710 18:48:47.483732 2513 fuse_pass_base.cc:59] --- detected 12 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [conv2d_fusion_layout_transfer_pass]
--- Running IR pass [transfer_layout_elim_pass]
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [inplace_op_var_pass]
I0710 18:48:47.669679 2513 fuse_pass_base.cc:59] --- detected 3 subgraphs
--- Running analysis [save_optimized_model_pass]
W0710 18:48:47.685402 2513 save_optimized_model_pass.cc:28] save_optim_cache_model is turned off, skip save_optimized_model_pass
--- Running analysis [ir_params_sync_among_devices_pass]
I0710 18:48:47.685453 2513 ir_params_sync_among_devices_pass.cc:51] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0710 18:48:50.664584 2513 memory_optimize_pass.cc:222] Cluster name : shape_28.tmp_0_slice_0 size: 8
I0710 18:48:50.664654 2513 memory_optimize_pass.cc:222] Cluster name : shape_0.tmp_0_slice_0 size: 8
I0710 18:48:50.664659 2513 memory_optimize_pass.cc:222] Cluster name : concat_1.tmp_0 size: -2147483648
I0710 18:48:50.664661 2513 memory_optimize_pass.cc:222] Cluster name : transpose_0.tmp_0 size: 1073741824
I0710 18:48:50.664664 2513 memory_optimize_pass.cc:222] Cluster name : relu_78.tmp_0 size: 50331648
I0710 18:48:50.664673 2513 memory_optimize_pass.cc:222] Cluster name : batch_norm_305.tmp_2 size: 1509949440
I0710 18:48:50.664676 2513 memory_optimize_pass.cc:222] Cluster name : batch_norm_196.tmp_2 size: 50331648
I0710 18:48:50.664680 2513 memory_optimize_pass.cc:222] Cluster name : relu_227.tmp_0 size: 12582912
I0710 18:48:50.664685 2513 memory_optimize_pass.cc:222] Cluster name : batch_norm_200.tmp_2 size: 25165824
I0710 18:48:50.664688 2513 memory_optimize_pass.cc:222] Cluster name : x size: 25165824
I0710 18:48:50.664702 2513 memory_optimize_pass.cc:222] Cluster name : relu_171.tmp_0 size: 25165824
I0710 18:48:50.664711 2513 memory_optimize_pass.cc:222] Cluster name : batch_norm_930.tmp_1 size: 768
I0710 18:48:50.664716 2513 memory_optimize_pass.cc:222] Cluster name : concat_0.tmp_0 size: 1509949440
I0710 18:48:50.664718 2513 memory_optimize_pass.cc:222] Cluster name : tmp_310 size: 3145728
I0710 18:48:50.664721 2513 memory_optimize_pass.cc:222] Cluster name : bilinear_interp_v2_35.tmp_0 size: 76
--- Running analysis [ir_graph_to_program_pass]
I0710 18:48:51.751169 2513 analysis_predictor.cc:1660] ======= optimize end =======
I0710 18:48:51.776242 2513 naive_executor.cc:164] --- skip [feed], feed -> x
I0710 18:48:51.808507 2513 naive_executor.cc:164] --- skip [argmax_0.tmp_0], fetch -> fetch
W0710 18:48:51.966293 2513 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.6
W0710 18:48:51.974512 2513 gpu_resources.cc:149] device: 0, cuDNN Version: 8.4.
Traceback (most recent call last):
  File "/home/aistudio/PaddleSeg-2.6.0/deploy/python/infer.py", line 430, in <module>
    main(args)
  File "/home/aistudio/PaddleSeg-2.6.0/deploy/python/infer.py", line 418, in main
    predictor.run(imgs_list)
  File "/home/aistudio/PaddleSeg-2.6.0/deploy/python/infer.py", line 375, in run
    self.predictor.run()
ValueError: (InvalidArgument) The 2-th dimension of input[0] and input[1] is expected to be equal.But received input[0]'s shape = [1, 512, 1024, 512], input[1]'s shape = [1, 512, 512, 1024].

[operator < concat > error]

Is this a problem with the shape of my input data, or with the model?

Asthestarsfalll commented 12 months ago

@Siiiiiigma It is probably a problem with the shape of the input data.

Siiiiiigma commented 12 months ago

https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.8/docs/deployment/inference/python_inference_cn.md I still get the same error with the cityscapes_demo.png provided at this link, so it does not look like a shape problem. Am I missing a preprocessing step?

Asthestarsfalll commented 12 months ago

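> https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.8/docs/deployment/inference/python_inference_cn.md I still get the same error with the cityscapes_demo.png provided at this link, so it does not look like a shape problem. Am I missing a preprocessing step?

The error shows that two tensors have different shapes at a concat inside the model. Could you try the develop branch?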
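One possibility that this thread does not confirm is a height/width mix-up between export and inference. Assuming `--input_shape` is read in N C H W order, `1 3 2048 1024` fixes the height at 2048 and the width at 1024, whereas a 2048*1024 Cityscapes-style image is 2048 wide and 1024 high. The two shapes in the error, `[1, 512, 1024, 512]` and `[1, 512, 512, 1024]`, are what a 0.5x branch would produce for those two layouts. The helpers below are hypothetical and only check for this mismatch; they do not call PaddleSeg.

```python
def check_layout(export_shape, image_width, image_height):
    """Compare an exported N C H W shape with an image's width and height.

    Returns "match", "swapped" (H and W exchanged), or "different".
    """
    _, _, h, w = export_shape
    if (h, w) == (image_height, image_width):
        return "match"
    if (h, w) == (image_width, image_height):
        return "swapped"
    return "different"


def half_scale(shape_hw):
    """Spatial size of a 0.5x branch for an (H, W) pair."""
    h, w = shape_hw
    return (h // 2, w // 2)


# Values from this thread: exported with 1 3 2048 1024,
# while the test image is 2048 wide and 1024 high.
print(check_layout((1, 3, 2048, 1024), image_width=2048, image_height=1024))
print(half_scale((2048, 1024)), half_scale((1024, 2048)))
```

If the check reports `swapped`, re-exporting with the height and width exchanged in `--input_shape` would be one thing to try before looking for a bug inside the model.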

Siiiiiigma commented 11 months ago

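@Asthestarsfalll I tested with the developer version installed from source (2.8.0) locally, and with the 2.6.0 version provided by the AI Studio notebook, using cityscapes_demo.png in both cases. The problem persists and the error occurs at the same location. Please check whether there is a bug inside the model.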