abanger commented 1 year ago

Changing the image width and height in the config file (image_shape: [3, 350, 550] or image_shape: [3, 550, 350]) has no effect on the training results. Why?

The config file is as follows:

```yaml
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 120
  print_batch_step: 10
  use_visualdl: False
  # used for static mode and model export
  image_shape: [3, 350, 550]
  save_inference_dir: ./inference
  # training model under @to_static
  to_static: False

# model architecture
Arch:
  name: ResNet50
  class_num: 2

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: /data/bapps/dd/
      cls_label_path: /data/bapps/dd/train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
  Eval:
    dataset:
      name: ImageNetDataset
      image_root: /data/bapps/dd/
      cls_label_path: /data/bapps/dd/val_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 350
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
```

PaddleClas and PaddlePaddle versions: Paddle 2.4.2, PaddleClas 2.4
Training environment: a. Operating system: Linux b. Python version: Python 3.9.16 c. CUDA/cuDNN version: CUDA 10.2 / cuDNN 7.6.5

TingquanGao commented 1 year ago

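The Global.image_shape field has no effect during training. It is only used when exporting the model as an inference model (tools/export_model.py). During training, the size of the images fed to the network is defined by RandCropImage (or ResizeImage) in DataLoader.Train.transform_ops, for example: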

https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/ppcls/configs/ImageNet/MobileNetV3/MobileNetV3_small_x1_0.yaml#L56-L57C22
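To make the point concrete, here is a minimal sketch in plain Python (not PaddleClas code). The helper `train_input_size` is hypothetical, and the `RandCropImage: size: 224` entry is assumed to mirror the linked MobileNetV3 config. It only illustrates that a lookup driven by the train transform ops never reads Global.image_shape, so changing that field cannot change the result.

```python
def train_input_size(config):
    """Hypothetical helper: return the `size` set by the last
    RandCropImage/ResizeImage op in the train pipeline, or None."""
    ops = config["DataLoader"]["Train"]["dataset"]["transform_ops"]
    size = None
    for op in ops:
        for name, params in op.items():
            if name in ("RandCropImage", "ResizeImage") and params and "size" in params:
                size = params["size"]
    return size


# Config as a parsed dict; only the fields relevant here are included.
config = {
    "Global": {"image_shape": [3, 350, 550]},
    "DataLoader": {"Train": {"dataset": {"transform_ops": [
        {"DecodeImage": {"to_rgb": True, "channel_first": False}},
        {"RandCropImage": {"size": 224}},  # assumed value, for illustration
        {"RandFlipImage": {"flip_code": 1}},
    ]}}},
}

size_before = train_input_size(config)
config["Global"]["image_shape"] = [3, 550, 350]  # swap height and width
size_after = train_input_size(config)
print(size_before, size_after)
```

Run against the train pipeline in the question, which has no RandCropImage or ResizeImage entry, the same helper would return None: nothing in that pipeline sets the input size.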

abanger commented 1 year ago

> The Global.image_shape field has no effect during training. It is only used when exporting the model as an inference model (tools/export_model.py). During training, the size of the images fed to the network is defined by RandCropImage (or ResizeImage) in DataLoader.Train.transform_ops, for example:
>
> https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/ppcls/configs/ImageNet/MobileNetV3/MobileNetV3_small_x1_0.yaml#L56-L57C22

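That does not seem to match what I see. With RandCropImage (or ResizeImage), ResizeImage only accepts a single parameter, size: xxx. In addition, networks such as vgg16 can only be used after their network structure parameters are changed.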

The modified vgg16 network is as follows:

```
---------------------------------------------------------------------------
 Layer (type)    Input Shape             Output Shape            Param #
===========================================================================
 Conv2D-1        [[1, 3, 350, 550]]      [1, 64, 350, 550]       1,728
 ReLU-5          [[1, 64, 350, 550]]     [1, 64, 350, 550]       0
 Conv2D-2        [[1, 64, 350, 550]]     [1, 64, 350, 550]       36,864
 MaxPool2D-1     [[1, 64, 350, 550]]     [1, 64, 175, 275]       0
 ConvBlock-1     [[1, 3, 350, 550]]      [1, 64, 175, 275]       0
 Conv2D-3        [[1, 64, 175, 275]]     [1, 128, 175, 275]      73,728
 ReLU-6          [[1, 128, 175, 275]]    [1, 128, 175, 275]      0
 Conv2D-4        [[1, 128, 175, 275]]    [1, 128, 175, 275]      147,456
 MaxPool2D-2     [[1, 128, 175, 275]]    [1, 128, 87, 137]       0
 ConvBlock-2     [[1, 64, 175, 275]]     [1, 128, 87, 137]       0
 Conv2D-5        [[1, 128, 87, 137]]     [1, 256, 87, 137]       294,912
 ReLU-7          [[1, 256, 87, 137]]     [1, 256, 87, 137]       0
 Conv2D-6        [[1, 256, 87, 137]]     [1, 256, 87, 137]       589,824
 Conv2D-7        [[1, 256, 87, 137]]     [1, 256, 87, 137]       589,824
 MaxPool2D-3     [[1, 256, 87, 137]]     [1, 256, 43, 68]        0
 ConvBlock-3     [[1, 128, 87, 137]]     [1, 256, 43, 68]        0
 Conv2D-8        [[1, 256, 43, 68]]      [1, 512, 43, 68]        1,179,648
 ReLU-8          [[1, 512, 43, 68]]      [1, 512, 43, 68]        0
 Conv2D-9        [[1, 512, 43, 68]]      [1, 512, 43, 68]        2,359,296
 Conv2D-10       [[1, 512, 43, 68]]      [1, 512, 43, 68]        2,359,296
 MaxPool2D-4     [[1, 512, 43, 68]]      [1, 512, 21, 34]        0
 ConvBlock-4     [[1, 256, 43, 68]]      [1, 512, 21, 34]        0
 Conv2D-11       [[1, 512, 21, 34]]      [1, 512, 21, 34]        2,359,296
 ReLU-9          [[1, 512, 21, 34]]      [1, 512, 21, 34]        0
 Conv2D-12       [[1, 512, 21, 34]]      [1, 512, 21, 34]        2,359,296
 Conv2D-13       [[1, 512, 21, 34]]      [1, 512, 21, 34]        2,359,296
 MaxPool2D-5     [[1, 512, 21, 34]]      [1, 512, 10, 17]        0
 ConvBlock-5     [[1, 512, 21, 34]]      [1, 512, 10, 17]        0
 Flatten-1       [[1, 512, 10, 17]]      [1, 87040]              0
 Linear-1        [[1, 87040]]            [1, 4096]               356,519,936
 ReLU-10         [[1, 4096]]             [1, 4096]               0
 Dropout-1       [[1, 4096]]             [1, 4096]               0
 Linear-2        [[1, 4096]]             [1, 4096]               16,781,312
 Linear-3        [[1, 4096]]             [1, 1000]               4,097,000
===========================================================================
```
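The shapes in this summary can be checked by hand, which also shows why vgg16 needs a structural change for a 350x550 input: the input width of the first fully connected layer depends on the image size. The sketch below assumes the standard VGG16 layout (five blocks, each ending in a 2x2 max pool with stride 2, 512 channels before Flatten, and a 4096-unit first Linear layer with bias). The function name `vgg16_first_fc` is made up for illustration.

```python
def vgg16_first_fc(height, width, channels=512, fc_out=4096):
    """Return the final feature-map size, the flattened length, and the
    parameter count (weights + bias) of the first Linear layer."""
    for _ in range(5):  # five max-pool stages, each halving with floor division
        height, width = height // 2, width // 2
    flat = channels * height * width
    params = flat * fc_out + fc_out
    return (height, width), flat, params


print(vgg16_first_fc(350, 550))  # ((10, 17), 87040, 356519936)
print(vgg16_first_fc(224, 224))  # ((7, 7), 25088, 102764544)
```

The 350x550 result matches the Flatten-1 and Linear-1 rows of the summary. ResNet50, by contrast, applies global average pooling before its classifier, so its layer shapes do not depend on the input resolution, which is likely why the ResNet50 config in the question trained without any structural change.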

PaddlePaddle / PaddleClas

Image width/height issue in the config file #2866
