PaddlePaddle / PaddleX

All-in-One Development Tool based on PaddlePaddle (PaddlePaddle low-code development tool)
Apache License 2.0

FasterRCNN object detection model intermittently raises errors during image prediction #1669

Open libingbingdev opened 1 year ago

libingbingdev commented 1 year ago

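Problem description

I trained an object detection model using FasterRCNN as the baseline and deployed it to an on-site industrial PC for image prediction. It works normally most of the time, but intermittent errors occur a few times every day.

Environment

1. Windows 10 Enterprise / i7-9700 CPU / 16 GB RAM / 64-bit OS / 2080 Ti
2. python==3.9.7; paddlepaddle-gpu==2.2.1; paddlex==2.0.0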

Model training code

import os

import paddlex as pdx
from paddlex import transforms

train_transforms = transforms.Compose([
    transforms.RandomDistort(),
    transforms.RandomHorizontalFlip(),
    transforms.ResizeByShort(short_size=1024, max_size=2048),
    transforms.Normalize(),
])

eval_transforms = transforms.Compose([
    transforms.ResizeByShort(short_size=1024, max_size=2048),
    transforms.Normalize(),
])

root_path = 'Full'
train_dataset = pdx.datasets.VOCDetection(
    data_dir=root_path,
    file_list=os.path.join(root_path, 'train_list.txt'),
    label_list=os.path.join(root_path, 'labels.txt'),
    transforms=train_transforms,
    shuffle=True)
eval_dataset = pdx.datasets.VOCDetection(
    data_dir=root_path,
    file_list=os.path.join(root_path, 'val_list.txt'),
    label_list=os.path.join(root_path, 'labels.txt'),
    transforms=eval_transforms)

train_dataset.add_negative_samples(image_dir='Background')

num_classes = len(train_dataset.labels) + 1

model = pdx.det.FasterRCNN(
    num_classes=num_classes,
    backbone='ResNet50_vd_ssld',
    with_dcn=True,
    fpn_num_channels=64,
    with_fpn=True,
    test_pre_nms_top_n=500,
    test_post_nms_top_n=300)

model.train(
    num_epochs=20,
    train_dataset=train_dataset,
    train_batch_size=4,
    eval_dataset=eval_dataset,
    save_interval_epochs=1,
    metric='VOC',
    learning_rate=0.01,
    lr_decay_epochs=[12, 16],
    warmup_steps=500,
    save_dir='Output/Full/faster_rcnn_r50_vd_dcn',
    use_vdl=True,
    early_stop=True)

Export the model

paddlex --export_inference --model_dir=Output/faster_rcnn_r50_vd_dcn/best_model --save_dir=Output/faster_rcnn_r50_vd_dcn/

Model prediction code

model = pdx.load_model(path_to_model)
result = model.predict(image)

model.yml

Model: FasterRCNN Transforms:

Error messages

Type 1: ERROR The dims of Inputs(Condition) and Inputs(X) should be same. But received Condition's shape is [3, 1], X's shape is [1, 1] [Hint: Expected cond_dims == x_dims, but received cond_dims:3, 1 != x_dims:1, 1.] (at C:/home/workspace/Paddle_release2/paddle/fluid/operators/where_op.cc:38) [operator < where > error]

Type 2: ERROR The dims of Inputs(Condition) and Inputs(X) should be same. But received Condition's shape is [2, 1], X's shape is [1, 1] [Hint: Expected cond_dims == x_dims, but received cond_dims:2, 1 != x_dims:1, 1.] (at C:/home/workspace/Paddle_release2/paddle/fluid/operators/where_op.cc:38) [operator < where > error]

Type 3: ERROR Dims of all Inputs(X) must be the same, but received input 1 dim is:1 not equal to input 0 dim:2.

[operator < stack > error]

Type 4: ERROR Dims of all Inputs(X) must be the same, but received input 1 dim is:1 not equal to input 0 dim:4.

[operator < stack > error]

Type 5: ERROR Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [5] and the shape of Y = [3]. Received [5] in X is not equal to [3] in Y at i:0. [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at C:\home\workspace\Paddle_release2\paddle/fluid/operators/elementwise/elementwise_op_function.h:169) [operator < elementwise_min > error]
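
All five messages report a shape mismatch between the inputs of a single operator (where, stack, elementwise_min). As a rough analogy only, and not a reproduction of Paddle's internals, the NumPy sketch below triggers the same kind of broadcast failure as the fifth message: two 1-D arrays of length 5 and 3 cannot be combined element-wise, while equal lengths or a length of 1 are accepted. The helper name `try_elementwise_min` is made up for this illustration.

```python
import numpy as np


def try_elementwise_min(x, y):
    """Return (ok, result_or_message) for an element-wise minimum of x and y."""
    try:
        return True, np.minimum(x, y)
    except ValueError as exc:
        # NumPy raises ValueError when two shapes cannot be broadcast together.
        return False, str(exc)


ok_same, _ = try_elementwise_min(np.zeros(5), np.zeros(5))       # equal shapes
ok_bcast, _ = try_elementwise_min(np.zeros(5), np.zeros(1))      # length 1 broadcasts
ok_bad, message = try_elementwise_min(np.zeros(5), np.zeros(3))  # 5 vs 3 fails

print(ok_same, ok_bcast, ok_bad)
print(message)
```

The point of the analogy is that the failure depends only on the shapes reaching the operator, which is why the later discussion focuses on whether the model input differs between failing and successful calls.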

lailuboy commented 1 year ago

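Judging from the errors, the problem appears to lie in the input of the failing calls. Have you confirmed that the input given to the model when an error occurs is the same as at other times?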

libingbingdev commented 1 year ago


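We use a Hikvision line-scan camera with online triggering, and every image fed to the model is 2048*1024*3. The images from the failing moments were saved in real time and are consistent with the images from normal runs. The model runs a little over ten thousand times a day, and the run log shows about seven or eight errors.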
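
One way to check the claim that every input is identical in form is to validate the array right before prediction. The following is a minimal sketch, not part of PaddleX: `check_input` is a hypothetical helper, and the expected shape `(1024, 2048, 3)` assumes that "2048*1024*3" means width 2048, height 1024, and 3 channels as OpenCV would load it.

```python
import numpy as np


def check_input(img, expected_shape, expected_dtype=np.uint8):
    """Return a list of problems found with img; an empty list means it looks fine."""
    if img is None:
        # cv2.imread returns None when a file cannot be read or decoded.
        return ["image is None"]
    if not isinstance(img, np.ndarray):
        return [f"unexpected type {type(img).__name__}"]
    problems = []
    if img.shape != tuple(expected_shape):
        problems.append(f"shape {img.shape} != {tuple(expected_shape)}")
    if img.dtype != np.dtype(expected_dtype):
        problems.append(f"dtype {img.dtype} != {np.dtype(expected_dtype)}")
    if not img.flags["C_CONTIGUOUS"]:
        problems.append("array is not C-contiguous")
    return problems


# Assumed layout: (height, width, channels) for a 2048*1024 3-channel image.
EXPECTED_SHAPE = (1024, 2048, 3)
sample = np.zeros(EXPECTED_SHAPE, dtype=np.uint8)
print(check_input(sample, EXPECTED_SHAPE))
```

Logging the returned list for every call, not only for failures, would show whether the failing inputs differ in shape or dtype from the roughly ten thousand successful ones.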

lailuboy commented 1 year ago

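From the code, the input passed to the model each time is image. Is that the 2048*1024 3-channel image you mentioned? By "saved in real time", do you mean that every input is saved to a local file before prediction? And when an error occurs, reloading the saved image and predicting again works fine, is that right? If convenient, please post the preprocessing code applied to image before prediction, along with the images saved when the errors occurred.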

libingbingdev commented 1 year ago


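After the camera is triggered, the image is first saved locally and then read back for prediction. Reloading an image saved at the time of an error and predicting again works fine, and the image itself has no problem. The preprocessing code applied to image before prediction: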

img = cv2.imread(imagepath)
rows, cols, channels = img.shape
black = np.zeros([rows, cols, channels], img.dtype)
original = cv2.addWeighted(img, c, black, 1-c, b)

try:
    result_full = predict.predict_img(self.model_full, original)
    check_pic = visualize.visualize_detection(original.copy(), result_full, threshold=config.threshold,
        save_dir=config.today.get_check_full_path())
except Exception as e:
    log.error(e)
    self.savePic(num, '00', 'kadun', original)

Images saved when the errors occurred: link: https://pan.baidu.com/s/1C7JkJ1R_TPTnj3fXrUc7tg extraction code: urru
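
Because the saved image predicts correctly when reloaded, it may help to record exactly what array was in memory when the call failed and compare it with the reloaded one. The sketch below is an assumption-laden illustration rather than PaddleX code: `fingerprint` and `predict_logged` are hypothetical helpers, and `predict_fn` stands in for whatever callable runs the model (for example a wrapper around `model.predict`).

```python
import hashlib
import logging

import numpy as np


def fingerprint(img):
    """Summarize an array as its shape, dtype, and an MD5 digest of its raw bytes."""
    data = np.ascontiguousarray(img)
    return {
        "shape": data.shape,
        "dtype": str(data.dtype),
        "md5": hashlib.md5(data.tobytes()).hexdigest(),
    }


def predict_logged(predict_fn, img, logger=logging.getLogger(__name__)):
    """Call predict_fn(img); on failure, log the input fingerprint and re-raise."""
    try:
        return predict_fn(img)
    except Exception:
        # logger.exception also records the traceback of the failing operator.
        logger.exception("prediction failed for input %s", fingerprint(img))
        raise
```

If the fingerprint logged at failure time matches the fingerprint of the same file after it is reloaded and preprocessed again, the input can be ruled out and attention can shift to the runtime state of the predictor.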