markshih91 / mobilenet_v2_ssdlite_keras

A Keras version of the real-time object detection network mobilenet_v2_ssdlite

Predictions seem to be off even when the loss / val-loss is low #4

Open mallochio opened 4 years ago

mallochio commented 4 years ago

Hi @markshih91

I trained your model on a toy dataset where the task is just to predict the edges (the bounding boxes) of randomly generated black rectangles on a white background, which should be a simple task.
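For concreteness, a minimal sketch of what one such toy sample looks like (the generator below is my own illustration of the setup, not code from this repo):

import numpy as np

# Illustrative only: one toy sample is a white 416x416 image containing a
# single black axis-aligned rectangle, and its label is the rectangle's
# corner-format box [class_id, xmin, ymin, xmax, ymax].
def make_toy_sample(size=416, rng=np.random.default_rng()):
    image = np.full((size, size, 3), 255, dtype=np.uint8)   # white background
    xmin, ymin = rng.integers(0, size - 40, size=2)
    xmax = int(rng.integers(xmin + 20, min(xmin + 200, size)))
    ymax = int(rng.integers(ymin + 20, min(ymin + 200, size)))
    image[ymin:ymax, xmin:xmax] = 0                          # black rectangle
    return image, np.array([[1, xmin, ymin, xmax, ymax]])    # class 1 = rectangle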

The loss and val-loss are low here, but the predictions are far off (negative values are the norm in the predictions, which I have truncated). Could this be due to an error in the decoding layer?

This is a snapshot of the loss: [screenshot of the loss curve]

The predictions (shown in red): [screenshot of predicted boxes]
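To sanity-check the decoding question, I can decode a single anchor's raw training-mode output by hand. A minimal sketch, assuming the standard SSD 'centroids' encoding with the variances [0.1, 0.1, 0.2, 0.2] listed below; decode_one_box is a hypothetical helper, not part of this repo:

import numpy as np

# Decode one raw prediction (cx, cy, w, h offsets for one anchor) back to
# corner coordinates, assuming an ssd_keras-style 'centroids' encoding.
def decode_one_box(offsets, anchor, variances=(0.1, 0.1, 0.2, 0.2)):
    d_cx, d_cy, d_w, d_h = offsets           # raw network outputs
    a_cx, a_cy, a_w, a_h = anchor            # matched anchor box in centroids form
    cx = d_cx * variances[0] * a_w + a_cx    # undo variance-scaled cx offset
    cy = d_cy * variances[1] * a_h + a_cy    # undo variance-scaled cy offset
    w = a_w * np.exp(d_w * variances[2])     # undo log-scaled width offset
    h = a_h * np.exp(d_h * variances[3])     # undo log-scaled height offset
    # Convert centroids to corners; comparing these values against what the
    # inference model returns should show whether decoding is where it breaks.
    return np.array([cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h])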

My model is configured for training with the following settings:

aspect_ratios_per_layer = [[1.0, 2.0, 0.5],
                           [1.0, 2.0, 0.5, 3.0, 1.0 / 3.0],
                           [1.0, 2.0, 0.5, 3.0, 1.0 / 3.0],
                           [1.0, 2.0, 0.5, 3.0, 1.0 / 3.0],
                           [1.0, 2.0, 0.5],
                           [1.0, 2.0, 0.5]]

image_size = (416, 416, 3)
n_classes = 1
min_scale = 0.3
max_scale = 0.9
normalize_coords = False
subtract_mean = None
divide_by_stddev = None
swap_channels = None  # [2, 1, 0]
confidence_thresh = 0.01
iou_threshold = 0.45
top_k = 200
scales = None
aspect_ratios_global = None
return_predictor_sizes = False
two_boxes_for_ar1 = True
steps = None
offsets = None
clip_boxes = True
variances = [0.1, 0.1, 0.2, 0.2]
matching_type = 'multi'
pos_iou_threshold = 0.45
neg_iou_limit = 0.3

model = mobilenet_v2_ssd(
    image_size=image_size,
    n_classes=n_classes,
    min_scale=min_scale,
    max_scale=max_scale,
    coords='corners',
    mode='training',
    clip_boxes=clip_boxes,
    normalize_coords=normalize_coords,
    subtract_mean=subtract_mean,
    divide_by_stddev=divide_by_stddev,
    swap_channels=swap_channels,
    confidence_thresh=confidence_thresh,
    iou_threshold=iou_threshold,
    top_k=top_k
)

# Spatial sizes (height, width) of the six classification heads, read from the
# built model (so the model has to exist before these lines run).
predictor_sizes = [model.get_layer('ssd_cls1conv2_bn').output_shape[1:3],
                   model.get_layer('ssd_cls2conv2_bn').output_shape[1:3],
                   model.get_layer('ssd_cls3conv2_bn').output_shape[1:3],
                   model.get_layer('ssd_cls4conv2_bn').output_shape[1:3],
                   model.get_layer('ssd_cls5conv2_bn').output_shape[1:3],
                   model.get_layer('ssd_cls6conv2_bn').output_shape[1:3]]
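The remaining variables above (predictor_sizes, variances, matching_type, pos_iou_threshold, neg_iou_limit and so on) are wired into the label encoder. A minimal sketch of that step, assuming an ssd_keras-style SSDInputEncoder; the import path and parameter names are assumptions and may differ in this repo:

from ssd_encoder_decoder.ssd_input_encoder import SSDInputEncoder  # import path is an assumption

# How the encoder settings above would typically be passed on (ssd_keras-style
# API; adjust the names to whatever this repo's label encoder actually expects).
ssd_input_encoder = SSDInputEncoder(img_height=image_size[0],
                                    img_width=image_size[1],
                                    n_classes=n_classes,
                                    predictor_sizes=predictor_sizes,
                                    min_scale=min_scale,
                                    max_scale=max_scale,
                                    scales=scales,
                                    aspect_ratios_per_layer=aspect_ratios_per_layer,
                                    two_boxes_for_ar1=two_boxes_for_ar1,
                                    steps=steps,
                                    offsets=offsets,
                                    clip_boxes=clip_boxes,
                                    variances=variances,
                                    matching_type=matching_type,
                                    pos_iou_threshold=pos_iou_threshold,
                                    neg_iou_limit=neg_iou_limit,
                                    coords='corners',  # keep consistent with the coords passed to the model
                                    normalize_coords=normalize_coords)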

For prediction I build the model with the same configuration, except top_k=1, coords='centroids', and mode='inference'.
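For reference, this is what that inference-time construction looks like; every argument except the three mentioned matches the training call above:

# Inference-time model, per the description above. Note that coords switches
# from 'corners' (training) to 'centroids' here, which seems worth
# double-checking against what the decoding layer expects.
inference_model = mobilenet_v2_ssd(
    image_size=image_size,
    n_classes=n_classes,
    min_scale=min_scale,
    max_scale=max_scale,
    coords='centroids',
    mode='inference',
    clip_boxes=clip_boxes,
    normalize_coords=normalize_coords,
    subtract_mean=subtract_mean,
    divide_by_stddev=divide_by_stddev,
    swap_channels=swap_channels,
    confidence_thresh=confidence_thresh,
    iou_threshold=iou_threshold,
    top_k=1
)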

I've trained with a fairly large dataset of around 3000 images in the training set and 500 images in the validation set. The predictions you see are on the training set itself, which I expected to be better given the reportedly low loss. The rest of the training configuration is similar to the posted example. The ground truth is in the format image-name, class, xmin, ymin, xmax, ymax.
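A minimal sketch of how I read that annotation format in (the file name and helper below are just illustrative), producing per-image arrays of [class_id, xmin, ymin, xmax, ymax]:

import csv
from collections import defaultdict

import numpy as np

# Illustrative reader for the "image-name, class, xmin, ymin, xmax, ymax"
# format: one row per box, corner coordinates in pixels.
def load_labels(csv_path='labels.csv'):
    labels = defaultdict(list)
    with open(csv_path, newline='') as f:
        for image_name, class_id, xmin, ymin, xmax, ymax in csv.reader(f):
            labels[image_name].append([int(class_id), float(xmin), float(ymin),
                                       float(xmax), float(ymax)])
    return {name: np.array(rows) for name, rows in labels.items()}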

Thanks in advance!

liguilan1227 commented 4 years ago

Excuse me, I ran this code and got the error "No module named 'models'", but the directory named models actually exists in the project. How did you solve it? Could you help me?
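One common cause, assuming the script is being launched from outside the repository root, is that Python simply cannot see the models package. A minimal sketch of the usual workaround (the path below is hypothetical):

import os
import sys

# Put the repository root (the directory that contains models/) on the import
# path, or simply run the script from the repository root instead.
sys.path.insert(0, os.path.abspath('/path/to/mobilenet_v2_ssdlite_keras'))  # hypothetical path

import models  # should now resolve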