fizyr / keras-maskrcnn

Keras implementation of MaskRCNN object detection.
Apache License 2.0

Weird losses behaviour #99

Closed tpostadjian closed 4 years ago

tpostadjian commented 4 years ago

Hey,

I am trying to train Mask-RCNN on my own dataset. So far, I have managed to preprocess all the data so that it can be fed through the CSV generator.

The backbone is ResNet50, whose layers I froze to reduce training time a bit. The learning rate is set to 1e-5; I also tried several other rates for a few iterations, but the behaviour is always the following:

  125/14863 [..............................] - ETA: 8:13:25 - loss: 4.8801 - regression_loss: 2.9342 - classification_loss: 1.9459 - masks_loss: 0.0000e+00
  126/14863 [..............................] - ETA: 8:12:38 - loss: 4.8781 - regression_loss: 2.9334 - classification_loss: 1.9447 - masks_loss: 0.0000e+00
  127/14863 [..............................] - ETA: 8:11:53 - loss: 4.8787 - regression_loss: 2.9330 - classification_loss: 1.9457 - masks_loss: 0.0000e+00
  128/14863 [..............................] - ETA: 8:11:07 - loss: 4.8766 - regression_loss: 2.9337 - classification_loss: 1.9429 - masks_loss: 0.0000e+00
  129/14863 [..............................] - ETA: 8:10:21 - loss: 4.8777 - regression_loss: 2.9337 - classification_loss: 1.9441 - masks_loss: 0.0000e+00
  130/14863 [..............................] - ETA: 8:09:37 - loss: 4.8793 - regression_loss: 2.9341 - classification_loss: 1.9452 - masks_loss: 0.0000e+00
  131/14863 [..............................] - ETA: 8:08:52 - loss: 4.8807 - regression_loss: 2.9329 - classification_loss: 1.9477 - masks_loss: 0.0000e+00
  132/14863 [..............................] - ETA: 8:08:10 - loss: 4.8768 - regression_loss: 2.9329 - classification_loss: 1.9439 - masks_loss: 0.0000e+00
  133/14863 [..............................] - ETA: 8:07:27 - loss: 4.8769 - regression_loss: 2.9303 - classification_loss: 1.9467 - masks_loss: 0.0000e+00
  134/14863 [..............................] - ETA: 8:06:45 - loss: 4.8781 - regression_loss: 2.9298 - classification_loss: 1.9483 - masks_loss: 0.0000e+00
  135/14863 [..............................] - ETA: 8:06:04 - loss: 4.8743 - regression_loss: 2.9288 - classification_loss: 1.9455 - masks_loss: 0.0000e+00
  136/14863 [..............................] - ETA: 8:05:16 - loss: 4.8751 - regression_loss: 2.9288 - classification_loss: 1.9463 - masks_loss: 0.0000e+00
  137/14863 [..............................] - ETA: 8:04:37 - loss: 4.8772 - regression_loss: 2.9285 - classification_loss: 1.9487 - masks_loss: 0.0000e+00
  138/14863 [..............................] - ETA: 8:03:55 - loss: 4.8772 - regression_loss: 2.9280 - classification_loss: 1.9492 - masks_loss: 0.0000e+00
  139/14863 [..............................] - ETA: 8:03:16 - loss: 4.8742 - regression_loss: 2.9287 - classification_loss: 1.9455 - masks_loss: 0.0000e+00
  140/14863 [..............................] - ETA: 8:02:41 - loss: 4.8738 - regression_loss: 2.9270 - classification_loss: 1.9469 - masks_loss: 0.0000e+00
  141/14863 [..............................] - ETA: 8:02:06 - loss: 4.8734 - regression_loss: 2.9268 - classification_loss: 1.9466 - masks_loss: 0.0000e+00
  142/14863 [..............................] - ETA: 8:01:30 - loss: 4.8737 - regression_loss: 2.9276 - classification_loss: 1.9461 - masks_loss: 0.0000e+00

None of the first three losses decreases, and masks_loss is stuck at 0. I cannot understand where this comes from.

I have double-checked the data with the debug script and it is in the correct format: the masks are well aligned with the objects, and the set of anchors is well placed for each object.

vsuryamurthy commented 4 years ago

How do you freeze the weights? Can you send the model.summary()?

tpostadjian commented 4 years ago

The backbone is frozen through the freeze_model argument passed to create_models, which comes from the keras_retinanet package. It is supposed to freeze only the backbone, not the detection (retinanet) part, right?

The summary below shows that this layer freezing actually halves the number of trainable parameters.
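(For context: to the best of my understanding, keras_retinanet's backbone freezing amounts to marking every backbone layer non-trainable before compilation. A framework-free sketch of that idea, using a hypothetical `Layer` stand-in rather than a real keras.layers.Layer:)

```python
class Layer:
    """Minimal stand-in for a Keras layer, just enough to show what
    freezing does; the real code operates on keras.layers.Layer."""
    def __init__(self, name, params):
        self.name = name
        self.params = params
        self.trainable = True   # Keras layers are trainable by default

def freeze(layers):
    """Mark every given layer non-trainable, as backbone freezing does.
    Frozen weights still contribute to the forward pass but receive
    no gradient updates, which is what cuts the trainable-param count."""
    for layer in layers:
        layer.trainable = False
    return layers

backbone = [Layer('conv1', 9408), Layer('bn_conv1', 256)]
freeze(backbone)
trainable_params = sum(l.params for l in backbone if l.trainable)
print(trainable_params)   # 0: nothing in the backbone trains any more
```

This matches what the summary shows: the backbone parameters move from the trainable to the non-trainable count, while the detection and mask heads keep training.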

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
image (InputLayer)              (None, None, None, 3 0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, None, None, 6 9408        image[0][0]                      
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, None, None, 6 256         conv1[0][0]                      
__________________________________________________________________________________________________
conv1_relu (Activation)         (None, None, None, 6 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
pool1 (MaxPooling2D)            (None, None, None, 6 0           conv1_relu[0][0]                 
__________________________________________________________________________________________________
res2a_branch2a (Conv2D)         (None, None, None, 6 4096        pool1[0][0]                      
__________________________________________________________________________________________________
bn2a_branch2a (BatchNormalizati (None, None, None, 6 256         res2a_branch2a[0][0]             
__________________________________________________________________________________________________
res2a_branch2a_relu (Activation (None, None, None, 6 0           bn2a_branch2a[0][0]              
__________________________________________________________________________________________________
padding2a_branch2b (ZeroPadding (None, None, None, 6 0           res2a_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res2a_branch2b (Conv2D)         (None, None, None, 6 36864       padding2a_branch2b[0][0]         
__________________________________________________________________________________________________
bn2a_branch2b (BatchNormalizati (None, None, None, 6 256         res2a_branch2b[0][0]             
__________________________________________________________________________________________________
res2a_branch2b_relu (Activation (None, None, None, 6 0           bn2a_branch2b[0][0]              
__________________________________________________________________________________________________
res2a_branch2c (Conv2D)         (None, None, None, 2 16384       res2a_branch2b_relu[0][0]        
__________________________________________________________________________________________________
res2a_branch1 (Conv2D)          (None, None, None, 2 16384       pool1[0][0]                      
__________________________________________________________________________________________________
bn2a_branch2c (BatchNormalizati (None, None, None, 2 1024        res2a_branch2c[0][0]             
__________________________________________________________________________________________________
bn2a_branch1 (BatchNormalizatio (None, None, None, 2 1024        res2a_branch1[0][0]              
__________________________________________________________________________________________________
res2a (Add)                     (None, None, None, 2 0           bn2a_branch2c[0][0]              
                                                                 bn2a_branch1[0][0]               
__________________________________________________________________________________________________
res2a_relu (Activation)         (None, None, None, 2 0           res2a[0][0]                      
__________________________________________________________________________________________________
res2b_branch2a (Conv2D)         (None, None, None, 6 16384       res2a_relu[0][0]                 
__________________________________________________________________________________________________
bn2b_branch2a (BatchNormalizati (None, None, None, 6 256         res2b_branch2a[0][0]             
__________________________________________________________________________________________________
res2b_branch2a_relu (Activation (None, None, None, 6 0           bn2b_branch2a[0][0]              
__________________________________________________________________________________________________
padding2b_branch2b (ZeroPadding (None, None, None, 6 0           res2b_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res2b_branch2b (Conv2D)         (None, None, None, 6 36864       padding2b_branch2b[0][0]         
__________________________________________________________________________________________________
bn2b_branch2b (BatchNormalizati (None, None, None, 6 256         res2b_branch2b[0][0]             
__________________________________________________________________________________________________
res2b_branch2b_relu (Activation (None, None, None, 6 0           bn2b_branch2b[0][0]              
__________________________________________________________________________________________________
res2b_branch2c (Conv2D)         (None, None, None, 2 16384       res2b_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn2b_branch2c (BatchNormalizati (None, None, None, 2 1024        res2b_branch2c[0][0]             
__________________________________________________________________________________________________
res2b (Add)                     (None, None, None, 2 0           bn2b_branch2c[0][0]              
                                                                 res2a_relu[0][0]                 
__________________________________________________________________________________________________
res2b_relu (Activation)         (None, None, None, 2 0           res2b[0][0]                      
__________________________________________________________________________________________________
res2c_branch2a (Conv2D)         (None, None, None, 6 16384       res2b_relu[0][0]                 
__________________________________________________________________________________________________
bn2c_branch2a (BatchNormalizati (None, None, None, 6 256         res2c_branch2a[0][0]             
__________________________________________________________________________________________________
res2c_branch2a_relu (Activation (None, None, None, 6 0           bn2c_branch2a[0][0]              
__________________________________________________________________________________________________
padding2c_branch2b (ZeroPadding (None, None, None, 6 0           res2c_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res2c_branch2b (Conv2D)         (None, None, None, 6 36864       padding2c_branch2b[0][0]         
__________________________________________________________________________________________________
bn2c_branch2b (BatchNormalizati (None, None, None, 6 256         res2c_branch2b[0][0]             
__________________________________________________________________________________________________
res2c_branch2b_relu (Activation (None, None, None, 6 0           bn2c_branch2b[0][0]              
__________________________________________________________________________________________________
res2c_branch2c (Conv2D)         (None, None, None, 2 16384       res2c_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn2c_branch2c (BatchNormalizati (None, None, None, 2 1024        res2c_branch2c[0][0]             
__________________________________________________________________________________________________
res2c (Add)                     (None, None, None, 2 0           bn2c_branch2c[0][0]              
                                                                 res2b_relu[0][0]                 
__________________________________________________________________________________________________
res2c_relu (Activation)         (None, None, None, 2 0           res2c[0][0]                      
__________________________________________________________________________________________________
res3a_branch2a (Conv2D)         (None, None, None, 1 32768       res2c_relu[0][0]                 
__________________________________________________________________________________________________
bn3a_branch2a (BatchNormalizati (None, None, None, 1 512         res3a_branch2a[0][0]             
__________________________________________________________________________________________________
res3a_branch2a_relu (Activation (None, None, None, 1 0           bn3a_branch2a[0][0]              
__________________________________________________________________________________________________
padding3a_branch2b (ZeroPadding (None, None, None, 1 0           res3a_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res3a_branch2b (Conv2D)         (None, None, None, 1 147456      padding3a_branch2b[0][0]         
__________________________________________________________________________________________________
bn3a_branch2b (BatchNormalizati (None, None, None, 1 512         res3a_branch2b[0][0]             
__________________________________________________________________________________________________
res3a_branch2b_relu (Activation (None, None, None, 1 0           bn3a_branch2b[0][0]              
__________________________________________________________________________________________________
res3a_branch2c (Conv2D)         (None, None, None, 5 65536       res3a_branch2b_relu[0][0]        
__________________________________________________________________________________________________
res3a_branch1 (Conv2D)          (None, None, None, 5 131072      res2c_relu[0][0]                 
__________________________________________________________________________________________________
bn3a_branch2c (BatchNormalizati (None, None, None, 5 2048        res3a_branch2c[0][0]             
__________________________________________________________________________________________________
bn3a_branch1 (BatchNormalizatio (None, None, None, 5 2048        res3a_branch1[0][0]              
__________________________________________________________________________________________________
res3a (Add)                     (None, None, None, 5 0           bn3a_branch2c[0][0]              
                                                                 bn3a_branch1[0][0]               
__________________________________________________________________________________________________
res3a_relu (Activation)         (None, None, None, 5 0           res3a[0][0]                      
__________________________________________________________________________________________________
res3b_branch2a (Conv2D)         (None, None, None, 1 65536       res3a_relu[0][0]                 
__________________________________________________________________________________________________
bn3b_branch2a (BatchNormalizati (None, None, None, 1 512         res3b_branch2a[0][0]             
__________________________________________________________________________________________________
res3b_branch2a_relu (Activation (None, None, None, 1 0           bn3b_branch2a[0][0]              
__________________________________________________________________________________________________
padding3b_branch2b (ZeroPadding (None, None, None, 1 0           res3b_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res3b_branch2b (Conv2D)         (None, None, None, 1 147456      padding3b_branch2b[0][0]         
__________________________________________________________________________________________________
bn3b_branch2b (BatchNormalizati (None, None, None, 1 512         res3b_branch2b[0][0]             
__________________________________________________________________________________________________
res3b_branch2b_relu (Activation (None, None, None, 1 0           bn3b_branch2b[0][0]              
__________________________________________________________________________________________________
res3b_branch2c (Conv2D)         (None, None, None, 5 65536       res3b_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn3b_branch2c (BatchNormalizati (None, None, None, 5 2048        res3b_branch2c[0][0]             
__________________________________________________________________________________________________
res3b (Add)                     (None, None, None, 5 0           bn3b_branch2c[0][0]              
                                                                 res3a_relu[0][0]                 
__________________________________________________________________________________________________
res3b_relu (Activation)         (None, None, None, 5 0           res3b[0][0]                      
__________________________________________________________________________________________________
res3c_branch2a (Conv2D)         (None, None, None, 1 65536       res3b_relu[0][0]                 
__________________________________________________________________________________________________
bn3c_branch2a (BatchNormalizati (None, None, None, 1 512         res3c_branch2a[0][0]             
__________________________________________________________________________________________________
res3c_branch2a_relu (Activation (None, None, None, 1 0           bn3c_branch2a[0][0]              
__________________________________________________________________________________________________
padding3c_branch2b (ZeroPadding (None, None, None, 1 0           res3c_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res3c_branch2b (Conv2D)         (None, None, None, 1 147456      padding3c_branch2b[0][0]         
__________________________________________________________________________________________________
bn3c_branch2b (BatchNormalizati (None, None, None, 1 512         res3c_branch2b[0][0]             
__________________________________________________________________________________________________
res3c_branch2b_relu (Activation (None, None, None, 1 0           bn3c_branch2b[0][0]              
__________________________________________________________________________________________________
res3c_branch2c (Conv2D)         (None, None, None, 5 65536       res3c_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn3c_branch2c (BatchNormalizati (None, None, None, 5 2048        res3c_branch2c[0][0]             
__________________________________________________________________________________________________
res3c (Add)                     (None, None, None, 5 0           bn3c_branch2c[0][0]              
                                                                 res3b_relu[0][0]                 
__________________________________________________________________________________________________
res3c_relu (Activation)         (None, None, None, 5 0           res3c[0][0]                      
__________________________________________________________________________________________________
res3d_branch2a (Conv2D)         (None, None, None, 1 65536       res3c_relu[0][0]                 
__________________________________________________________________________________________________
bn3d_branch2a (BatchNormalizati (None, None, None, 1 512         res3d_branch2a[0][0]             
__________________________________________________________________________________________________
res3d_branch2a_relu (Activation (None, None, None, 1 0           bn3d_branch2a[0][0]              
__________________________________________________________________________________________________
padding3d_branch2b (ZeroPadding (None, None, None, 1 0           res3d_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res3d_branch2b (Conv2D)         (None, None, None, 1 147456      padding3d_branch2b[0][0]         
__________________________________________________________________________________________________
bn3d_branch2b (BatchNormalizati (None, None, None, 1 512         res3d_branch2b[0][0]             
__________________________________________________________________________________________________
res3d_branch2b_relu (Activation (None, None, None, 1 0           bn3d_branch2b[0][0]              
__________________________________________________________________________________________________
res3d_branch2c (Conv2D)         (None, None, None, 5 65536       res3d_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn3d_branch2c (BatchNormalizati (None, None, None, 5 2048        res3d_branch2c[0][0]             
__________________________________________________________________________________________________
res3d (Add)                     (None, None, None, 5 0           bn3d_branch2c[0][0]              
                                                                 res3c_relu[0][0]                 
__________________________________________________________________________________________________
res3d_relu (Activation)         (None, None, None, 5 0           res3d[0][0]                      
__________________________________________________________________________________________________
res4a_branch2a (Conv2D)         (None, None, None, 2 131072      res3d_relu[0][0]                 
__________________________________________________________________________________________________
bn4a_branch2a (BatchNormalizati (None, None, None, 2 1024        res4a_branch2a[0][0]             
__________________________________________________________________________________________________
res4a_branch2a_relu (Activation (None, None, None, 2 0           bn4a_branch2a[0][0]              
__________________________________________________________________________________________________
padding4a_branch2b (ZeroPadding (None, None, None, 2 0           res4a_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res4a_branch2b (Conv2D)         (None, None, None, 2 589824      padding4a_branch2b[0][0]         
__________________________________________________________________________________________________
bn4a_branch2b (BatchNormalizati (None, None, None, 2 1024        res4a_branch2b[0][0]             
__________________________________________________________________________________________________
res4a_branch2b_relu (Activation (None, None, None, 2 0           bn4a_branch2b[0][0]              
__________________________________________________________________________________________________
res4a_branch2c (Conv2D)         (None, None, None, 1 262144      res4a_branch2b_relu[0][0]        
__________________________________________________________________________________________________
res4a_branch1 (Conv2D)          (None, None, None, 1 524288      res3d_relu[0][0]                 
__________________________________________________________________________________________________
bn4a_branch2c (BatchNormalizati (None, None, None, 1 4096        res4a_branch2c[0][0]             
__________________________________________________________________________________________________
bn4a_branch1 (BatchNormalizatio (None, None, None, 1 4096        res4a_branch1[0][0]              
__________________________________________________________________________________________________
res4a (Add)                     (None, None, None, 1 0           bn4a_branch2c[0][0]              
                                                                 bn4a_branch1[0][0]               
__________________________________________________________________________________________________
res4a_relu (Activation)         (None, None, None, 1 0           res4a[0][0]                      
__________________________________________________________________________________________________
res4b_branch2a (Conv2D)         (None, None, None, 2 262144      res4a_relu[0][0]                 
__________________________________________________________________________________________________
bn4b_branch2a (BatchNormalizati (None, None, None, 2 1024        res4b_branch2a[0][0]             
__________________________________________________________________________________________________
res4b_branch2a_relu (Activation (None, None, None, 2 0           bn4b_branch2a[0][0]              
__________________________________________________________________________________________________
padding4b_branch2b (ZeroPadding (None, None, None, 2 0           res4b_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res4b_branch2b (Conv2D)         (None, None, None, 2 589824      padding4b_branch2b[0][0]         
__________________________________________________________________________________________________
bn4b_branch2b (BatchNormalizati (None, None, None, 2 1024        res4b_branch2b[0][0]             
__________________________________________________________________________________________________
res4b_branch2b_relu (Activation (None, None, None, 2 0           bn4b_branch2b[0][0]              
__________________________________________________________________________________________________
res4b_branch2c (Conv2D)         (None, None, None, 1 262144      res4b_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn4b_branch2c (BatchNormalizati (None, None, None, 1 4096        res4b_branch2c[0][0]             
__________________________________________________________________________________________________
res4b (Add)                     (None, None, None, 1 0           bn4b_branch2c[0][0]              
                                                                 res4a_relu[0][0]                 
__________________________________________________________________________________________________
res4b_relu (Activation)         (None, None, None, 1 0           res4b[0][0]                      
__________________________________________________________________________________________________
res4c_branch2a (Conv2D)         (None, None, None, 2 262144      res4b_relu[0][0]                 
__________________________________________________________________________________________________
bn4c_branch2a (BatchNormalizati (None, None, None, 2 1024        res4c_branch2a[0][0]             
__________________________________________________________________________________________________
res4c_branch2a_relu (Activation (None, None, None, 2 0           bn4c_branch2a[0][0]              
__________________________________________________________________________________________________
padding4c_branch2b (ZeroPadding (None, None, None, 2 0           res4c_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res4c_branch2b (Conv2D)         (None, None, None, 2 589824      padding4c_branch2b[0][0]         
__________________________________________________________________________________________________
bn4c_branch2b (BatchNormalizati (None, None, None, 2 1024        res4c_branch2b[0][0]             
__________________________________________________________________________________________________
res4c_branch2b_relu (Activation (None, None, None, 2 0           bn4c_branch2b[0][0]              
__________________________________________________________________________________________________
res4c_branch2c (Conv2D)         (None, None, None, 1 262144      res4c_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn4c_branch2c (BatchNormalizati (None, None, None, 1 4096        res4c_branch2c[0][0]             
__________________________________________________________________________________________________
res4c (Add)                     (None, None, None, 1 0           bn4c_branch2c[0][0]              
                                                                 res4b_relu[0][0]                 
__________________________________________________________________________________________________
res4c_relu (Activation)         (None, None, None, 1 0           res4c[0][0]                      
__________________________________________________________________________________________________
res4d_branch2a (Conv2D)         (None, None, None, 2 262144      res4c_relu[0][0]                 
__________________________________________________________________________________________________
bn4d_branch2a (BatchNormalizati (None, None, None, 2 1024        res4d_branch2a[0][0]             
__________________________________________________________________________________________________
res4d_branch2a_relu (Activation (None, None, None, 2 0           bn4d_branch2a[0][0]              
__________________________________________________________________________________________________
padding4d_branch2b (ZeroPadding (None, None, None, 2 0           res4d_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res4d_branch2b (Conv2D)         (None, None, None, 2 589824      padding4d_branch2b[0][0]         
__________________________________________________________________________________________________
bn4d_branch2b (BatchNormalizati (None, None, None, 2 1024        res4d_branch2b[0][0]             
__________________________________________________________________________________________________
res4d_branch2b_relu (Activation (None, None, None, 2 0           bn4d_branch2b[0][0]              
__________________________________________________________________________________________________
res4d_branch2c (Conv2D)         (None, None, None, 1 262144      res4d_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn4d_branch2c (BatchNormalizati (None, None, None, 1 4096        res4d_branch2c[0][0]             
__________________________________________________________________________________________________
res4d (Add)                     (None, None, None, 1 0           bn4d_branch2c[0][0]              
                                                                 res4c_relu[0][0]                 
__________________________________________________________________________________________________
res4d_relu (Activation)         (None, None, None, 1 0           res4d[0][0]                      
__________________________________________________________________________________________________
res4e_branch2a (Conv2D)         (None, None, None, 2 262144      res4d_relu[0][0]                 
__________________________________________________________________________________________________
bn4e_branch2a (BatchNormalizati (None, None, None, 2 1024        res4e_branch2a[0][0]             
__________________________________________________________________________________________________
res4e_branch2a_relu (Activation (None, None, None, 2 0           bn4e_branch2a[0][0]              
__________________________________________________________________________________________________
padding4e_branch2b (ZeroPadding (None, None, None, 2 0           res4e_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res4e_branch2b (Conv2D)         (None, None, None, 2 589824      padding4e_branch2b[0][0]         
__________________________________________________________________________________________________
bn4e_branch2b (BatchNormalizati (None, None, None, 2 1024        res4e_branch2b[0][0]             
__________________________________________________________________________________________________
res4e_branch2b_relu (Activation (None, None, None, 2 0           bn4e_branch2b[0][0]              
__________________________________________________________________________________________________
res4e_branch2c (Conv2D)         (None, None, None, 1 262144      res4e_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn4e_branch2c (BatchNormalizati (None, None, None, 1 4096        res4e_branch2c[0][0]             
__________________________________________________________________________________________________
res4e (Add)                     (None, None, None, 1 0           bn4e_branch2c[0][0]              
                                                                 res4d_relu[0][0]                 
__________________________________________________________________________________________________
res4e_relu (Activation)         (None, None, None, 1 0           res4e[0][0]                      
__________________________________________________________________________________________________
res4f_branch2a (Conv2D)         (None, None, None, 2 262144      res4e_relu[0][0]                 
__________________________________________________________________________________________________
bn4f_branch2a (BatchNormalizati (None, None, None, 2 1024        res4f_branch2a[0][0]             
__________________________________________________________________________________________________
res4f_branch2a_relu (Activation (None, None, None, 2 0           bn4f_branch2a[0][0]              
__________________________________________________________________________________________________
padding4f_branch2b (ZeroPadding (None, None, None, 2 0           res4f_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res4f_branch2b (Conv2D)         (None, None, None, 2 589824      padding4f_branch2b[0][0]         
__________________________________________________________________________________________________
bn4f_branch2b (BatchNormalizati (None, None, None, 2 1024        res4f_branch2b[0][0]             
__________________________________________________________________________________________________
res4f_branch2b_relu (Activation (None, None, None, 2 0           bn4f_branch2b[0][0]              
__________________________________________________________________________________________________
res4f_branch2c (Conv2D)         (None, None, None, 1 262144      res4f_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn4f_branch2c (BatchNormalizati (None, None, None, 1 4096        res4f_branch2c[0][0]             
__________________________________________________________________________________________________
res4f (Add)                     (None, None, None, 1 0           bn4f_branch2c[0][0]              
                                                                 res4e_relu[0][0]                 
__________________________________________________________________________________________________
res4f_relu (Activation)         (None, None, None, 1 0           res4f[0][0]                      
__________________________________________________________________________________________________
res5a_branch2a (Conv2D)         (None, None, None, 5 524288      res4f_relu[0][0]                 
__________________________________________________________________________________________________
bn5a_branch2a (BatchNormalizati (None, None, None, 5 2048        res5a_branch2a[0][0]             
__________________________________________________________________________________________________
res5a_branch2a_relu (Activation (None, None, None, 5 0           bn5a_branch2a[0][0]              
__________________________________________________________________________________________________
padding5a_branch2b (ZeroPadding (None, None, None, 5 0           res5a_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res5a_branch2b (Conv2D)         (None, None, None, 5 2359296     padding5a_branch2b[0][0]         
__________________________________________________________________________________________________
bn5a_branch2b (BatchNormalizati (None, None, None, 5 2048        res5a_branch2b[0][0]             
__________________________________________________________________________________________________
res5a_branch2b_relu (Activation (None, None, None, 5 0           bn5a_branch2b[0][0]              
__________________________________________________________________________________________________
res5a_branch2c (Conv2D)         (None, None, None, 2 1048576     res5a_branch2b_relu[0][0]        
__________________________________________________________________________________________________
res5a_branch1 (Conv2D)          (None, None, None, 2 2097152     res4f_relu[0][0]                 
__________________________________________________________________________________________________
bn5a_branch2c (BatchNormalizati (None, None, None, 2 8192        res5a_branch2c[0][0]             
__________________________________________________________________________________________________
bn5a_branch1 (BatchNormalizatio (None, None, None, 2 8192        res5a_branch1[0][0]              
__________________________________________________________________________________________________
res5a (Add)                     (None, None, None, 2 0           bn5a_branch2c[0][0]              
                                                                 bn5a_branch1[0][0]               
__________________________________________________________________________________________________
res5a_relu (Activation)         (None, None, None, 2 0           res5a[0][0]                      
__________________________________________________________________________________________________
res5b_branch2a (Conv2D)         (None, None, None, 5 1048576     res5a_relu[0][0]                 
__________________________________________________________________________________________________
bn5b_branch2a (BatchNormalizati (None, None, None, 5 2048        res5b_branch2a[0][0]             
__________________________________________________________________________________________________
res5b_branch2a_relu (Activation (None, None, None, 5 0           bn5b_branch2a[0][0]              
__________________________________________________________________________________________________
padding5b_branch2b (ZeroPadding (None, None, None, 5 0           res5b_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res5b_branch2b (Conv2D)         (None, None, None, 5 2359296     padding5b_branch2b[0][0]         
__________________________________________________________________________________________________
bn5b_branch2b (BatchNormalizati (None, None, None, 5 2048        res5b_branch2b[0][0]             
__________________________________________________________________________________________________
res5b_branch2b_relu (Activation (None, None, None, 5 0           bn5b_branch2b[0][0]              
__________________________________________________________________________________________________
res5b_branch2c (Conv2D)         (None, None, None, 2 1048576     res5b_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn5b_branch2c (BatchNormalizati (None, None, None, 2 8192        res5b_branch2c[0][0]             
__________________________________________________________________________________________________
res5b (Add)                     (None, None, None, 2 0           bn5b_branch2c[0][0]              
                                                                 res5a_relu[0][0]                 
__________________________________________________________________________________________________
res5b_relu (Activation)         (None, None, None, 2 0           res5b[0][0]                      
__________________________________________________________________________________________________
res5c_branch2a (Conv2D)         (None, None, None, 5 1048576     res5b_relu[0][0]                 
__________________________________________________________________________________________________
bn5c_branch2a (BatchNormalizati (None, None, None, 5 2048        res5c_branch2a[0][0]             
__________________________________________________________________________________________________
res5c_branch2a_relu (Activation (None, None, None, 5 0           bn5c_branch2a[0][0]              
__________________________________________________________________________________________________
padding5c_branch2b (ZeroPadding (None, None, None, 5 0           res5c_branch2a_relu[0][0]        
__________________________________________________________________________________________________
res5c_branch2b (Conv2D)         (None, None, None, 5 2359296     padding5c_branch2b[0][0]         
__________________________________________________________________________________________________
bn5c_branch2b (BatchNormalizati (None, None, None, 5 2048        res5c_branch2b[0][0]             
__________________________________________________________________________________________________
res5c_branch2b_relu (Activation (None, None, None, 5 0           bn5c_branch2b[0][0]              
__________________________________________________________________________________________________
res5c_branch2c (Conv2D)         (None, None, None, 2 1048576     res5c_branch2b_relu[0][0]        
__________________________________________________________________________________________________
bn5c_branch2c (BatchNormalizati (None, None, None, 2 8192        res5c_branch2c[0][0]             
__________________________________________________________________________________________________
res5c (Add)                     (None, None, None, 2 0           bn5c_branch2c[0][0]              
                                                                 res5b_relu[0][0]                 
__________________________________________________________________________________________________
res5c_relu (Activation)         (None, None, None, 2 0           res5c[0][0]                      
__________________________________________________________________________________________________
C5_reduced (Conv2D)             (None, None, None, 2 524544      res5c_relu[0][0]                 
__________________________________________________________________________________________________
P5_upsampled (UpsampleLike)     (None, None, None, 2 0           C5_reduced[0][0]                 
                                                                 res4f_relu[0][0]                 
__________________________________________________________________________________________________
C4_reduced (Conv2D)             (None, None, None, 2 262400      res4f_relu[0][0]                 
__________________________________________________________________________________________________
P4_merged (Add)                 (None, None, None, 2 0           P5_upsampled[0][0]               
                                                                 C4_reduced[0][0]                 
__________________________________________________________________________________________________
P4_upsampled (UpsampleLike)     (None, None, None, 2 0           P4_merged[0][0]                  
                                                                 res3d_relu[0][0]                 
__________________________________________________________________________________________________
C3_reduced (Conv2D)             (None, None, None, 2 131328      res3d_relu[0][0]                 
__________________________________________________________________________________________________
P6 (Conv2D)                     (None, None, None, 2 4718848     res5c_relu[0][0]                 
__________________________________________________________________________________________________
P3_merged (Add)                 (None, None, None, 2 0           P4_upsampled[0][0]               
                                                                 C3_reduced[0][0]                 
__________________________________________________________________________________________________
C6_relu (Activation)            (None, None, None, 2 0           P6[0][0]                         
__________________________________________________________________________________________________
P3 (Conv2D)                     (None, None, None, 2 590080      P3_merged[0][0]                  
__________________________________________________________________________________________________
P4 (Conv2D)                     (None, None, None, 2 590080      P4_merged[0][0]                  
__________________________________________________________________________________________________
P5 (Conv2D)                     (None, None, None, 2 590080      C5_reduced[0][0]                 
__________________________________________________________________________________________________
P7 (Conv2D)                     (None, None, None, 2 590080      C6_relu[0][0]                    
__________________________________________________________________________________________________
regression_submodel (Model)     (None, None, 4)      2470960     P3[0][0]                         
                                                                 P4[0][0]                         
                                                                 P5[0][0]                         
                                                                 P6[0][0]                         
                                                                 P7[0][0]                         
__________________________________________________________________________________________________
anchors_0 (Anchors)             (None, None, 4)      0           P3[0][0]                         
__________________________________________________________________________________________________
anchors_1 (Anchors)             (None, None, 4)      0           P4[0][0]                         
__________________________________________________________________________________________________
anchors_2 (Anchors)             (None, None, 4)      0           P5[0][0]                         
__________________________________________________________________________________________________
anchors_3 (Anchors)             (None, None, 4)      0           P6[0][0]                         
__________________________________________________________________________________________________
anchors_4 (Anchors)             (None, None, 4)      0           P7[0][0]                         
__________________________________________________________________________________________________
regression (Concatenate)        (None, None, 4)      0           regression_submodel[1][0]        
                                                                 regression_submodel[2][0]        
                                                                 regression_submodel[3][0]        
                                                                 regression_submodel[4][0]        
                                                                 regression_submodel[5][0]        
__________________________________________________________________________________________________
anchors (Concatenate)           (None, None, 4)      0           anchors_0[0][0]                  
                                                                 anchors_1[0][0]                  
                                                                 anchors_2[0][0]                  
                                                                 anchors_3[0][0]                  
                                                                 anchors_4[0][0]                  
__________________________________________________________________________________________________
classification_submodel (Model) (None, None, 333)    11571100    P3[0][0]                         
                                                                 P4[0][0]                         
                                                                 P5[0][0]                         
                                                                 P6[0][0]                         
                                                                 P7[0][0]                         
__________________________________________________________________________________________________
boxes (RegressBoxes)            (None, None, 4)      0           anchors[0][0]                    
                                                                 regression[0][0]                 
__________________________________________________________________________________________________
classification (Concatenate)    (None, None, 333)    0           classification_submodel[1][0]    
                                                                 classification_submodel[2][0]    
                                                                 classification_submodel[3][0]    
                                                                 classification_submodel[4][0]    
                                                                 classification_submodel[5][0]    
__________________________________________________________________________________________________
clipped_boxes (ClipBoxes)       (None, None, 4)      0           image[0][0]                      
                                                                 boxes[0][0]                      
__________________________________________________________________________________________________
filtered_detections (FilterDete [(None, 100, 4), (No 0           clipped_boxes[0][0]              
                                                                 classification[0][0]             
__________________________________________________________________________________________________
shape_14 (Shape)                (4,)                 0           image[0][0]                      
__________________________________________________________________________________________________
roi_align_13 (RoiAlign)         (None, None, 14, 14, 0           shape_14[0][0]                   
                                                                 filtered_detections[0][0]        
                                                                 filtered_detections[0][1]        
                                                                 P3[0][0]                         
                                                                 P4[0][0]                         
                                                                 P5[0][0]                         
                                                                 P6[0][0]                         
                                                                 P7[0][0]                         
__________________________________________________________________________________________________
mask_submodel (Model)           (None, None, 28, 28, 3035981     roi_align_13[0][0]               
__________________________________________________________________________________________________
masks (ConcatenateBoxes)        (None, 100, 261076)  0           filtered_detections[0][0]        
                                                                 mask_submodel[1][0]              
==================================================================================================
Total params: 48,636,633
Trainable params: 25,075,481
Non-trainable params: 23,561,152
hgaiser commented 4 years ago

The mask_loss being 0 makes sense: it only becomes nonzero once positive detections are made; until then it stays at 0. It's strange that the regression and classification losses don't decrease, though. Do they decrease if you train without freezing?
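To make that concrete, here is a minimal sketch (plain NumPy, not the actual keras-maskrcnn loss code) of a loss that is only defined over positive matches, so it is exactly zero when there are none:

```python
import numpy as np

def mask_loss(y_true, y_pred, positive_indices):
    """Binary cross-entropy averaged over positive detections only.

    Hypothetical illustration: if no detection matched a ground-truth box,
    there is nothing to supervise and the loss is exactly 0.
    """
    if len(positive_indices) == 0:
        return 0.0
    t = y_true[positive_indices]
    p = np.clip(y_pred[positive_indices], 1e-7, 1 - 1e-7)
    return float(np.mean(-(t * np.log(p) + (1 - t) * np.log(1 - p))))
```

Early in training the detector produces no positive matches, so the reported masks_loss stays at 0.0000e+00 as in the log above.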

tpostadjian commented 4 years ago

I do not really get why masks_loss is supposed to be 0. Do you mean it can only differ from 0 once the net finds a positive match? Could you clarify the mask loss, please? I am not sure I understood it correctly. As for the regression and classification losses, they were decreasing, although regression decreased much more slowly than classification. I tried unfreezing, but a single epoch takes so long that I did not wait for it to complete; glancing at the output from time to time during that epoch, I did not notice any better improvement. What is the purpose of this loss? Should I care about it?

Thank you for your help by the way !

hgaiser commented 4 years ago

I do not really get why masks_loss is supposed to be 0. I mean, it can only be different from 0 when the net finds a positive match ?

You are right that the mask loss is nonzero once the net finds a positive match, but if it doesn't find any positive matches, what should the mask loss be? Since the mask loss can only be computed on positive matches, it is zero when there are none.

You could try your dataset on keras-retinanet first to see if you get results there. If you don't get any bounding box results then you're not going to get mask results here.

By the way, if you have a trained keras-retinanet model, you can use its weights to initialize keras-maskrcnn since they share mostly the same topology.
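Roughly, that weight transfer is a by-name copy: shared layers (backbone, FPN, box/class submodels) get the trained weights, while mask-specific layers keep their fresh initialization. In Keras this is what `model.load_weights(path, by_name=True, skip_mismatch=True)` does; here is a toy sketch of the idea with name-to-array dicts standing in for real layers:

```python
import numpy as np

def transfer_by_name(src, dst):
    """Copy weights from src into dst wherever names and shapes match.

    Toy illustration of by-name weight loading: layers present in both
    "models" with matching shapes are transferred, everything else is
    skipped and keeps its current (fresh) values.
    """
    loaded, skipped = [], []
    for name, w in dst.items():
        if name in src and src[name].shape == w.shape:
            dst[name] = src[name].copy()
            loaded.append(name)
        else:
            skipped.append(name)
    return loaded, skipped
```

In this sketch a backbone layer shared by both models would be loaded, while a mask-head layer that only exists in keras-maskrcnn would be skipped.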

tpostadjian commented 4 years ago

I followed your advice and checked whether keras_retinanet behaves as expected during training. Well, it does not! The regression_loss gets stuck around 2.95. With the debug tool from keras_maskrcnn, I managed to display some samples from the generators, and the anchors are correctly set, so I am quite unsure where the problem might be...

  159/14824 [..............................] - ETA: 5:25:14 - loss: 4.8434 - regression_loss: 2.9472 - classification_loss: 1.8962
  160/14824 [..............................] - ETA: 5:25:07 - loss: 4.8438 - regression_loss: 2.9453 - classification_loss: 1.8985
  161/14824 [..............................] - ETA: 5:24:56 - loss: 4.8432 - regression_loss: 2.9430 - classification_loss: 1.9002
  162/14824 [..............................] - ETA: 5:24:46 - loss: 4.8437 - regression_loss: 2.9407 - classification_loss: 1.9030
  163/14824 [..............................] - ETA: 5:24:35 - loss: 4.8416 - regression_loss: 2.9404 - classification_loss: 1.9012
  164/14824 [..............................] - ETA: 5:24:24 - loss: 4.8398 - regression_loss: 2.9404 - classification_loss: 1.8994
  165/14824 [..............................] - ETA: 5:24:15 - loss: 4.8414 - regression_loss: 2.9419 - classification_loss: 1.8995
  166/14824 [..............................] - ETA: 5:24:06 - loss: 4.8416 - regression_loss: 2.9414 - classification_loss: 1.9002
  167/14824 [..............................] - ETA: 5:23:56 - loss: 4.8419 - regression_loss: 2.9419 - classification_loss: 1.9000
  168/14824 [..............................] - ETA: 5:23:47 - loss: 4.8385 - regression_loss: 2.9413 - classification_loss: 1.8971
  169/14824 [..............................] - ETA: 5:23:37 - loss: 4.8366 - regression_loss: 2.9412 - classification_loss: 1.8954
  170/14824 [..............................] - ETA: 5:23:28 - loss: 4.8367 - regression_loss: 2.9414 - classification_loss: 1.8953
  171/14824 [..............................] - ETA: 5:23:21 - loss: 4.8349 - regression_loss: 2.9416 - classification_loss: 1.8934
  172/14824 [..............................] - ETA: 5:23:13 - loss: 4.8323 - regression_loss: 2.9408 - classification_loss: 1.8915
  173/14824 [..............................] - ETA: 5:23:06 - loss: 4.8321 - regression_loss: 2.9388 - classification_loss: 1.8933
  174/14824 [..............................] - ETA: 5:22:57 - loss: 4.8324 - regression_loss: 2.9395 - classification_loss: 1.8929
  175/14824 [..............................] - ETA: 5:22:50 - loss: 4.8328 - regression_loss: 2.9392 - classification_loss: 1.8936
  176/14824 [..............................] - ETA: 5:22:43 - loss: 4.8312 - regression_loss: 2.9401 - classification_loss: 1.8912
  177/14824 [..............................] - ETA: 5:22:35 - loss: 4.8304 - regression_loss: 2.9416 - classification_loss: 1.8888
  178/14824 [..............................] - ETA: 5:22:27 - loss: 4.8290 - regression_loss: 2.9423 - classification_loss: 1.8867
  179/14824 [..............................] - ETA: 5:22:18 - loss: 4.8289 - regression_loss: 2.9421 - classification_loss: 1.8868
  180/14824 [..............................] - ETA: 5:22:11 - loss: 4.8294 - regression_loss: 2.9419 - classification_loss: 1.8875
  181/14824 [..............................] - ETA: 5:22:04 - loss: 4.8284 - regression_loss: 2.9435 - classification_loss: 1.8849
  182/14824 [..............................] - ETA: 5:21:56 - loss: 4.8279 - regression_loss: 2.9423 - classification_loss: 1.8856
  183/14824 [..............................] - ETA: 5:21:48 - loss: 4.8291 - regression_loss: 2.9441 - classification_loss: 1.8850
  184/14824 [..............................] - ETA: 5:21:40 - loss: 4.8291 - regression_loss: 2.9436 - classification_loss: 1.8855
  185/14824 [..............................] - ETA: 5:21:33 - loss: 4.8285 - regression_loss: 2.9420 - classification_loss: 1.8865
  186/14824 [..............................] - ETA: 5:21:26 - loss: 4.8286 - regression_loss: 2.9426 - classification_loss: 1.8860
  187/14824 [..............................] - ETA: 5:21:18 - loss: 4.8261 - regression_loss: 2.9425 - classification_loss: 1.8836
  188/14824 [..............................] - ETA: 5:21:11 - loss: 4.8263 - regression_loss: 2.9437 - classification_loss: 1.8826
  189/14824 [..............................] - ETA: 5:21:03 - loss: 4.8253 - regression_loss: 2.9428 - classification_loss: 1.8824
  190/14824 [..............................] - ETA: 5:20:57 - loss: 4.8262 - regression_loss: 2.9437 - classification_loss: 1.8825
  191/14824 [..............................] - ETA: 5:20:50 - loss: 4.8254 - regression_loss: 2.9412 - classification_loss: 1.8842
  192/14824 [..............................] - ETA: 5:20:43 - loss: 4.8234 - regression_loss: 2.9409 - classification_loss: 1.8826
  193/14824 [..............................] - ETA: 5:20:36 - loss: 4.8240 - regression_loss: 2.9411 - classification_loss: 1.8829
  194/14824 [..............................] - ETA: 5:20:30 - loss: 4.8211 - regression_loss: 2.9401 - classification_loss: 1.8810
  195/14824 [..............................] - ETA: 5:20:24 - loss: 4.8193 - regression_loss: 2.9411 - classification_loss: 1.8782
  196/14824 [..............................] - ETA: 5:20:17 - loss: 4.8182 - regression_loss: 2.9418 - classification_loss: 1.8764
  197/14824 [..............................] - ETA: 5:20:10 - loss: 4.8160 - regression_loss: 2.9417 - classification_loss: 1.8743
  198/14824 [..............................] - ETA: 5:20:04 - loss: 4.8147 - regression_loss: 2.9414 - classification_loss: 1.8733
  199/14824 [..............................] - ETA: 5:19:58 - loss: 4.8131 - regression_loss: 2.9420 - classification_loss: 1.8711
  200/14824 [..............................] - ETA: 5:19:51 - loss: 4.8112 - regression_loss: 2.9425 - classification_loss: 1.8687
  201/14824 [..............................] - ETA: 5:19:46 - loss: 4.8091 - regression_loss: 2.9421 - classification_loss: 1.8669
  202/14824 [..............................] - ETA: 5:19:41 - loss: 4.8091 - regression_loss: 2.9428 - classification_loss: 1.8662
  203/14824 [..............................] - ETA: 5:19:35 - loss: 4.8073 - regression_loss: 2.9436 - classification_loss: 1.8637
  204/14824 [..............................] - ETA: 5:19:29 - loss: 4.8074 - regression_loss: 2.9448 - classification_loss: 1.8627
  205/14824 [..............................] - ETA: 5:19:24 - loss: 4.8074 - regression_loss: 2.9460 - classification_loss: 1.8615
  206/14824 [..............................] - ETA: 5:19:19 - loss: 4.8064 - regression_loss: 2.9457 - classification_loss: 1.8607
  207/14824 [..............................] - ETA: 5:19:13 - loss: 4.8050 - regression_loss: 2.9465 - classification_loss: 1.8586
  208/14824 [..............................] - ETA: 5:19:06 - loss: 4.8058 - regression_loss: 2.9479 - classification_loss: 1.8579
  209/14824 [..............................] - ETA: 5:19:00 - loss: 4.8049 - regression_loss: 2.9483 - classification_loss: 1.8566
  210/14824 [..............................] - ETA: 5:18:54 - loss: 4.8034 - regression_loss: 2.9493 - classification_loss: 1.8541
  211/14824 [..............................] - ETA: 5:18:49 - loss: 4.8034 - regression_loss: 2.9507 - classification_loss: 1.8527
  212/14824 [..............................] - ETA: 5:18:43 - loss: 4.8012 - regression_loss: 2.9510 - classification_loss: 1.8502
  213/14824 [..............................] - ETA: 5:18:37 - loss: 4.7981 - regression_loss: 2.9487 - classification_loss: 1.8494
  214/14824 [..............................] - ETA: 5:18:32 - loss: 4.7962 - regression_loss: 2.9488 - classification_loss: 1.8474
  215/14824 [..............................] - ETA: 5:18:27 - loss: 4.7943 - regression_loss: 2.9492 - classification_loss: 1.8451
  216/14824 [..............................] - ETA: 5:18:22 - loss: 4.7926 - regression_loss: 2.9487 - classification_loss: 1.8439

...

  309/14824 [..............................] - ETA: 5:12:50 - loss: 4.6459 - regression_loss: 2.9491 - classification_loss: 1.6967
  310/14824 [..............................] - ETA: 5:12:47 - loss: 4.6443 - regression_loss: 2.9488 - classification_loss: 1.6955
  311/14824 [..............................] - ETA: 5:12:44 - loss: 4.6423 - regression_loss: 2.9481 - classification_loss: 1.6942
  312/14824 [..............................] - ETA: 5:12:41 - loss: 4.6399 - regression_loss: 2.9472 - classification_loss: 1.6927
  313/14824 [..............................] - ETA: 5:12:38 - loss: 4.6390 - regression_loss: 2.9477 - classification_loss: 1.6913
hgaiser commented 4 years ago

And just for the record: when you visualized using debug.py, you probably used --annotations as well? What color were the annotations?

tpostadjian commented 4 years ago

Using Databricks, I had to tweak the debug code a bit. I essentially use display_annotations to check the images.

# Show trays with annotations + anchors + transformations
# (tweaked from keras-retinanet's debug.py to run on Databricks)
import matplotlib.pyplot as plt

from keras_retinanet.utils.anchors import anchors_for_shape, compute_gt_annotations
from keras_retinanet.utils.config import read_config_file, parse_anchor_parameters
from keras_retinanet.utils.visualization import draw_annotations, draw_boxes, draw_caption
from keras_maskrcnn.utils.visualization import draw_mask


def display_annotations(generator, image_index, config_file, random_transform=True,
                        resize=True, show_anchors=True, show_annotations=True, show_masks=True):
    anchor_cfg    = read_config_file(config_file)
    anchor_params = parse_anchor_parameters(anchor_cfg)

    # load the data (note: the flags are named show_* so they don't get
    # shadowed by the loaded annotations dict, as in my first version)
    image       = generator.load_image(image_index)
    annotations = generator.load_annotations(image_index)

    # apply random transformations
    if random_transform:
        image, annotations = generator.random_transform_group_entry(image, annotations)

    # resize the image, boxes and masks consistently
    if resize:
        image, image_scale     = generator.resize_image(image)
        annotations['bboxes'] *= image_scale
        for m in range(len(annotations['masks'])):
            annotations['masks'][m], _ = generator.resize_image(annotations['masks'][m])

    # compute anchors once, from the same anchor parameters used for training
    anchors = anchors_for_shape(image.shape, anchor_params=anchor_params)
    positive_indices, _, max_indices = compute_gt_annotations(anchors, annotations['bboxes'])

    # draw positive anchors on the image
    if show_anchors:
        draw_boxes(image, anchors[positive_indices], (255, 255, 0), thickness=1)

    # draw annotations on the image
    if show_annotations:
        # draw annotations in red
        draw_annotations(image, annotations, color=(0, 0, 255), label_to_name=generator.label_to_name)

        # draw regressed anchors in green to override most red annotations;
        # result: annotations without anchors stay red, those with anchors turn green
        draw_boxes(image, annotations['bboxes'][max_indices[positive_indices], :], (0, 255, 0))

    # draw masks over the image, coloured by class label
    if show_masks:
        for m in range(len(annotations['masks'])):
            # crop the mask to its bbox, then draw it
            box  = annotations['bboxes'][m].astype(int)
            mask = annotations['masks'][m][box[1]:box[3], box[0]:box[2]]
            draw_mask(image, box, mask, annotations['labels'][m].astype(int))
            # add the label caption
            caption = '{}'.format(generator.label_to_name(annotations['labels'][m]))
            draw_caption(image, box, caption)

    plt.imshow(image)
    display()  # Databricks display helper

Then I simply call the function:

sample_id = 1000
display_annotations(train_generator, sample_id, "path/to/anchor/config_file.ini")

The config file:

[anchor_parameters]
# Sizes should correlate to how the network processes an image, it is not advised to change these!
sizes   = 32 64 128 256 512
# Strides should correlate to how the network strides over an image, it is not advised to change these!
strides = 8 16 32 64 128
# The different ratios to use per anchor location.
ratios  = 0.5 1 2 3
# The different scaling factors to use per anchor location.
scales  = 1 1.2 1.6
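For reference, these ratios and scales give 4 × 3 = 12 anchors per pyramid location instead of the default 9 (3 ratios × 3 scales), so the same config file must also be passed at training time, since the number of anchors changes the model's output shapes. The parsing itself is simple; a stdlib-only sketch of what `read_config_file` / `parse_anchor_parameters` extract from the INI above:

```python
import configparser

# Stdlib-only sketch of reading the [anchor_parameters] section shown above;
# the project itself uses keras_retinanet's read_config_file / parse_anchor_parameters.
cfg = configparser.ConfigParser()
cfg.read_string("""
[anchor_parameters]
sizes   = 32 64 128 256 512
strides = 8 16 32 64 128
ratios  = 0.5 1 2 3
scales  = 1 1.2 1.6
""")

section = cfg['anchor_parameters']
sizes   = [int(v) for v in section['sizes'].split()]
strides = [int(v) for v in section['strides'].split()]
ratios  = [float(v) for v in section['ratios'].split()]
scales  = [float(v) for v in section['scales'].split()]

# Each pyramid level places len(ratios) * len(scales) anchors per location.
anchors_per_location = len(ratios) * len(scales)
print(anchors_per_location)  # 12
```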
tpostadjian commented 4 years ago

So yes, to answer your question, the annotations appear blue! (sorry about that)

hgaiser commented 4 years ago

So yes, to answer your question, the annotations appear blue! (sorry about that)

I suppose you mean green? Or maybe the color space is reversed for you? The images are resized for you, right (as in resize=True)?

You could also try creating a dataset of just one image to see if it learns anything meaningful at all. If not and if you can share your data, I would like to see it.
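(On the color question: keras-retinanet's generators load images with OpenCV via `read_image_bgr`, i.e. channels in BGR order, while matplotlib's `imshow` expects RGB, which alone can make red annotations render as blue. A quick NumPy sketch of the channel flip:)

```python
import numpy as np

# A 1x1 "image" holding a pure-red pixel as OpenCV loads it: BGR order.
bgr = np.array([[[0, 0, 255]]], dtype=np.uint8)

# Reversing the channel axis converts BGR -> RGB before handing it to imshow.
rgb = bgr[:, :, ::-1]

print(rgb[0, 0].tolist())  # [255, 0, 0] -> red again in RGB order
```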

tpostadjian commented 4 years ago

So I ran the training with only one training image (hence one class). The validation generator has five items of the same class. The losses don't improve even then...

Epoch 1/500

1/1 [==============================] - 2s 2s/step - loss: 4.0545 - regression_loss: 2.9200 - classification_loss: 1.1345 - val_loss: 4.2029 - val_regression_loss: 3.0699 - val_classification_loss: 1.1330
Running network: N/A% (0 of 5) |         | Elapsed Time: 0:00:00 ETA:  --:--:--
Running network:  20% (1 of 5) |#        | Elapsed Time: 0:00:00 ETA:   0:00:01
Running network:  40% (2 of 5) |###      | Elapsed Time: 0:00:00 ETA:   0:00:01
Running network:  60% (3 of 5) |#####    | Elapsed Time: 0:00:01 ETA:   0:00:00
Running network:  80% (4 of 5) |#######  | Elapsed Time: 0:00:01 ETA:   0:00:00
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:02 Time:  0:00:02
Parsing annotations: N/A% (0 of 5) |     | Elapsed Time: 0:00:00 ETA:  --:--:--
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000
Epoch 2/500

1/1 [==============================] - 1s 1s/step - loss: 3.5699 - regression_loss: 2.4362 - classification_loss: 1.1337 - val_loss: 4.2030 - val_regression_loss: 3.0701 - val_classification_loss: 1.1329
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:01 Time:  0:00:01
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000
Epoch 3/500

1/1 [==============================] - 1s 1s/step - loss: 4.2420 - regression_loss: 3.1099 - classification_loss: 1.1322 - val_loss: 4.2031 - val_regression_loss: 3.0703 - val_classification_loss: 1.1328
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:02 Time:  0:00:02
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000
Epoch 4/500

1/1 [==============================] - 1s 1s/step - loss: 3.7935 - regression_loss: 2.6593 - classification_loss: 1.1341 - val_loss: 4.2031 - val_regression_loss: 3.0704 - val_classification_loss: 1.1327
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:02 Time:  0:00:02
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000

Epoch 00004: ReduceLROnPlateau reducing learning rate to 9.999999747378752e-07.
Epoch 5/500

1/1 [==============================] - 2s 2s/step - loss: 4.2275 - regression_loss: 3.0977 - classification_loss: 1.1299 - val_loss: 4.2032 - val_regression_loss: 3.0704 - val_classification_loss: 1.1327
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:01 Time:  0:00:01
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000
Epoch 6/500

1/1 [==============================] - 1s 1s/step - loss: 4.0218 - regression_loss: 2.8903 - classification_loss: 1.1314 - val_loss: 4.2032 - val_regression_loss: 3.0705 - val_classification_loss: 1.1327
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:02 Time:  0:00:02
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000

Epoch 00006: ReduceLROnPlateau reducing learning rate to 9.999999974752428e-08.
Epoch 7/500

1/1 [==============================] - 1s 1s/step - loss: 4.2530 - regression_loss: 3.1214 - classification_loss: 1.1316 - val_loss: 4.2032 - val_regression_loss: 3.0705 - val_classification_loss: 1.1327
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:01 Time:  0:00:01
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000
Epoch 8/500

1/1 [==============================] - 1s 1s/step - loss: 4.0253 - regression_loss: 2.8842 - classification_loss: 1.1410 - val_loss: 4.2032 - val_regression_loss: 3.0705 - val_classification_loss: 1.1327
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:01 Time:  0:00:01
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000

Epoch 00008: ReduceLROnPlateau reducing learning rate to 1.0000000116860975e-08.
Epoch 9/500

1/1 [==============================] - 1s 1s/step - loss: 4.1003 - regression_loss: 2.9671 - classification_loss: 1.1333 - val_loss: 4.2032 - val_regression_loss: 3.0705 - val_classification_loss: 1.1327
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:01 Time:  0:00:01
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000
Epoch 10/500

1/1 [==============================] - 1s 1s/step - loss: 3.9792 - regression_loss: 2.8452 - classification_loss: 1.1340 - val_loss: 4.2032 - val_regression_loss: 3.0705 - val_classification_loss: 1.1327
Running network: 100% (5 of 5) |#########| Elapsed Time: 0:00:02 Time:  0:00:02
Parsing annotations: 100% (5 of 5) |#####| Elapsed Time: 0:00:00 Time:  0:00:00
5 instances of class OR0000000182019 with average precision: 0.0000
mAP: 0.0000
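For reference, the one-image dataset above is wired through the CSV generator, which only needs two small text files. A minimal sketch (all paths and the class name are hypothetical; if I read the keras-maskrcnn generator correctly, its annotation format appends a mask path after the class name):

```python
import csv
import os
import tempfile

# Hypothetical one-image dataset for the CSV generator.
tmp_dir  = tempfile.mkdtemp()
ann_path = os.path.join(tmp_dir, 'annotations.csv')
cls_path = os.path.join(tmp_dir, 'classes.csv')

# One annotation line: image path, box corners, class name, mask path.
with open(ann_path, 'w', newline='') as f:
    csv.writer(f).writerow(
        ['images/img_0.png', 10, 20, 110, 220, 'tray', 'masks/img_0_mask_0.png'])

# Class-name -> contiguous-id mapping (single class here).
with open(cls_path, 'w', newline='') as f:
    csv.writer(f).writerow(['tray', 0])

with open(ann_path) as f:
    row = next(csv.reader(f))
print(row[5])  # 'tray'
```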
hgaiser commented 4 years ago

Very strange.. is it possible for you to share this dataset?

tpostadjian commented 4 years ago

I cannot share more; the data is not mine and not free :/

hgaiser commented 4 years ago

Since one image is already failing, is it possible for you to make the image completely black and have the objects be solid white? Basically the segmentation. Could you share that?

Or otherwise if you can create a new dataset with some very simple white-on-black-background objects, verify that training fails and share that?

tpostadjian commented 4 years ago

I am not sure what you are asking: you want me to binarize the training images? The whole dataset, or just one image, and observe what results from it?

hgaiser commented 4 years ago

you want me to binarize the training images?

^ yes

just one image and observe what results from it?

^ yes

I'm not sure if that is okay for you to share, but I figured a binarized image doesn't hold much confidential information ;)

tpostadjian commented 4 years ago

OK, I kind of fixed it, although I am not sure why this works best.

I found out that loading weights pretrained on COCO made training Mask R-CNN successful (at least on my project):

import keras
import keras_retinanet.losses
import keras_maskrcnn.models
from keras_maskrcnn import losses

model = keras_maskrcnn.models.backbone('resnet50').maskrcnn(num_classes=train_generator.num_classes())
model.load_weights('resnet50_coco_best_v2.1.0.h5', by_name=True, skip_mismatch=True)
model.compile(
    loss={
        'regression'    : keras_retinanet.losses.smooth_l1(),
        'classification': keras_retinanet.losses.focal(),
        'masks'         : losses.mask(),
    },
    optimizer=keras.optimizers.adam(lr=1e-5, clipnorm=0.001)
)
)

Training now behaves much more like what one usually expects, with the losses actually decreasing!

    1/12500 [..............................] - ETA: 139:16:57 - loss: 5.5880 - regression_loss: 1.5341 - classification_loss: 3.3609 - masks_loss: 0.6930
    2/12500 [..............................] - ETA: 72:43:28 - loss: 5.1688 - regression_loss: 1.7622 - classification_loss: 2.7137 - masks_loss: 0.6929 
    3/12500 [..............................] - ETA: 50:33:39 - loss: 5.1965 - regression_loss: 2.0356 - classification_loss: 2.4680 - masks_loss: 0.6929

...

  150/12500 [..............................] - ETA: 7:08:34 - loss: 4.2019 - regression_loss: 1.3782 - classification_loss: 2.1813 - masks_loss: 0.6424
  151/12500 [..............................] - ETA: 7:08:11 - loss: 4.1961 - regression_loss: 1.3763 - classification_loss: 2.1771 - masks_loss: 0.6427
  152/12500 [..............................] - ETA: 7:07:49 - loss: 4.1932 - regression_loss: 1.3764 - classification_loss: 2.1738 - masks_loss: 0.6430

...

 1835/12500 [===>..........................] - ETA: 5:32:44 - loss: 2.8596 - regression_loss: 0.8731 - classification_loss: 1.4586 - masks_loss: 0.5279
 1836/12500 [===>..........................] - ETA: 5:32:42 - loss: 2.8590 - regression_loss: 0.8727 - classification_loss: 1.4583 - masks_loss: 0.5280
 1837/12500 [===>..........................] - ETA: 5:32:40 - loss: 2.8589 - regression_loss: 0.8728 - classification_loss: 1.4581 - masks_loss: 0.5281

...

14887/14890 [============================>.] - ETA: 5s - loss: 1.8603 - regression_loss: 0.5800 - classification_loss: 0.8524 - masks_loss: 0.4279
14888/14890 [============================>.] - ETA: 3s - loss: 1.8602 - regression_loss: 0.5799 - classification_loss: 0.8523 - masks_loss: 0.4279
14889/14890 [============================>.] - ETA: 1s - loss: 1.8601 - regression_loss: 0.5799 - classification_loss: 0.8523 - masks_loss: 0.4279

Also, I got rid of all the callbacks from keras_maskrcnn / keras_retinanet and just used a basic checkpoint that saves the best network, monitoring val_loss.
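In Keras that is `keras.callbacks.ModelCheckpoint('best.h5', monitor='val_loss', save_best_only=True)`. The save-best-only behaviour is easy to reason about: keep the lowest val_loss seen so far and write weights only when it improves. A framework-free sketch of that bookkeeping (`save_fn` is a stand-in for actually saving weights):

```python
class BestValLossCheckpoint:
    """Minimal save-best-only bookkeeping, mirroring what a checkpoint
    callback monitoring val_loss does at the end of each epoch."""

    def __init__(self, save_fn):
        self.save_fn = save_fn       # called only when val_loss improves
        self.best    = float('inf')

    def on_epoch_end(self, epoch, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.save_fn(epoch)

saved = []
ckpt  = BestValLossCheckpoint(saved.append)
for epoch, val_loss in enumerate([4.20, 4.21, 3.90, 3.95, 3.80]):
    ckpt.on_epoch_end(epoch, val_loss)

print(saved)  # epochs that improved val_loss: [0, 2, 4]
```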