matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Tensorflow 2 loss issue #2514

Open lovehell opened 3 years ago

lovehell commented 3 years ago

Hi,

I've managed to train a model with TensorFlow 2.x using this wonderful repo: https://github.com/akTwelve/Mask_RCNN

However, I've noticed that in multistep training, after changing the learning rate (i.e. loading the model again with a smaller learning rate), the accumulated loss is no longer equal to the sum of the individual losses. It is actually twice that sum.

It seems that most TF2 ports of the repo remove these lines from model.py:

self.keras_model._losses = []
self.keras_model._per_input_losses = {}

They are removed because these are private variables. However, they are supposed to clear the losses set during the previous training step. Hence, in TF2 implementations the losses are added a second time, so the summed loss is doubled.
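To make the mechanism concrete, here is a toy sketch in plain Python, with no TensorFlow involved. `ToyModel` and `ToyTrainer` are hypothetical stand-ins, not classes from Mask_RCNN or Keras, and the loss values are made up. It only shows how re-running a compile step that appends losses, without clearing or skipping the ones already added, doubles the total. The `guard` flag illustrates one possible direction (remembering which losses were already registered); I have not verified that this carries over to the real Keras model.

```python
# Toy stand-in for a model that keeps a list of added losses.
# ToyModel and ToyTrainer are hypothetical, for illustration only.
class ToyModel:
    def __init__(self):
        self.losses = []

    def add_loss(self, value):
        self.losses.append(value)

    def total_loss(self):
        return sum(self.losses)


class ToyTrainer:
    # Made-up constant loss values keyed by loss name.
    LOSS_VALUES = {"rpn_class_loss": 0.5, "mrcnn_mask_loss": 0.25}

    def __init__(self, guard):
        self.model = ToyModel()
        self.guard = guard
        self._registered = set()  # names of losses already added

    def compile(self):
        for name, value in self.LOSS_VALUES.items():
            # With the guard on, skip losses added by an earlier compile().
            if self.guard and name in self._registered:
                continue
            self.model.add_loss(value)
            self._registered.add(name)

    def train(self):
        # Every training stage compiles again, as in multistep training.
        self.compile()
        return self.model.total_loss()


unguarded = ToyTrainer(guard=False)
first_unguarded = unguarded.train()    # losses added once
second_unguarded = unguarded.train()   # same losses added a second time

guarded = ToyTrainer(guard=True)
first_guarded = guarded.train()
second_guarded = guarded.train()       # nothing new is added

print(first_unguarded, second_unguarded)
print(first_guarded, second_guarded)
```

In the unguarded case the second stage reports double the first stage's total, which matches the factor of two I see; in the guarded case the total stays the same across stages.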

I'm trying to find a way to reset the losses when performing the second training step. Any help would be appreciated.

Thanks!

AndySung320 commented 3 years ago

I've also noticed the same problem, but haven't found any solution other than interrupting the training and restarting it.