The loss drops sharply to negative after 2/3 iterations and it never recovers

lmb-freiburg / Unet-Segmentation

The U-Net Segmentation plugin for Fiji (ImageJ)

GNU General Public License v3.0

Hello,

recently I started using U-Net for 2D cell detection and it works really well. However, sometimes when I try to finetune a model, the loss drops sharply to negative values after 2/3 iterations and never recovers. My images are 1024x1024 pixels. I use the multipoint tool to annotate the different cell types (usually two), plus one class for colocalization. I always keep the element size at 0.5. I suspect this could be the issue, because I changed it a few times and saw some difference, but I'm not sure how to choose the best element size. When I use the values suggested by clicking "from image", I still get the same sharp drop of the loss to negative values.

It would be really nice if you could help me with this.

Thank you in advance!

What do you mean by negative? It's a softmax loss, which is by definition non-negative, so this must be a strange numerical issue. NaNs are possible, and so is Infinity if you choose too high a learning rate, but I have never encountered negative values. To me it sounds like the input is corrupt, although I could not say what one would have to do to the input images to get a negative loss.
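To illustrate why a negative value is unexpected, here is a minimal sketch in plain Python of a per-pixel softmax cross-entropy loss. It is not the plugin's actual implementation, and the function names `softmax_cross_entropy` and `has_non_finite` are hypothetical. For finite scores the loss is -log(p) with 0 < p <= 1, so it cannot drop below zero, whereas a NaN in the input propagates straight into the loss.

```python
import math


def softmax_cross_entropy(scores, label):
    """Loss for one pixel: -log(softmax(scores)[label]), computed stably."""
    m = max(scores)
    # log-sum-exp trick: the sum contains exp(0) = 1, so its log is >= 0
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    # log_sum_exp >= m >= scores[label], hence the result is never negative
    return log_sum_exp - scores[label]


def has_non_finite(values):
    """Return True if any value is NaN or +/-Infinity (a sign of corrupt input)."""
    return any(not math.isfinite(v) for v in values)


# Illustrative scores for three classes (made-up numbers).
confident = softmax_cross_entropy([10.0, -5.0, -5.0], label=0)
wrong = softmax_cross_entropy([10.0, -5.0, -5.0], label=1)
print(confident >= 0.0, wrong > confident)

# A single NaN in the scores yields a NaN loss rather than a negative one.
print(math.isnan(softmax_cross_entropy([float("nan"), 0.0], label=0)))
```

If a reported loss really is negative, this suggests looking at what is fed into the loss (pixel values, labels, weights) rather than at the loss formula itself.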

What is the raw element size of your images? For tissues recorded with a light microscope, 0.5 is usually a good starting point (approx. double the resolution limit). If you have large tissue areas and large cell bodies, you can increase it to 1 or more (as long as you can easily identify the cells by eye, everything should be fine). For huge cells you might need to increase it so that they fit into the receptive field of the network (a few hundred pixels, depending on the number of network stages).
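The arithmetic behind the element size can be sketched as follows. This assumes that images are rescaled from their raw element size to the chosen model element size before processing. The raw value of 0.25 per pixel and the 200-pixel receptive field are purely hypothetical numbers (the original poster did not state theirs), and the helper names are illustrative only.

```python
def rescaled_shape(shape_px, raw_size, model_size):
    """Image shape in pixels after rescaling from raw to model element size."""
    factor = raw_size / model_size
    return tuple(max(1, round(n * factor)) for n in shape_px)


def physical_extent(n_px, element_size):
    """Physical length covered by n_px pixels at the given element size."""
    return n_px * element_size


# Hypothetical 1024x1024 image with a raw element size of 0.25.
print(rescaled_shape((1024, 1024), 0.25, 0.5))  # (512, 512)
print(rescaled_shape((1024, 1024), 0.25, 1.0))  # (256, 256)

# A hypothetical receptive field of 200 pixels covers more tissue
# when the model element size is larger.
print(physical_extent(200, 0.5), physical_extent(200, 1.0))  # 100.0 200.0
```

Doubling the element size halves the pixel dimensions of the rescaled image and doubles the physical area each receptive field sees, which is why larger values suit larger cells.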

If you think your input images are fine, try reducing the learning rate to 1E-5 or 1E-6. If training stabilizes, continue with these values.
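Why a lower learning rate can stabilize training is easy to see on a toy problem. The sketch below runs plain gradient descent on f(x) = x^2, which is not the U-Net loss, and the step sizes are chosen only for illustration. It shows the same qualitative behaviour: too large a step makes the value blow up to Infinity and then NaN, while a smaller step converges.

```python
import math


def gradient_descent(lr, steps=50, x0=1.0):
    """Minimize f(x) = x**2 with a fixed learning rate; return the final f(x)."""
    x = x0
    for _ in range(steps):
        x -= lr * 2.0 * x  # the gradient of x**2 is 2x
    return x * x


# Step too large: every update overshoots and the value grows.
print(gradient_descent(lr=1.5) > 1e20)

# Run long enough and the floats overflow to Infinity, then turn into NaN.
print(math.isnan(gradient_descent(lr=1.5, steps=2000)))

# Smaller step: the value shrinks towards the minimum at 0.
print(gradient_descent(lr=0.1) < 1e-6)
```

Note that even in the diverging case the values become huge, infinite, or NaN, never negative.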

The loss drops sharply to negative after 2/3 iterations and it never recovers #57