angelolab / Nimbus


Vanilla unet with valid/same/mirror padding #73

Closed JLrumberger closed 11 months ago

JLrumberger commented 11 months ago

What is the purpose of this PR?

This PR closes #68 by switching to a vanilla U-Net and translation-equivariant tile & stitch inference with valid convolutions.

How did you implement your changes

I adapted the code from https://github.com/jakeret/unet and added batch-norm layers and mirror padding.
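For illustration, here is a minimal sketch (not the PR's actual code) of such a conv block in TensorFlow/Keras: mirror padding is done with `tf.pad(..., mode="REFLECT")` followed by a valid convolution, and the `pad_mode` switch is a hypothetical parameter covering the valid/same/mirror variants in the title.

```python
import tensorflow as tf

def conv_block(x, filters, pad_mode="mirror"):
    """Two 3x3 convs with batch norm; pad_mode in {"mirror", "same", "valid"}."""
    for _ in range(2):
        if pad_mode == "mirror":
            # Reflect-pad by 1 px so the 3x3 valid conv keeps the spatial size.
            x = tf.pad(x, [[0, 0], [1, 1], [1, 1], [0, 0]], mode="REFLECT")
            conv_padding = "valid"
        else:
            conv_padding = pad_mode  # "same" or "valid"
        x = tf.keras.layers.Conv2D(filters, 3, padding=conv_padding, use_bias=False)(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.ReLU()(x)
    return x
```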

Remaining issues

None

JLrumberger commented 11 months ago

My guess is that padding in the convolutions caused these issues. With zero padding, convolutions are no longer shift-equivariant: the network can learn to encode the absolute position of features in its predictions, which produces the tiling shadows mentioned in #68. The previous backbone was a ResNet50; if you turn off padding in its conv layers, each 3x3 convolution shrinks the intermediate feature maps by 2 px in X and Y, so for a network that deep the output becomes very small (assuming a constant input image size). My plan is to use a regular U-Net (with fewer encoder layers than the previous backbone) and train it with reflective padding. For inference we then use valid (i.e. no) padding; the network is provably translation equivariant and the tiling shadows are gone.
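For concreteness, a minimal numpy sketch of tile & stitch inference with a valid-padding network. `model` and its per-side shrinkage `margin` are placeholders, not the PR's API (92 px per side happens to be the classic U-Net's border for a 572 px input, used here purely as a default).

```python
import math
import numpy as np

def tile_and_stitch(image, model, tile=256, margin=92):
    """image: (H, W, C); model maps a (tile, tile, C) patch to a
    (tile - 2*margin, tile - 2*margin, C_out) prediction (valid padding)."""
    h, w = image.shape[:2]
    step = tile - 2 * margin          # valid output size per tile
    ny, nx = math.ceil(h / step), math.ceil(w / step)
    # Reflect-pad once so every output pixel sees (mirrored) context
    # and all tiles fit exactly.
    padded = np.pad(
        image,
        ((margin, ny * step - h + margin), (margin, nx * step - w + margin), (0, 0)),
        mode="reflect",
    )
    out = None
    for ty in range(ny):
        for tx in range(nx):
            y, x = ty * step, tx * step
            pred = model(padded[y:y + tile, x:x + tile])
            if out is None:
                out = np.zeros((h, w, pred.shape[-1]), dtype=pred.dtype)
            # Valid outputs abut exactly, so stitching is a plain paste.
            out[y:y + step, x:x + step] = pred[:min(step, h - y), :min(step, w - x)]
    return out
```

Because the convolutions are valid, each tile's prediction is exactly what a whole-image forward pass would produce at those pixels, which is why the tiles can simply be pasted edge to edge without blending.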

An alternative would be to optimize for equivariance directly: predict two outputs, one from the original input and one from a shifted copy, and enforce pixel-wise consistency between them after undoing the shift on the output. The latter approach comes with no guarantees, while for the former method it has been proven that tile & stitch inference is equivalent to predicting the whole image at once.
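A minimal sketch of what such a consistency objective could look like, assuming a hypothetical `model` and implementing the shift with `tf.roll` (circular), crudely cropping the wrapped-around border; this illustrates the idea only and is not code from this repo.

```python
import tensorflow as tf

def shift_consistency_loss(model, x, dy=8, dx=8):
    """x: (B, H, W, C). Penalize disagreement between the prediction on x
    and the back-shifted prediction on a shifted copy of x."""
    pred = model(x)
    x_shift = tf.roll(x, shift=[dy, dx], axis=[1, 2])
    pred_shift = tf.roll(model(x_shift), shift=[-dy, -dx], axis=[1, 2])
    # Crop the wrapped-around border (a crude crop; a stricter version would
    # also discard the receptive field's spillover near the edges).
    p = pred[:, dy:-dy, dx:-dx]
    q = pred_shift[:, dy:-dy, dx:-dx]
    return tf.reduce_mean(tf.square(p - q))
```

This loss only pushes the network toward equivariance on the shifts it sees during training, which is why it offers no guarantee, unlike valid convolutions.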