jakeret / tf_unet

Generic U-Net Tensorflow implementation for image segmentation
GNU General Public License v3.0
1.9k stars 748 forks

Output dimension #183

Closed AziziShekoofeh closed 6 years ago

AziziShekoofeh commented 6 years ago

Hi,

I am trying to use your package to train a U-Net model on a dataset of JPEG images with dimensions 128x128x3 and binary masks of 128x128x2. The logits I can read from the network (node "div_1") have the dimension 36x36x2. These logits are then used as the input to the pixel_wise_softmax_2 layer to generate the "predictor", which is finally used to calculate the error rate. To align the sizes of the ground truth and the prediction you also use crop_to_shape, which seems a bit odd to me. In many other implementations, U-Net is supposed to produce a result of the same size as the input. Any insight into this problem, or how to solve it? I expected to see an upsampling step, but I could not find one.

Best, Shekoofeh
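The 128 → 36 shrinkage described above is consistent with a U-Net built from unpadded ("valid") 3x3 convolutions, as in the original Ronneberger et al. architecture. A minimal sketch of the size arithmetic, assuming two 3x3 valid convolutions per level and a network depth of 4 (the depth is an assumption inferred from the reported numbers, not read from the poster's config):

```python
def valid_unet_output_size(in_size, layers=4):
    """Track the spatial size through a U-Net that uses unpadded
    ('valid') 3x3 convolutions: each conv trims 2 pixels, each level
    applies two convs, with 2x2 pooling down and 2x2 up-convolutions up."""
    size = in_size
    for _ in range(layers - 1):   # contracting path
        size = (size - 4) // 2    # two 3x3 valid convs, then 2x2 max-pool
    size -= 4                     # bottom level: two 3x3 valid convs
    for _ in range(layers - 1):   # expanding path
        size = size * 2 - 4       # 2x2 up-conv, then two 3x3 valid convs
    return size

print(valid_unet_output_size(128, layers=4))  # → 36, matching the 36x36x2 logits
print(valid_unet_output_size(572, layers=5))  # → 388, the sizes in the original paper
```

With depth 4, a 128x128 input yields exactly the 36x36 logits reported, which is why crop_to_shape is needed to align the ground truth with the prediction.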

jakeret commented 6 years ago

The upsampling is implemented here. However, as the original Ronneberger et al. paper explains, the resulting prediction is expected to be smaller than the input image, because the network uses unpadded ("valid") convolutions. One approach to address this is to mirror the edges of the input. There are many explanations of how to do that if you look through the issues.
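The edge-mirroring idea mentioned above can be sketched with NumPy's reflect padding: enlarge the input so that, after the network's valid convolutions shrink it, the prediction still covers the original label area (any leftover margin is then cropped away). The helper name and the pad width of 46 pixels per side are illustrative assumptions, not tf_unet API:

```python
import numpy as np

def mirror_pad(image, pad):
    # Pad only the spatial dimensions by reflecting the image at its
    # borders; the channel dimension is left untouched.
    return np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")

img = np.zeros((128, 128, 3), dtype=np.float32)
padded = mirror_pad(img, 46)       # 46 px per side is a hypothetical choice
print(padded.shape)                # → (220, 220, 3)
```

In practice the pad width is chosen so the shrunken network output is at least as large as the label mask, and the prediction is then cropped back to the label size.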