Mozijie255 closed this 2 years ago
I think I got it. It seems the input image's size must be divisible by 16 because of the downsampling setup; otherwise the feature maps end up with inconsistent sizes during upsampling.
@Mozijie255 Yes, you're right! It depends on how many times the max-pooling operation is used in the encoder part. With 4 max-pooling layers, the image size must be divisible by 2^4 = 16.
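For anyone hitting the same mismatch, here is a minimal sketch of the arithmetic in plain Python. It assumes each max-pooling layer halves the size with floor rounding (2x2, stride 2, no padding) and each upsampling step doubles it; that is my assumption, not something confirmed from the M-SegNet code. The helper names `padded_size` and `roundtrip_size` are hypothetical and only for illustration.

```python
def padded_size(size: int, num_pools: int = 4) -> int:
    """Round size up to the next multiple of 2**num_pools."""
    factor = 2 ** num_pools
    return ((size + factor - 1) // factor) * factor


def roundtrip_size(size: int, num_pools: int = 4) -> int:
    """Size after num_pools floor-halvings followed by num_pools doublings.

    Assumes 2x2 max pooling with stride 2 (floor rounding) and 2x upsampling.
    """
    for _ in range(num_pools):
        size //= 2  # each pooling layer halves the size, dropping any remainder
    for _ in range(num_pools):
        size *= 2   # each upsampling layer doubles the size
    return size


if __name__ == "__main__":
    for h in (250, 256):
        print(h, "->", roundtrip_size(h), "| pad to", padded_size(h))
```

If the round-trip size differs from the input size, the decoder output will not line up with the encoder features, so padding or resizing the input to `padded_size(...)` before the forward pass is one way to avoid the error.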
Has anybody come across this problem when using a random input picture to generate a density map through M-SegNet? Why does shape[2] of these 3 tensors differ?