Closed PhucLee2605 closed 2 years ago
In semantic segmentation models, the output should have the same size as the input. Which part are you struggling with?
Yes, I understand that a semantic segmentation model returns an output with the same size as its input. In case my previous question was not clear: we resize the input images when training the model (e.g. to 256x256), but at prediction time I don't have to resize the image to the training size before feeding it to the model. That confuses me. So again, why is there a difference in input size between training and prediction? For now, I have learned that for a fully convolutional network the input size does not matter, because the network only applies convolutional kernels, which slide over inputs of any size. But the backbone of this repo's model is ResNet, so is that why we have to resize before training?
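To illustrate the point about fully convolutional networks, here is a minimal sketch assuming a PyTorch-style model (this is not the repo's ResUnet; `TinyFCN` and its layers are made up for the example). Because every layer is convolutional, the model accepts any HxW, and interpolating the logits back to the input size is one common way a model returns a mask that exactly matches the original image:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFCN(nn.Module):
    """Toy fully convolutional net: strided-conv encoder + transposed-conv decoder."""
    def __init__(self, num_classes=1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, num_classes, 2, stride=2),
        )

    def forward(self, x):
        out = self.decoder(self.encoder(x))
        # Interpolate the logits back to the exact input size, so the predicted mask
        # matches the original image even when its size is not divisible by the
        # network's total stride.
        return F.interpolate(out, size=x.shape[-2:], mode="bilinear", align_corners=False)

model = TinyFCN().eval()
with torch.no_grad():
    for h, w in [(256, 256), (269, 535)]:   # training size and an arbitrary original size
        pred = model(torch.randn(1, 3, h, w))
        print(pred.shape)                    # (1, 1, 256, 256) and (1, 1, 269, 535)
```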
You are right about fully convolutional networks: they can take input of any size. As for why we need to resize the training images: during training, input images are usually fed to the model in batches, so all images in a batch need to be the same size.
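A minimal illustration of that batching constraint, assuming a PyTorch-style pipeline (the tensors and sizes below are made up; this is not the repo's data loader). Stacking samples into one batch tensor only works when every image has the same spatial size, which is why a fixed resize (e.g. 256x256) is applied during training:

```python
import torch
import torch.nn.functional as F

a = torch.randn(3, 256, 256)   # image already at the training size
b = torch.randn(3, 269, 535)   # image at its original size

try:
    torch.stack([a, b])         # default batching: fails because the shapes differ
except RuntimeError as e:
    print("stack failed:", e)

# After resizing both images to a common training size, batching works.
a_r = F.interpolate(a.unsqueeze(0), size=(256, 256), mode="bilinear", align_corners=False)
b_r = F.interpolate(b.unsqueeze(0), size=(256, 256), mode="bilinear", align_corners=False)
batch = torch.cat([a_r, b_r])   # shape: (2, 3, 256, 256)
print(batch.shape)
```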
📚 Documentation
So far I have used ResUnet for the line-segmentation part of my table-reconstruction project. The input image size for training is 256x256. When I pass an image (original size 535x269) for prediction, there is no conflict about the input size; moreover, the prediction output has the same size as the original image. I am wondering why that is; can anyone give me an explanation?
PS: I have read the source code and see the resize step in the transform for the training input. Did I miss any line or any information?
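For reference, a training transform of the kind mentioned in the PS often looks like the sketch below (an assumed torchvision-style pipeline, not the repo's actual code). The fixed resize typically appears only in the training path; at prediction time the image can simply be converted to a tensor at its original size:

```python
from torchvision import transforms

# Assumed torchvision-style transforms, for illustration only.
train_transform = transforms.Compose([
    transforms.Resize((256, 256)),   # fixed size so images can be batched during training
    transforms.ToTensor(),
])

predict_transform = transforms.ToTensor()   # no resize needed for a fully convolutional model
```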