sun-asterisk-research / table_reconstruction


Prediction does not require the same image size as in training #37

Closed PhucLee2605 closed 2 years ago

PhucLee2605 commented 2 years ago

📚 Documentation

For now I have used ResUnet for the line segmentation part of my table reconstruction project. The input image size for training is 256x256. When I pass an image (original size 535x269) for prediction, there is no conflict about the input size; moreover, the prediction output has the same size as the original. I am wondering about that. Can anyone give me an explanation?

PS: I have read the source code and saw the resize step in the transform for the training input. Did I miss any line or any information?
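
For reference, here is a hypothetical sketch of what I mean (using torchvision; these transform names are my assumptions, not necessarily this repository's code): the resize appears only on the training path, while the prediction path keeps the original resolution.

```python
from torchvision import transforms

# Training transform (hypothetical): every image is resized to a fixed
# 256x256 before being turned into a tensor.
train_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])

# Prediction transform (hypothetical): no Resize, so a 535x269 image
# goes through the model at its original resolution.
predict_transform = transforms.ToTensor()
```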

HieuBui99 commented 2 years ago

In semantic segmentation models, the output should have the same size as the input. Which part are you struggling with?

PhucLee2605 commented 2 years ago

> In semantic segmentation models, the output should have the same size as the input. Which part are you struggling with?

Yes, I understand that a semantic segmentation model returns output of the same size as its input. In case my previous question was not clear: we resize input images when training the model (e.g. to 256x256), but at prediction time I don't have to resize an image to the training size before feeding it to the model. That is what confuses me. So again, why is there a difference in input size between the training and prediction stages? I have since discovered that a fully convolutional network does not care about the input size, because its convolution kernels simply slide over whatever spatial extent they are given. But the backbone of this repo's model is ResNet, so is that why we have to resize before training?
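
Here is a minimal PyTorch sketch (my own toy model, not this repo's ResUnet) showing that a purely convolutional network accepts both the training size and the original size, and that the output keeps the input's spatial dimensions:

```python
import torch
import torch.nn as nn

# Toy fully convolutional "segmentation" model: convolutions only, so
# the output's height and width follow the input's.
fcn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=3, padding=1),  # 1-channel line mask
)

for h, w in [(256, 256), (269, 535)]:  # training size vs. original size
    x = torch.randn(1, 3, h, w)
    y = fcn(x)
    print(tuple(x.shape), "->", tuple(y.shape))  # same HxW in and out
```

One caveat: an encoder-decoder like ResUnet downsamples and then upsamples, so in practice the input dimensions may need to be divisible by the total downsampling factor (or be padded internally) for the output to match the input exactly.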

HieuBui99 commented 2 years ago

You are right about fully convolutional networks; they can take inputs of any size. As for why we need to resize the training images: during training, input images are usually fed to the model in batches, and all images in a batch must have the same size so they can be stacked into a single tensor.
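
A quick illustration of that constraint (a generic PyTorch sketch, not code from this repo): tensors with different spatial sizes cannot be stacked into one batch, which is exactly what the training-time resize avoids.

```python
import torch
import torch.nn.functional as F

a = torch.randn(3, 256, 256)   # resized training image
b = torch.randn(3, 269, 535)   # original-resolution image

try:
    torch.stack([a, b])        # fails: mismatched spatial sizes
except RuntimeError as e:
    print("cannot batch:", e)

# After resizing b to 256x256, batching works:
b_resized = F.interpolate(b.unsqueeze(0), size=(256, 256)).squeeze(0)
batch = torch.stack([a, b_resized])
print(batch.shape)             # torch.Size([2, 3, 256, 256])
```

At prediction time you typically run one image at a time (batch size 1), so nothing forces a common size and the original resolution can be used.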