About training on half resolution cityscapes dataset

SJTUsuperxu commented 5 years ago

Hi @orsic, nice paper on semantic segmentation. I'm currently trying to reproduce the results in your paper. I notice that you report the model's performance at (512, 1024) resolution: 70.2 mIoU and 134.9 FPS. A couple of things are not clear to me from the paper:

When training, do you first resize the original image from (1024, 2048) to (512, 1024) and then crop square (448, 448) patches for training?
When testing, do you also resize the image to (512, 1024) and simply forward it through the network, with no multi-scale testing? I'm looking forward to your reply. Thanks!

lxtGH commented 5 years ago

I have the same question.

orsic commented 5 years ago

@SJTUsuperxu @lxtGH

Yes. The image is subsampled first, then a random 448x448 patch is cropped.
When testing, it is crucial to upsample the output back to 1024x2048 and evaluate on original labels from the dataset.
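
For readers following along, here is a minimal NumPy sketch of the two steps described above: subsample, then crop a random 448x448 patch for training; upsample the half-resolution prediction back to 1024x2048 before scoring against the original labels. It is not the swiftnet code. The function names are hypothetical, and the interpolation choices (every-second-pixel subsampling, nearest-neighbour upsampling of the class map) are assumptions, since the thread does not say which method is used. The per-image `mean_iou` is also a simplification for illustration; benchmark scores are normally accumulated over the whole validation set.

```python
import numpy as np


def subsample_half(array):
    """Halve the resolution by keeping every second pixel.

    Assumption: the thread does not state the interpolation method.
    """
    return array[::2, ::2]


def random_crop(image, labels, size=448, rng=None):
    """Cut the same random size x size window out of an image and its labels."""
    rng = rng or np.random.default_rng()
    height, width = labels.shape[:2]
    top = rng.integers(0, height - size + 1)
    left = rng.integers(0, width - size + 1)
    window = (slice(top, top + size), slice(left, left + size))
    return image[window], labels[window]


def upsample_to_full(pred_half, factor=2):
    """Nearest-neighbour upsampling of an (H, W) class map (assumed method)."""
    return np.repeat(np.repeat(pred_half, factor, axis=0), factor, axis=1)


def mean_iou(pred, labels, num_classes=19, ignore_index=255):
    """Per-image mean IoU over the classes that occur, skipping ignored pixels."""
    valid = labels != ignore_index
    ious = []
    for cls in range(num_classes):
        pred_mask = (pred == cls) & valid
        label_mask = (labels == cls) & valid
        union = (pred_mask | label_mask).sum()
        if union:
            ious.append((pred_mask & label_mask).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


# Random stand-ins for one full-resolution image and its label map.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(1024, 2048, 3), dtype=np.uint8)
labels = rng.integers(0, 19, size=(1024, 2048))

# Training: subsample to 512x1024 first, then crop a random 448x448 patch.
image_half, labels_half = subsample_half(image), subsample_half(labels)
patch, patch_labels = random_crop(image_half, labels_half, size=448, rng=rng)

# Testing: the half-resolution prediction (here a stand-in for the model
# output) is upsampled to 1024x2048 and scored against the original labels.
pred_half = labels_half
pred_full = upsample_to_full(pred_half)
score = mean_iou(pred_full, labels)
```

Scoring `pred_half` directly against subsampled labels would measure a different quantity than the full-resolution benchmark, which is why the upsampling step matters.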

orsic / swiftnet

About training on half resolution cityscapes dataset #9