qubvel / segmentation_models

Segmentation models with pretrained backbones. Keras and TensorFlow Keras.
MIT License
4.72k stars 1.03k forks

about preprocessing #277

Open LeeJayHui opened 4 years ago

LeeJayHui commented 4 years ago

Hi, I am wondering what `sm.get_preprocessing` actually does. For example, assume we are using `BACKBONE = 'resnet34'` and `preprocess_input = sm.get_preprocessing(BACKBONE)`. I traced the code back to around line 385 of `/my_virtual_environment/lib/python3.6/site-packages/classification_models/models/resnet.py`, and it turns out the preprocessing function is simply `def preprocess_input(x, **kwargs): return x`. In other words, `preprocess_input = sm.get_preprocessing(BACKBONE)` does nothing at all. I am not sure whether that is really the case, but here is my question: if it does work as I described, it would be very helpful to document the preprocessing functions that were used when pretraining the networks on ImageNet. In particular, it is important to know the expected input scaling ([0, 255], [-1, 1], or something else). Hope you can help me. Your work is excellent.
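The traced behavior can be reproduced as a minimal sketch. The `preprocess_input` below is a copy of what the issue reports finding in `classification_models/models/resnet.py`, not the library itself:

```python
import numpy as np

# What the issue reports for resnet34: the registered "preprocessing"
# is the identity function.
def preprocess_input(x, **kwargs):
    return x

x = np.random.randint(0, 256, size=(2, 32, 32, 3)).astype('float32')
y = preprocess_input(x)
assert y is x  # no scaling, no mean subtraction: the input passes through untouched
```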

JordanMakesMaps commented 4 years ago

It's because ResNet (the original backbone that was trained on ImageNet) didn't have any pre-processing/scaling. So in this case, `preprocess_input` for `backbone='resnet34'` really doesn't do anything. But for other backbones the pre-processing/scaling might map the input to [0, 255], [-1, 1], [0, 1], or subtract the mean pixel value.

It is important to pre-process the images if you're using ImageNet weights with the backbone of your choice. For example, if you use VGG with ImageNet weights, it expects a particular range of values in the images it is given. If you suddenly provide it with a different range of values, it's not going to do well at all. I believe that if you're training a backbone from scratch, you can use whatever pre-processing you want. But again, if you are using ImageNet weights, you should do what the original author of the code/architecture did to get the best results.
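The scaling conventions mentioned above can be made concrete with a small NumPy sketch. It mirrors the three standard modes of `keras_applications.imagenet_utils.preprocess_input`; the mean/std constants are the usual ImageNet values:

```python
import numpy as np

def preprocess(x, mode):
    """Sketch of the three common ImageNet preprocessing conventions."""
    x = x.astype('float32')
    if mode == 'tf':            # scale [0, 255] down to [-1, 1]
        return x / 127.5 - 1.0
    if mode == 'torch':         # scale to [0, 1], then per-channel normalize
        x = x / 255.0
        mean = np.array([0.485, 0.456, 0.406])
        std = np.array([0.229, 0.224, 0.225])
        return (x - mean) / std
    if mode == 'caffe':         # RGB -> BGR, subtract the ImageNet BGR means
        x = x[..., ::-1]
        return x - np.array([103.939, 116.779, 123.68])
    raise ValueError(mode)

white = np.full((1, 2, 2, 3), 255.0)
assert np.allclose(preprocess(white, 'tf'), 1.0)  # [-1, 1] range, white maps to 1
```

A model whose frozen weights were trained under one of these conventions will see out-of-distribution inputs if fed images scaled under another.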

LeeJayHui commented 4 years ago

> It's because ResNet (the original backbone that was trained on ImageNet) didn't have any pre-processing/scaling. So in this case, `preprocess_input` for `backbone='resnet34'` really doesn't do anything. But for other backbones the pre-processing/scaling might map the input to [0, 255], [-1, 1], [0, 1], or subtract the mean pixel value.
>
> It is important to pre-process the images if you're using ImageNet weights with the backbone of your choice. [...]

Thank you. Now I have a better understanding of what I should do. Your answer is great.

rdutta1999 commented 4 years ago

I noticed that, for a ResNet backbone, `sm.get_preprocessing("resnet34")` returns the pre-processing function defined in `classification_models/models/resnet.py`, and as @JordanMakesMaps said, that function simply returns `x`, because ResNet didn't use any scaling while training on the ImageNet dataset. This, however, is not the case with a VGG16/19 backbone: `sm.get_preprocessing("vgg16")` uses `keras_applications.imagenet_utils.preprocess_input()` with `mode='caffe'`. This made me wonder why the ResNet-based backbone doesn't use the ResNet pre-processing function that ships with Keras, the way it does for VGG16/19. Another thing I noticed is that, when creating a ResNet50 model via `keras.applications`, Keras uses its own preprocessing function, which also calls the same `imagenet_utils.preprocess_input()` but with `mode='tf'`.
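The mismatch described here is easy to demonstrate numerically. The two formulas below are minimal stand-ins for the `'caffe'` and `'tf'` modes of `imagenet_utils.preprocess_input`, applied to the same image:

```python
import numpy as np

# The same mid-gray image preprocessed under two different conventions.
img = np.full((1, 2, 2, 3), 128.0)

caffe = img[..., ::-1] - np.array([103.939, 116.779, 123.68])  # mean-centred BGR
tf_mode = img / 127.5 - 1.0                                    # scaled to [-1, 1]

# The resulting tensors are far apart, so a frozen encoder trained under one
# convention receives out-of-distribution inputs under the other.
assert not np.allclose(caffe, tf_mode)
```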

As a result, the ResNet models in the Segmentation Models package and in the Keras package use different pre-processing.

OutSorcerer commented 4 years ago

Moreover, `Unet` in segmentation_models never calls `get_preprocessing`; it only calls `get_backbone`. This seems to be a bug, because with a frozen encoder that requires non-trivial preprocessing, the Unet will produce wrong encodings.

qubvel commented 4 years ago

Calling `get_preprocessing` and applying the `preprocessing_fn` to your data is your responsibility (see examples)