RGB order and pixel mean/std?

tonysy commented 3 years ago

Hi, thanks for your work. I have a question about this project: do we need to change the RGB order, PIXEL_MEAN, and PIXEL_STD in the configuration to stay consistent with the original SwinTransformer?

xiaohu2015 commented 3 years ago

@tonysy The SwinT models also use the ImageNet mean and std for normalization, which is the default config of detectron2.

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)

In detectron2, the channel order is defined in _C.INPUT.FORMAT, so that should not be a problem. For the std, however, I use the default [1.0, 1.0, 1.0]. The right way is to use [57.375, 57.120, 58.395] (BGR order), but the results do not seem to be severely affected.

# Values to be used for image normalization (BGR order, since INPUT.FORMAT defaults to BGR).
# To train on images of different number of channels, just set different mean & std.
# Default values are the mean pixel value from ImageNet: [103.53, 116.28, 123.675]
_C.MODEL.PIXEL_MEAN = [103.530, 116.280, 123.675]
# When using pre-trained models in Detectron1 or any MSRA models,
# std has been absorbed into its conv1 weights, so the std needs to be set 1.
# Otherwise, you can use [57.375, 57.120, 58.395] (ImageNet std)
_C.MODEL.PIXEL_STD = [1.0, 1.0, 1.0]

# Whether the model needs RGB, YUV, HSV etc.
# Should be one of the modes defined here, as we use PIL to read the image:
# https://pillow.readthedocs.io/en/stable/handbook/concepts.html#concept-modes
# with BGR being the one exception. One can set image format to BGR, we will
# internally use RGB for conversion and flip the channels over
_C.INPUT.FORMAT = "BGR"
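
To make the difference between the two std settings concrete, here is a minimal sketch of the per-channel normalization, (pixel - mean) / std, using only the values quoted above. The `normalize` helper and the sample pixel are hypothetical and purely illustrative; they are not part of detectron2 or of this repository.

```python
# Values quoted from the detectron2 defaults above (BGR order).
BGR_MEAN = [103.530, 116.280, 123.675]
UNIT_STD = [1.0, 1.0, 1.0]                    # detectron2 default PIXEL_STD
IMAGENET_STD_BGR = [57.375, 57.120, 58.395]   # ImageNet std, BGR order


def normalize(pixel, mean, std):
    """Hypothetical helper: per-channel (pixel - mean) / std."""
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]


# Illustrative BGR pixel lying exactly one ImageNet std above the mean.
pixel_bgr = [m + s for m, s in zip(BGR_MEAN, IMAGENET_STD_BGR)]

# With std = 1 the network input stays on the raw 0-255 scale (about 57-58 here).
with_unit_std = normalize(pixel_bgr, BGR_MEAN, UNIT_STD)

# With the ImageNet std the same pixel is scaled to roughly 1.0 per channel.
with_imagenet_std = normalize(pixel_bgr, BGR_MEAN, IMAGENET_STD_BGR)

print(with_unit_std)
print(with_imagenet_std)
```

The two settings differ only by a fixed per-channel scale of the input, so a backbone pretrained with the ImageNet std sees inputs about 57 times larger when PIXEL_STD is left at 1.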

tonysy commented 3 years ago

Thanks for your reply. I think it would be better to change INPUT.FORMAT to 'RGB' and reorder PIXEL_MEAN and PIXEL_STD correspondingly.
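
As a rough sketch of this suggestion, the RGB statistics are just the BGR ones reversed. The `rgb_overrides` dict below is only an illustration of the three values one would set; its keys mirror the detectron2 config names quoted earlier, and how they are applied (for example by assigning to the matching fields of the config object) is left as an assumption.

```python
# ImageNet statistics in BGR order, as quoted from the detectron2 defaults.
BGR_MEAN = [103.530, 116.280, 123.675]
BGR_STD = [57.375, 57.120, 58.395]

# Reversing the channel order gives the RGB statistics, which match the
# img_norm_cfg values (mean=[123.675, 116.28, 103.53], ...) quoted earlier.
RGB_MEAN = BGR_MEAN[::-1]
RGB_STD = BGR_STD[::-1]

# Hypothetical overrides; the keys mirror the detectron2 config names.
rgb_overrides = {
    "INPUT.FORMAT": "RGB",
    "MODEL.PIXEL_MEAN": RGB_MEAN,
    "MODEL.PIXEL_STD": RGB_STD,
}


def normalize(pixel, mean, std):
    """Hypothetical helper: per-channel (pixel - mean) / std."""
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]


# Sanity check on an illustrative pixel: normalizing in RGB with RGB stats
# gives the same per-channel values as BGR with BGR stats, just reordered.
pixel_rgb = [200.0, 100.0, 50.0]
pixel_bgr = pixel_rgb[::-1]
out_rgb = normalize(pixel_rgb, RGB_MEAN, RGB_STD)
out_bgr = normalize(pixel_bgr, BGR_MEAN, BGR_STD)

print(rgb_overrides)
print(out_rgb == out_bgr[::-1])
```

The point of changing all three together is that the format and the statistics must agree: switching INPUT.FORMAT alone would apply the blue-channel mean and std to the red channel, and vice versa.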

xiaohu2015 commented 3 years ago

@tonysy Yes, you are right. Thanks.

xiaohu2015 / SwinT_detectron2

RGB order and pixel mean/std? #1