What's the input format of the fasterrcnn_resnet50_fpn? I mean RGB or BGR.

pytorch>=1.1

I notice that both RGB and BGR inputs of shape [n, c, h, w] give good results (the BGR score is slightly higher).

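import numpy as np
from PIL import Image
from torchvision import transforms
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

## RGB (convert() guards against grayscale or RGBA files)
img1 = Image.open('image1.jpg').convert('RGB')
## BGR: reorder the channel axis of the H x W x C array
img2 = np.array(img1)[:, :, [2, 1, 0]].copy()

## ToTensor() gives [c, h, w] float tensors in [0, 1]; the model takes a list of them
x1 = [transforms.ToTensor()(img1)]
x2 = [transforms.ToTensor()(img2)]

predictions1 = model(x1)
predictions2 = model(x2)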

It seems that predictions2 is better. So, should I use the BGR format for fine-tuning and evaluation? I can't find this information in the code; I only know the input size is [n,c,h,w]. The config of Facebook's detectron2 says:

# Values to be used for image normalization (BGR order).
# To train on images of different number of channels, just set different mean & std.
# Default values are the mean pixel value from ImageNet: [103.53, 116.28, 123.675]
_C.MODEL.PIXEL_MEAN = [103.530, 116.280, 123.675]

So is BGR the one we should choose?
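One quick sanity check on the detectron2 numbers is to reverse their channel order, rescale them from [0, 255] to [0, 1], and compare the result with the commonly quoted ImageNet RGB means. The sketch below is only that arithmetic. The reference values `IMAGENET_MEAN_RGB` and the helper `bgr_255_to_rgb_unit` are assumptions introduced for illustration and do not come from the detectron2 config or from this post.

```python
# detectron2 default from the config excerpt above, documented as BGR order.
PIXEL_MEAN_BGR = [103.530, 116.280, 123.675]

# Assumption (not from the post): the commonly quoted ImageNet channel means
# for RGB images scaled to [0, 1].
IMAGENET_MEAN_RGB = [0.485, 0.456, 0.406]


def bgr_255_to_rgb_unit(mean_bgr):
    """Reverse the channel order and rescale from [0, 255] to [0, 1]."""
    return [value / 255.0 for value in reversed(mean_bgr)]


converted = bgr_255_to_rgb_unit(PIXEL_MEAN_BGR)

# True when every converted channel is within 1e-3 of the reference mean.
matches = all(abs(a - b) < 1e-3 for a, b in zip(converted, IMAGENET_MEAN_RGB))
print(converted, matches)
```

If the values line up, the detectron2 config describes the same per-channel statistics in a different channel order and scale, so by itself it may not say which order a torchvision model expects.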

pytorch / vision

What's the input format of the fasterrcnn_resnet50_fpn? I mean RGB or BGR. #1608