jwyang / faster-rcnn.pytorch

A faster pytorch implementation of faster r-cnn
MIT License
7.66k stars 2.33k forks

pytorch pre-trained model, help! #556

Open xcwfawqdyz opened 5 years ago

xcwfawqdyz commented 5 years ago

I want to use a pre-trained model from PyTorch to train a Faster R-CNN, and I see:

if you want to use pytorch pre-trained models, please remember to transpose images from BGR to RGB, and also use the same data transformer (minus mean and normalize) as used in pretrained model

Could anyone explain in detail how to use a PyTorch pretrained model, please? Thanks sincerely in advance.

EMCP commented 5 years ago

I also wonder about BGR vs RGB... and how would I calculate "real" means.. I see all the models are using a standard mean so far, which feels incorrect.

AlexanderHustinx commented 5 years ago

I don't have any experience with using pretrained models from the PyTorch model zoo. But that information is available online, e.g. here: https://www.kaggle.com/pvlima/use-pretrained-pytorch-models

I also wonder about BGR vs RGB...

Regarding BGR vs. RGB, this is simply a matter of which channel order the model you load was trained with. Caffe, OpenCV and some other tools commonly use BGR, while PyTorch pretrained models expect RGB.
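To illustrate the channel-order point, here is a minimal sketch assuming the image is an H x W x 3 NumPy array in BGR order (as OpenCV would load it). The helper name `bgr_to_rgb` is just for illustration and is not part of this repo.

```python
import numpy as np

def bgr_to_rgb(image_bgr):
    """Reverse the channel axis of an H x W x 3 image (BGR -> RGB)."""
    # Slicing with ::-1 gives a reversed view; copy() makes it contiguous.
    return image_bgr[:, :, ::-1].copy()

# A 1 x 2 test image in BGR order: one pure-blue and one pure-red pixel.
image_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
image_rgb = bgr_to_rgb(image_bgr)
```

After the conversion the blue pixel has its 255 in the last channel and the red pixel in the first, which is the order RGB-trained weights expect.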

and how would I calculate "real" means..

When you train your model, it helps to make sure the input data is zero-centered (zero mean). This tends to give better-behaved gradients and faster learning/convergence. I'm not 100% sure, but I believe that is the "real mean" you are referring to. One way to calculate it is to add up all pixel values per color channel and divide by the number of pixels per color channel over the whole dataset. If your model expects inputs in the range [0,1], the value is then scaled to that range (for 8-bit images, by dividing by 255).
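As a rough sketch of that calculation, assuming the dataset is an iterable of H x W x 3 uint8 NumPy arrays (the function name `channel_means` is made up for this example):

```python
import numpy as np

def channel_means(images):
    """Per-channel mean over a dataset of H x W x 3 uint8 images, scaled to [0, 1]."""
    channel_sum = np.zeros(3, dtype=np.float64)
    pixel_count = 0
    for image in images:
        # Sum every pixel of this image separately for each channel.
        channel_sum += image.reshape(-1, 3).sum(axis=0, dtype=np.float64)
        pixel_count += image.shape[0] * image.shape[1]
    # Divide by the pixel count, then by 255 to move from [0, 255] to [0, 1].
    return channel_sum / pixel_count / 255.0

# Toy dataset: one all-white and one all-black 2 x 2 image.
images = [
    np.full((2, 2, 3), 255, dtype=np.uint8),
    np.zeros((2, 2, 3), dtype=np.uint8),
]
means = channel_means(images)
```

Accumulating sums and pixel counts, rather than averaging per-image means, keeps the result correct when the images have different sizes.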

Sometimes you will also see the input divided by the dataset's standard deviation after the mean is subtracted. This can further improve the model's learning performance, though it is said that this isn't strictly necessary when using the right non-linearities.
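For the PyTorch pretrained models specifically, a minimal sketch of the mean/std step could look like the following. The mean and std values are the ones torchvision documents for its ImageNet-pretrained models (RGB order, inputs scaled to [0, 1]); the `normalize` helper itself is hypothetical, and I haven't checked how it would plug into this repo's data loader.

```python
import numpy as np

# Values documented by torchvision for its ImageNet-pretrained models
# (RGB channel order, pixel values already scaled to [0, 1]).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def normalize(image_rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Scale an H x W x 3 uint8 RGB image to [0, 1], then subtract mean and divide by std."""
    scaled = image_rgb.astype(np.float64) / 255.0
    # Broadcasting applies the per-channel statistics along the last axis.
    return (scaled - mean) / std

# Toy input: an all-black 2 x 2 RGB image.
image_rgb = np.zeros((2, 2, 3), dtype=np.uint8)
normalized = normalize(image_rgb)
```

If you compute your own dataset statistics, you could pass them in through the `mean` and `std` arguments instead of the ImageNet defaults.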

For more info on the subject, I'll refer you to this page of Stanford's course on Convolutional Neural Nets, which covers input normalization: http://cs231n.github.io/neural-networks-2/