kenshohara / 3D-ResNets-PyTorch

3D ResNets for Action Recognition (CVPR 2018)
MIT License

Why use ActivityNet's mean to normalize? #123

Open tanxjtu opened 5 years ago

tanxjtu commented 5 years ago

Question 1: we need to train the Kinetics model first, so why use the ActivityNet dataset's mean for normalization? In opts.py, the default for mean_dataset is 'activitynet', so opt.mean becomes [114.7, 107.7, 99.4]. Is that right, or have I misunderstood the code?

Question 2: the normalization function is

    for t, m, s in zip(tensor, self.mean, self.std):
        t.sub_(m).div_(s)

where the mean is one of [114.7, 107.7, 99.4] (ActivityNet) or [110.6, 103.1, 96.2] (Kinetics) and the std is [1, 1, 1]. Is my understanding of the function right? (I have no GPU to run the code at the moment.)
Thanks!
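
For question 2: as described, with std = [1, 1, 1] the per-channel (x - mean) / std step reduces to plain mean subtraction. A minimal runnable sketch of the quoted loop (the tensor shape and dummy data are assumptions for illustration):

    import torch

    # Per-channel normalization as quoted above: (x - mean) / std, in place.
    # With std = [1, 1, 1] this is just mean subtraction.
    mean = [114.7, 107.7, 99.4]  # ActivityNet mean on the 0-255 scale
    std = [1.0, 1.0, 1.0]

    clip = torch.rand(3, 16, 112, 112) * 255  # dummy (C, T, H, W) clip in [0, 255]
    for t, m, s in zip(clip, mean, std):  # iterates over the channel dimension
        t.sub_(m).div_(s)  # in-place: subtract mean, divide by std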

cmhzc commented 3 years ago

I'm really confused by the answers kenshohara provided in #12 and #7.

guilhermesurek commented 3 years ago

Maybe those comments are outdated. In the recent update (April 2020), kenshohara made new pretrained weights available and recommended simply using the given flags. He did not mention any mean or normalization, so I assume the default is the one to use. In the opts.py file:

    parser.add_argument('--mean_dataset',
                        default='kinetics',
                        type=str,
                        help=('dataset for mean values of mean subtraction'
                              '(activitynet | kinetics | 0.5)'))
    parser.add_argument('--no_mean_norm',
                        action='store_true',
                        help='If true, inputs are not normalized by mean.')
    parser.add_argument(
        '--no_std_norm',
        action='store_true',
        help='If true, inputs are not normalized by standard deviation.')
    parser.add_argument(
        '--value_scale',
        default=1,
        type=int,
        help=
        'If 1, range of inputs is [0-1]. If 255, range of inputs is [0-255].')

The Kinetics mean is the default, with value_scale = 1. Then in the main.py get_opt() function:

opt.mean, opt.std = get_mean_std(opt.value_scale, dataset=opt.mean_dataset)

And in the mean.py file:

def get_mean_std(value_scale, dataset):
    assert dataset in ['activitynet', 'kinetics', '0.5']

    if dataset == 'activitynet':
        mean = [0.4477, 0.4209, 0.3906]
        std = [0.2767, 0.2695, 0.2714]
    elif dataset == 'kinetics':
        mean = [0.4345, 0.4051, 0.3775]
        std = [0.2768, 0.2713, 0.2737]
    elif dataset == '0.5':
        mean = [0.5, 0.5, 0.5]
        std = [0.5, 0.5, 0.5]

    mean = [x * value_scale for x in mean]
    std = [x * value_scale for x in std]

    return mean, std
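
As a quick sanity check (a usage sketch, not code from the repository): the defaults give the 0-1 scale Kinetics statistics, and value_scale=255 recovers roughly the 0-255 values quoted in the original question:

    # Assuming get_mean_std from the mean.py listing above is importable.
    mean, std = get_mean_std(value_scale=1, dataset='kinetics')
    print(mean)  # [0.4345, 0.4051, 0.3775]
    print(std)   # [0.2768, 0.2713, 0.2737]

    mean255, std255 = get_mean_std(value_scale=255, dataset='kinetics')
    print(mean255)  # approx. [110.8, 103.3, 96.3]
    print(std255)   # approx. [70.6, 69.2, 69.8]
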
cmhzc commented 3 years ago

Thanks, but actually I'm using the old pretrained weights with my own scripts, not the weights in the 2020 update.

According to their CVPR paper, they used ActivityNet's mean.
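
In that case the preprocessing amounts to mean subtraction only. A minimal sketch, assuming (C, T, H, W) float clips in the 0-255 range as in the old code:

    import torch

    # Old-weights preprocessing: subtract the ActivityNet mean (0-255 scale);
    # std = [1, 1, 1], so there is effectively no division step.
    ACTIVITYNET_MEAN = torch.tensor([114.7, 107.7, 99.4]).view(3, 1, 1, 1)

    def preprocess_old(clip):
        # clip: float tensor of shape (C, T, H, W) with values in [0, 255]
        return clip - ACTIVITYNET_MEAN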

sakh251 commented 2 years ago

Hi @guilhermesurek, thank you for pointing that out.

So how should we calculate the mean and std when the model is trained on two datasets? For example, there are pretrained models on Kinetics and Moments in Time. For inference, how should we normalize the new data? There are standard statistics for computing the mean and std of merged sets, treating them as independent sets. Thank you.
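
One standard way to do this (not something the repository itself provides) is to pool the per-dataset statistics weighted by sample count, treating the merged data as a mixture: the pooled mean is the weighted average of the means, and the pooled variance is the weighted average of (std^2 + mean^2) minus the squared pooled mean. A sketch where the sample counts and the Moments in Time statistics are made-up placeholders:

    import math

    def pool_mean_std(stats):
        # stats: list of (n_samples, per-channel means, per-channel stds).
        # Pooled mean: E[X] = sum(n_i * m_i) / N
        # Pooled var:  Var[X] = sum(n_i * (s_i**2 + m_i**2)) / N - E[X]**2
        total = sum(n for n, _, _ in stats)
        channels = len(stats[0][1])
        pooled_mean, pooled_std = [], []
        for c in range(channels):
            m = sum(n * means[c] for n, means, _ in stats) / total
            var = (sum(n * (stds[c] ** 2 + means[c] ** 2)
                       for n, _, stds in stats) / total) - m ** 2
            pooled_mean.append(m)
            pooled_std.append(math.sqrt(var))
        return pooled_mean, pooled_std

    # Kinetics stats are the ones from mean.py above; the counts and the
    # Moments in Time stats below are placeholders, not real values.
    kinetics = (240_000, [0.4345, 0.4051, 0.3775], [0.2768, 0.2713, 0.2737])
    moments = (800_000, [0.45, 0.42, 0.39], [0.28, 0.27, 0.27])
    mean, std = pool_mean_std([kinetics, moments])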

87003697 commented 2 years ago

This is an automated vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email in person. I will reply as soon as possible after the vacation ends.