Some problems about baseline - Githubissues

alexrame / mixmo-pytorch

Official Pytorch implementation of MixMo framework

Some problems about baseline #5

Closed ZzBros closed 3 years ago

ZzBros commented 3 years ago

While training the baseline network (using exp_cifar10_wrn2810_1net_standard_bar1.yaml), I would like to check whether I understand the following correctly:

mixmo uses random sampling (every sample is seen once per epoch)
mixmo uses DADataset
mixmo uses Wide ResNet-28-10
mixmo uses lr warmup during the first epoch, then reduces the lr to 0.1 * lr at epochs [101, 201, 226]
mixmo applies L2 regularization to the network parameters
mixmo uses the specified initialization

I look forward to your comments if I have missed anything. Thanks!

alexrame commented 3 years ago

Yes, all of this is true! Note that config/cifar10/exp_cifar10_wrn2810_1net_standard_bar1.yaml is a configuration file for a standard baseline network with neither data augmentation nor ensembling, so it should be called "vanilla" rather than "mixmo". In detail, the critical code sections for the different points are:

vanilla uses random sampling (every sample is seen once per epoch) https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/loaders/abstract_loader.py#L62
vanilla uses MSDADataset with msda_mix_method == None, which is strictly equivalent to DADataset https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/loaders/abstract_loader.py#L47
vanilla uses Wide ResNet-28-10 https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/networks/wrn.py#L54
vanilla uses lr warmup during the first epoch (= 782 steps), then reduces the lr to 0.1 * lr at epochs [101, 201, 226] https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/config/cifar10/exp_cifar10_wrn2810_1net_standard_bar1.yaml#L69 https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/learners/model_wrapper.py#L80
vanilla applies L2 regularization to the network parameters https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/config/cifar10/exp_cifar10_wrn2810_1net_standard_bar1.yaml#L52 https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/core/loss.py#L176
vanilla uses the specified initialization https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/networks/resnet.py#L167
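
To make the learning-rate point concrete, here is a minimal sketch of a schedule with a one-epoch warmup followed by step decay. It is not the repository's code (the linked model_wrapper.py and YAML file are the reference). The batch size of 64, the base lr of 0.1, the linear warmup shape, the cumulative 0.1 decay at each milestone, and the 0-indexed epoch convention are all assumptions for illustration. The batch size was picked only because it reproduces the 782 steps per epoch quoted above for the 50,000 CIFAR-10 training images.

```python
import math

# All constants are illustrative assumptions, not values read from the repo config.
NUM_TRAIN_SAMPLES = 50_000    # size of the CIFAR-10 training set
BATCH_SIZE = 64               # assumed; ceil(50000 / 64) == 782
BASE_LR = 0.1                 # hypothetical base learning rate
MILESTONES = (101, 201, 226)  # decay epochs quoted in the thread
GAMMA = 0.1                   # assumed multiplicative decay per milestone

STEPS_PER_EPOCH = math.ceil(NUM_TRAIN_SAMPLES / BATCH_SIZE)
WARMUP_STEPS = STEPS_PER_EPOCH  # warmup covers exactly the first epoch


def lr_at_step(step: int) -> float:
    """Return the learning rate at a 0-indexed global training step."""
    if step < WARMUP_STEPS:
        # Linear ramp that reaches BASE_LR on the last warmup step (assumed shape).
        return BASE_LR * (step + 1) / WARMUP_STEPS
    epoch = step // STEPS_PER_EPOCH
    # Apply one decay for every milestone epoch already reached.
    num_decays = sum(1 for milestone in MILESTONES if epoch >= milestone)
    return BASE_LR * GAMMA ** num_decays


if __name__ == "__main__":
    print("steps per epoch:", STEPS_PER_EPOCH)
    for epoch in (0, 1, 100, 101, 201, 226):
        first_step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d}, first step: lr = {lr_at_step(first_step):.6f}")
```

If the decay is meant to be applied once rather than cumulatively, or if the milestones are 1-indexed, only the `num_decays` line needs to change.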

If you have any questions, please let me know. Best regards