Closed — ZzBros closed this issue 3 years ago
Yes, all of this is true! Note that config/cifar10/exp_cifar10_wrn2810_1net_standard_bar1.yaml is a configuration file for a standard baseline network without mixing augmentation or ensembling, so it should not be called "mixmo" but rather "vanilla". In detail, the critical code sections for the different points are:
vanilla uses random sampling (each sample is seen exactly once per epoch) https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/loaders/abstract_loader.py#L62
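As a rough sketch of what "each sample once per epoch" means (a hypothetical helper, not the repo's actual loader — see the linked line for the real implementation): the indices are reshuffled every epoch and then chunked into batches, so the epoch covers the dataset exactly once.

```python
import random

def epoch_batches(dataset_size, batch_size, seed):
    """Shuffle indices once per epoch, then chunk into batches,
    so every sample index appears exactly once in the epoch."""
    rng = random.Random(seed)
    idx = list(range(dataset_size))
    rng.shuffle(idx)
    return [idx[i:i + batch_size] for i in range(0, dataset_size, batch_size)]
```

With `dataset_size=100` and `batch_size=16` this yields 7 batches whose concatenation is a permutation of 0..99.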
vanilla uses MSDADataset with msda_mix_method == None, which is strictly equivalent to DADataset https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/loaders/abstract_loader.py#L47
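To illustrate why `msda_mix_method == None` collapses MSDA back to plain data augmentation, here is a hypothetical mixing helper (names and the mixup branch are my own illustration, not the repo's code): with no method set, the first sample passes through untouched.

```python
def maybe_mix(x1, x2, lam, method=None):
    """Mixed-sample augmentation switch: with method=None the input is
    returned unchanged, which is equivalent to no MSDA at all."""
    if method is None:
        # no mixed-sample augmentation: x2 and lam are ignored
        return x1
    if method == "mixup":
        # convex combination of the two samples, element-wise
        return [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    raise ValueError(f"unknown mix method: {method}")
```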
vanilla uses a Wide ResNet-28-10 https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/networks/wrn.py#L54
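For reference, the WRN-d-k naming follows the standard Wide ResNet parameterization: depth d = 6n + 4, so WRN-28-10 has n = 4 residual blocks per group and channel widths widened by k = 10. A small sketch (my own helper, not from wrn.py):

```python
def wrn_config(depth, widen):
    """WRN-d-k: depth d = 6n + 4 gives n residual blocks per group,
    with channel widths [16, 16k, 32k, 64k] across the stem and 3 groups."""
    assert (depth - 4) % 6 == 0, "WRN depth must satisfy d = 6n + 4"
    n = (depth - 4) // 6
    widths = [16, 16 * widen, 32 * widen, 64 * widen]
    return n, widths
```

So `wrn_config(28, 10)` gives 4 blocks per group and widths [16, 160, 320, 640].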
vanilla warms up the learning rate during the first epoch (= 782 steps) and then multiplies it by 0.1 at epochs [101, 201, 226] https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/config/cifar10/exp_cifar10_wrn2810_1net_standard_bar1.yaml#L69 https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/learners/model_wrapper.py#L80
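A minimal sketch of that schedule, assuming a linear warmup over the first epoch's 782 steps followed by step decay (function name and warmup shape are my illustration; the exact behavior lives in model_wrapper.py and the yaml):

```python
def learning_rate(base_lr, epoch, step_in_epoch, steps_per_epoch=782,
                  milestones=(101, 201, 226), gamma=0.1):
    """Linear warmup during epoch 0, then multiply by gamma at each milestone."""
    if epoch == 0:
        # ramp linearly from ~0 up to base_lr over the first epoch
        return base_lr * (step_in_epoch + 1) / steps_per_epoch
    # apply gamma once for every milestone already passed
    factor = gamma ** sum(epoch >= m for m in milestones)
    return base_lr * factor
```

For example, with base_lr = 0.1 the rate is 0.1 between epochs 1 and 100, 0.01 from epoch 101, 0.001 from 201, and 0.0001 from 226.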
vanilla applies L2 regularization to the network parameters https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/config/cifar10/exp_cifar10_wrn2810_1net_standard_bar1.yaml#L52 https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/core/loss.py#L176
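Conceptually, L2 regularization adds a penalty proportional to the sum of squared parameter values to the loss; the actual coefficient comes from the yaml (the value below is a placeholder, and this helper is my sketch, not loss.py):

```python
def l2_penalty(params, weight_decay=3e-4):
    """Sum of squared parameter values, scaled by the decay coefficient.
    params is an iterable of flat parameter lists; weight_decay is a
    placeholder value -- the real one is set in the config yaml."""
    return weight_decay * sum(w * w for p in params for w in p)
```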
vanilla uses the specified initialization https://github.com/alexrame/mixmo-pytorch/blob/5a2a1a090b7805212aaad8ebc36ef11ade75e8d2/mixmo/networks/resnet.py#L167
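For intuition, initializations of this family typically draw weights from a zero-mean Gaussian whose scale depends on the layer's fan-in, e.g. the He/Kaiming scheme std = sqrt(2 / fan_in) commonly used with ReLU networks. The exact scheme used here is the one at the linked line in resnet.py; the sketch below is only an illustration:

```python
import math
import random

def he_normal(fan_in, n):
    """Draw n weights from N(0, sqrt(2 / fan_in)) -- the He/Kaiming
    scheme for ReLU networks (illustrative; see resnet.py for the
    repo's actual initialization)."""
    std = math.sqrt(2.0 / fan_in)
    return [random.gauss(0.0, std) for _ in range(n)]
```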
If you have any questions, please let me know. Best regards
While training the baseline network (using exp_cifar10_wrn2810_1net_standard_bar1.yaml), I want to check whether I understand the following correctly.
I look forward to your comments if I have missed anything! Thanks!