valencebond closed this issue 4 years ago
Actually, the two fully connected layers could be merged into one for simplicity.
Please refer to the formula below:
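A minimal sketch of why the merge works (my assumption of the intended formula, with hypothetical shapes): summing two linear layers, `W_a @ f_a + W_b @ f_b`, is identical to applying one linear layer with weight `[W_a, W_b]` to the concatenated feature `[f_a; f_b]`.

```python
import torch

torch.manual_seed(0)
feature_a = torch.randn(4, 8)  # hypothetical branch-a features (batch=4, dim=8)
feature_b = torch.randn(4, 8)  # hypothetical branch-b features

fc_a = torch.nn.Linear(8, 5, bias=False)  # classifier for branch a
fc_b = torch.nn.Linear(8, 5, bias=False)  # classifier for branch b

# Merged classifier: concatenate the two weight matrices along dim=1.
fc_merged = torch.nn.Linear(16, 5, bias=False)
with torch.no_grad():
    fc_merged.weight.copy_(torch.cat((fc_a.weight, fc_b.weight), dim=1))

out_sum = fc_a(feature_a) + fc_b(feature_b)
out_cat = fc_merged(torch.cat((feature_a, feature_b), dim=1))
assert torch.allclose(out_sum, out_cat, atol=1e-5)
```

So concatenating features and using a single classifier is just a reparameterization of summing two separate classifier outputs.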
Thanks! Does that mean manifold mixup by concatenation is better than the original sum? Also, why is the scale factor 2 needed?
mixed_feature = 2 * torch.cat((l * feature_a, (1 - l) * feature_b), dim=1)
The scale factor 2 keeps the gradients consistent with the default combiner (e.g., when the two samplers sample the same picture and l = 0.5).
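A hedged sketch of this consistency argument: when the two samplers draw the same image (so feature_a == feature_b) and l = 0.5, the scaled mix reduces to the plain concat of the unmixed features, so the forward pass (and hence the gradient scale) matches what the classifier sees without mixup.

```python
import torch

torch.manual_seed(0)
f = torch.randn(4, 8)           # hypothetical feature for the shared image
feature_a, feature_b = f, f.clone()
l = 0.5

# Mixup-by-concat with the scale factor 2.
mixed = 2 * torch.cat((l * feature_a, (1 - l) * feature_b), dim=1)

# What the default (un-mixed) combiner would feed the classifier.
default = torch.cat((feature_a, feature_b), dim=1)

# At l = 0.5 with identical features, the two are equal,
# so the magnitudes (and gradients) are unchanged by mixing.
assert torch.allclose(mixed, default)
```

Without the factor of 2, the mixed feature would be half the usual magnitude in this case, shrinking the gradients relative to the default combiner.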
Hi @valencebond @ZhouBoyan, is this equivalent to the method mentioned in the paper, or does the concat method perform better than the original sum?
Thanks
The corresponding code is:
According to the code, the objective introduced in Section 4.3 may not be achievable, since the features are concatenated and followed by only one classifier.
Would you mind explaining the reason behind this change?