google-research / big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Apache License 2.0

Mixup Per Example? #90

Open dibyaghosh opened 8 months ago

dibyaghosh commented 8 months ago

Hi! I was wondering why the mixup implementation samples a single coefficient $a$ per batch rather than a separate $a$ for each batch element. Intuitively, sharing one $a$ across the whole batch seems like it should lead to higher variance in the optimization process.

https://github.com/google-research/big_vision/blob/474dd2ebde37268db4ea44decef14c7c1f6a0258/big_vision/utils.py#L1086
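
To make the comparison concrete, here is a minimal JAX sketch of the two variants. It is not the big_vision code at the link above: the function names `mixup_per_batch` and `mixup_per_example`, the Beta(alpha, alpha) sampling, and the pairing of each example with its neighbour via `jnp.roll` are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp


def mixup_per_batch(rng, x, alpha=0.2):
    """Hypothetical variant: one coefficient shared by the whole batch."""
    a = jax.random.beta(rng, alpha, alpha)  # scalar, shape ()
    # Mix every example with its neighbour (batch rolled by one position).
    mixed = a * x + (1.0 - a) * jnp.roll(x, 1, axis=0)
    return mixed, a


def mixup_per_example(rng, x, alpha=0.2):
    """Hypothetical variant: an independent coefficient per batch element."""
    a = jax.random.beta(rng, alpha, alpha, shape=(x.shape[0],))  # shape (B,)
    # Reshape to (B, 1, ..., 1) so it broadcasts over the non-batch axes.
    a_b = a.reshape((-1,) + (1,) * (x.ndim - 1))
    mixed = a_b * x + (1.0 - a_b) * jnp.roll(x, 1, axis=0)
    return mixed, a
```

In either variant the labels would be mixed with the same coefficients as the inputs. The only difference is whether `a` is a scalar or has shape `(B,)`.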