google-research / scenic

Scenic: A Jax Library for Computer Vision Research and Beyond
Apache License 2.0

TF Preprocessing / Dataloading randomness #1080

Open girishvn opened 2 weeks ago

girishvn commented 2 weeks ago

Hi all,

Thanks for the great repo! I've noticed that it uses TensorFlow for data preprocessing and data loading (tfds, etc.). Is there a reason the TensorFlow global random seed is never set (via tf.random.set_seed())?

Is there a way to ensure reproducible randomness with respect to shuffle order, augmentation, etc.? How is this handled when training is distributed across multiple devices (TPUs or GPUs)? Or, if this randomness is already reproducible, any help understanding how it is implemented would be greatly appreciated!
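To make the question concrete, here is a minimal sketch of the behaviour I'm after, written in plain Python rather than tf.data so it runs without any dependencies. The helper names (per_host_seed, shuffled_indices) are hypothetical and are not taken from Scenic. I'm assuming each host would derive its own seed from one base seed plus its host index, and that the shuffle order would depend only on that seed and the epoch number.

```python
import random


def per_host_seed(base_seed: int, host_index: int) -> int:
    """Derive a deterministic seed for one host (hypothetical helper)."""
    # Seeding a private generator with a string hashes it deterministically,
    # so the derived value does not change between runs or processes.
    return random.Random(f"{base_seed}-{host_index}").getrandbits(32)


def shuffled_indices(num_examples: int, seed: int, epoch: int) -> list:
    """Return a shuffle order that depends only on (seed, epoch)."""
    # A private Random instance avoids touching the global random state.
    rng = random.Random(f"{seed}-{epoch}")
    indices = list(range(num_examples))
    rng.shuffle(indices)
    return indices


if __name__ == "__main__":
    base_seed = 0  # assumed to come from the experiment config
    for host_index in range(2):
        seed = per_host_seed(base_seed, host_index)
        first = shuffled_indices(8, seed, epoch=0)
        second = shuffled_indices(8, seed, epoch=0)
        # The same (seed, epoch) pair must always give the same order.
        assert first == second
        print(host_index, first)
```

In tf.data terms, I would expect the analogous pieces to be tf.random.set_seed() together with the seed argument of Dataset.shuffle, but I don't know whether Scenic already does something equivalent elsewhere.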

Thanks!