asteroid-team / torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
MIT License

Don't interfere with global seed, using optional torch.Generator support #170

Closed. turian closed this issue 9 months ago

turian commented 9 months ago

torch-audiomentations should optionally support torch.Generator, instead of always using the global seed.

Libraries that use pseudo-randomness and always interfere with the global seed are an anti-pattern, because they impede reproducibility.

A key use-case is this: You have a validation set that is unrealistically "clean", but you want to add augmentations to simulate your test environment, which is more diverse. However, you want to introduce the augmentations deterministically, so the validation set is the same every time.

The only workaround now is to muck with your global seed (which could have unintended consequences, and is an easy footgun). The best solution is for torch-audiomentations to optionally draw its randomness independently of the global RNG state.
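
To illustrate the request, here is a minimal sketch of what generator-aware augmentation could look like. The function `add_gaussian_noise` and its `generator` parameter are hypothetical and are not part of the torch-audiomentations API; only `torch.Generator`, `torch.randn`, and `torch.get_rng_state` are real PyTorch calls. The sketch assumes CPU tensors.

```python
import torch


def add_gaussian_noise(samples, std=0.01, generator=None):
    """Hypothetical augmentation, NOT a torch-audiomentations function.

    If `generator` is given, noise is drawn from it; otherwise the
    global RNG is used, as happens today.
    """
    noise = torch.randn(samples.shape, generator=generator, dtype=samples.dtype)
    return samples + std * noise


# A "clean" validation batch: (batch, channels, samples).
clean = torch.zeros(1, 1, 16000)

# Snapshot the global CPU RNG state so we can check it is left alone.
global_state_before = torch.get_rng_state()

# A private generator with a fixed seed, independent of the global RNG.
gen = torch.Generator().manual_seed(1234)
first = add_gaussian_noise(clean, generator=gen)

# Re-seeding the private generator reproduces the same augmentation.
gen.manual_seed(1234)
second = add_gaussian_noise(clean, generator=gen)

same_output = torch.equal(first, second)
global_untouched = torch.equal(global_state_before, torch.get_rng_state())
```

With this design the validation set can be augmented identically on every run, while training code that relies on the global seed is unaffected.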

turian commented 9 months ago

I forgot that one can save the global RNG state, seed deterministically before augmentation, and then restore the saved global RNG state afterwards.
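
A rough sketch of that save, seed, restore workaround follows. The `augment` function is a hypothetical stand-in for any transform that draws from the global RNG, and `augment_deterministically` is an illustrative helper name, not something from the library. Note the assumption that only the CPU generator matters: `torch.get_rng_state` covers the CPU state only, so CUDA generators would need `torch.cuda.get_rng_state_all` and `torch.cuda.set_rng_state_all` as well.

```python
import torch


def augment(samples):
    """Hypothetical stand-in for a transform that uses the global RNG."""
    return samples + 0.01 * torch.randn_like(samples)


def augment_deterministically(samples, seed):
    """Run `augment` under a fixed seed, then restore the global CPU RNG."""
    saved_state = torch.get_rng_state()  # CPU generator state only
    try:
        torch.manual_seed(seed)
        return augment(samples)
    finally:
        # Restore even if the augmentation raises.
        torch.set_rng_state(saved_state)


clean = torch.zeros(1, 1, 16000)

state_before = torch.get_rng_state()
first = augment_deterministically(clean, seed=0)
second = augment_deterministically(clean, seed=0)

reproducible = torch.equal(first, second)
global_restored = torch.equal(state_before, torch.get_rng_state())
```

PyTorch also provides `torch.random.fork_rng()`, a context manager that saves the RNG state on entry and restores it on exit, which may be a tidier way to express the same pattern.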