facebookresearch / fairseq2

FAIR Sequence Modeling Toolkit 2
https://facebookresearch.github.io/fairseq2/
MIT License
678 stars 78 forks

Improve shuffle handling in datasets #573

Closed cbalioglu closed 3 months ago

cbalioglu commented 3 months ago

This PR separates shuffle_windows_size into example_shuffle_window and batch_shuffle_window for finer-grained control over how shuffling is performed in datasets.
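
To illustrate the distinction between the two windows, here is a minimal pure-Python sketch. It is not fairseq2's implementation or API: the helpers `window_shuffle`, `batched`, and `read_batches` are hypothetical names, and treating a window of 1 or less as "no shuffling" is an assumption made only for this sketch. The idea shown is that one window shuffles individual examples before batching, while the other shuffles whole batches after batching.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def window_shuffle(items: Iterable[T], window: int, rng: random.Random) -> Iterator[T]:
    """Yield items in a locally shuffled order using a buffer of `window` items."""
    if window <= 1:
        # Assumption for this sketch: a window of 1 or less disables shuffling.
        yield from items
        return

    buffer: List[T] = []
    for item in items:
        buffer.append(item)
        if len(buffer) >= window:
            # Swap a random element to the end and emit it.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()

    # Drain whatever is left in the buffer in random order.
    rng.shuffle(buffer)
    yield from buffer


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group consecutive items into lists of at most `batch_size` items."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def read_batches(
    examples: Iterable[T],
    batch_size: int,
    example_shuffle_window: int,
    batch_shuffle_window: int,
    seed: int = 0,
) -> Iterator[List[T]]:
    """Hypothetical reader: shuffle examples, batch them, then shuffle batches."""
    rng = random.Random(seed)
    shuffled_examples = window_shuffle(examples, example_shuffle_window, rng)
    batches = batched(shuffled_examples, batch_size)
    return window_shuffle(batches, batch_shuffle_window, rng)
```

With `example_shuffle_window=1` and a larger `batch_shuffle_window`, each batch keeps its original neighbouring examples and only the order of batches changes; setting both above 1 mixes examples across batches as well.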