Closed WardLT closed 5 years ago
Note that this PR uses features from #3.
These are fantastic, thanks for the contribution! Should I merge this one and close #3?
No problem. Thanks for making this open source! 😄
Sure, merging this and closing #3 would work for me.
Interesting, looks like GitHub was smart enough to merge both :)
This PR adds support for ensuring that each replica has different training data during data-parallel training. This is accomplished by setting the same random seed for each `GraphSequence` and shifting the training data after shuffling.
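To illustrate the idea, here is a minimal sketch, not the code from this PR. It assumes every replica shuffles the same sample indices with a shared seed and then rotates the result by an offset based on its rank. The function `replica_order`, its parameters, and the choice of shift size (`n_samples // n_replicas` per rank) are hypothetical and are not part of the `GraphSequence` API.

```python
import random


def replica_order(n_samples, seed, replica_rank, n_replicas):
    """Return the sample order for one replica (hypothetical helper).

    All replicas use the same seed, so they produce the same shuffle.
    Each replica then rotates that shuffle by a rank-dependent offset,
    so at any given step the replicas read different samples.
    """
    rng = random.Random(seed)  # same seed on every replica
    order = list(range(n_samples))
    rng.shuffle(order)  # identical permutation everywhere

    # Assumed shift size: one equal-sized slice of the data per rank.
    shift = (n_samples // n_replicas) * replica_rank
    return order[shift:] + order[:shift]


if __name__ == "__main__":
    for rank in range(2):
        print(rank, replica_order(8, seed=0, replica_rank=rank, n_replicas=2))
```

Because the permutation is shared, the shifted orders never overlap at the same position as long as the shift is nonzero and smaller than the dataset, which would not be guaranteed if each replica used an independent seed.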