Closed by bnels 5 years ago
This is happening because we create the iterator of RandomSequenceSampler (i.e. call the method RandomSequenceSampler.__iter__) after the model creation, at the first call of type next(dataloader). Hence, if the model definition is different, the RNG will be called a different number of times, and by the time we reach RandomSequenceSampler.__iter__ the output will look different.
The immediate mitigation is to make sure RandomSequenceSampler.__iter__ is called before model creation.
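To make the mechanism concrete, here is a minimal sketch using only Python's standard `random` module. `ToySampler` and `build_model` are hypothetical stand-ins (not the project's actual classes); the only assumption is that model creation and the sampler draw from the same global RNG, as described above.

```python
import random


def build_model(num_layers):
    # Stand-in for model creation: every "layer" draws its weights from the
    # global RNG, so a bigger model consumes more random numbers.
    return [[random.random() for _ in range(4)] for _ in range(num_layers)]


class ToySampler:
    """Hypothetical stand-in for RandomSequenceSampler: shuffles in __iter__."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        order = list(range(self.n))
        random.shuffle(order)  # the global RNG is consumed here, not in __init__
        return iter(order)


def order_lazy(num_layers, seed=0):
    # Problem case: the iterator is created after the model.
    random.seed(seed)
    sampler = ToySampler(16)
    build_model(num_layers)
    return list(iter(sampler))


def order_eager(num_layers, seed=0):
    # Mitigation: create the iterator first, then build the model.
    random.seed(seed)
    it = iter(ToySampler(16))
    build_model(num_layers)
    return list(it)
```

With the same seed, `order_lazy` gives the same order for the same model size but a different order once `num_layers` changes, while `order_eager` does not depend on the model at all.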
nice detective work!
An observation:
If I take the same model and run it twice with the same seed, I see the same events and the same training behavior.
If I take different models and run each with the same seed, I see different events.
This is likely due to the seed being set once, then
It seems like we may wish to have separate seeds for model creation and for training data ordering, so we can tell whether we're seeing the same events in the same order across models (or after changing the model definition).
i.e.
for the sampler
and
for model creation
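As a rough sketch of the separate-seed idea, the snippet below gives model creation and data ordering their own RNG objects. It uses only the standard library; `build_model`, `data_order`, `run`, and the parameter names `model_seed` and `sampler_seed` are hypothetical and are not taken from the project's code.

```python
import random


def build_model(num_layers, rng):
    # Stand-in for model creation; draws only from its own RNG.
    return [[rng.random() for _ in range(4)] for _ in range(num_layers)]


def data_order(n, rng):
    # Stand-in for the sampler; draws only from its own RNG.
    order = list(range(n))
    rng.shuffle(order)
    return order


def run(num_layers, model_seed=1, sampler_seed=2):
    # Two independent generators: changing the model (or its seed) can no
    # longer shift the sequence of numbers the sampler sees.
    model_rng = random.Random(model_seed)
    sampler_rng = random.Random(sampler_seed)
    model = build_model(num_layers, model_rng)
    return model, data_order(16, sampler_rng)
```

Here the data order depends only on `sampler_seed`, so two different model definitions see the same events in the same order. If the project is built on PyTorch (an assumption), the analogous tool would be a dedicated `torch.Generator` for the sampler.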