DeepLearnPhysics / lartpc_mlreco3d


Seed Behavior #27

Closed bnels closed 5 years ago

bnels commented 5 years ago

An observation: if I take the same model and run it twice with the same seed, I see the same events and the same training behavior. If I take different models and run them with the same seed, I see different events.

This is likely because the seed is set once, and the same random number generator is then used for

  1. model creation
  2. determining event ordering
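
The effect described above can be reproduced with a minimal sketch that uses only Python's standard `random` module. The helpers `build_model`, `event_order`, and `run` are hypothetical stand-ins, not functions from lartpc_mlreco3d; the only assumption is that model creation and event ordering share one global generator.

```python
import random

def build_model(num_layers):
    # Stand-in for weight initialization: one draw from the global RNG per layer.
    return [random.random() for _ in range(num_layers)]

def event_order(num_events):
    # Stand-in for the sampler: shuffle event indices with the same global RNG.
    order = list(range(num_events))
    random.shuffle(order)
    return order

def run(num_layers, seed=0, num_events=100):
    random.seed(seed)          # the seed is set once
    build_model(num_layers)    # consumes a model-dependent number of draws
    return event_order(num_events)

same_model_twice = run(3) == run(3)   # same model, same seed
different_models = run(3) == run(5)   # different models, same seed

print(same_model_twice, different_models)
```

The first comparison matches because both runs consume the generator identically. The second differs because the larger stand-in model advances the generator further before the shuffle starts.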

It seems we may want separate seeds for model creation and for training data ordering, so we can tell whether we are seeing the same events in the same order across models (or after changing a model definition).

i.e.

sampler:
    seed: 0
    name: RandomSequenceSampler
    batch_size: 8

for the sampler

and

model:
    seed: 0

for model creation
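
A minimal sketch of how the two seeds could be consumed, assuming the config has been parsed into a dictionary with the layout proposed above. It uses two independent `random.Random` instances from the standard library; `build_model`, `event_order`, and `run` are hypothetical stand-ins, and the real code would presumably need the equivalent for the NumPy and PyTorch generators it actually uses.

```python
import random

# Parsed form of the proposed config (layout taken from the snippets above).
config = {
    "sampler": {"seed": 0, "name": "RandomSequenceSampler", "batch_size": 8},
    "model": {"seed": 0},
}

def build_model(num_layers, rng):
    # Stand-in for weight initialization, drawing only from the model RNG.
    return [rng.random() for _ in range(num_layers)]

def event_order(num_events, rng):
    # Stand-in for the sampler, drawing only from the sampler RNG.
    order = list(range(num_events))
    rng.shuffle(order)
    return order

def run(num_layers, cfg, num_events=100):
    model_rng = random.Random(cfg["model"]["seed"])
    sampler_rng = random.Random(cfg["sampler"]["seed"])
    model = build_model(num_layers, model_rng)
    return model, event_order(num_events, sampler_rng)

_, order_small = run(3, config)
_, order_large = run(5, config)
print(order_small == order_large)
```

Because each generator has its own state, the event order depends only on the sampler seed, regardless of how many draws model creation makes.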

Temigo commented 5 years ago

This happens because the iterator of RandomSequenceSampler is created (i.e. the method RandomSequenceSampler.__iter__ is called) after model creation, at the first call to next(dataloader). If the model definition differs, the RNG is called a different number of times during model creation, so by the time we reach RandomSequenceSampler.__iter__ the RNG is in a different state and the sampled event order differs.

The immediate mitigation is to make sure RandomSequenceSampler.__iter__ is called before model creation.
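
A sketch of that ordering, using the standard `random` module and a hypothetical `EagerRandomSampler` class. It assumes the sampler draws its entire order inside `__iter__`; this is an illustration of the idea, not the actual RandomSequenceSampler implementation.

```python
import random

class EagerRandomSampler:
    """Hypothetical sampler that draws its whole order inside __iter__."""

    def __init__(self, num_events):
        self.num_events = num_events

    def __iter__(self):
        order = list(range(self.num_events))
        random.shuffle(order)  # all sampler draws from the global RNG happen here
        return iter(order)

def build_model(num_layers):
    # Stand-in for weight initialization using the same global RNG.
    return [random.random() for _ in range(num_layers)]

def run(num_layers, seed=0, num_events=100):
    random.seed(seed)
    events = iter(EagerRandomSampler(num_events))  # fix the event order first
    build_model(num_layers)                        # later draws cannot change it
    return list(events)

orders_match = run(3) == run(5)
print(orders_match)
```

This only helps if `__iter__` does all of its random draws up front. A sampler written as a generator, or one that draws per batch, would still see a model-dependent RNG state.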

drinkingkazu commented 5 years ago

nice detective work!

drinkingkazu commented 5 years ago

https://github.com/DeepLearnPhysics/lartpc_mlreco3d/pull/41