Describe the bug

Two test runs of the same model on the MNIST test set, with different random seeds, return different values.

To Reproduce

Train a model: mip-online-trainer --c configs/vision/simplecnn_mnist.yaml --li
Run the tester twice, e.g.: mip-tester --m ./experiments/MNIST/SimpleConvNet/20181108_132032/models/model_best.pt
Compare the aggregated losses: they will be different.

Expected behavior

Exactly the same numbers when the same model is used on the same samples!

Additional context

I tried changing the settings to make sure that the same samples are used.

For example, a batch size of 100 with SubsetRandomSampler returning a subset of 100 samples.

This still results in different losses, e.g.:

loss 0.2361298800
loss 0.2361298054
loss 0.2361298203

THOSE SHOULD BE EXACTLY THE SAME!
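To quantify how far apart the reported losses are, the sketch below counts how many representable float32 values lie between each pair. It assumes the losses were computed in float32 (the PyTorch default dtype); the helper names `float32_bits` and `ulp_distance` are made up for this illustration and are not part of mi-prometheus.

```python
import struct

def float32_bits(x: float) -> int:
    """Round x to the nearest float32 and return its raw 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def ulp_distance(a: float, b: float) -> int:
    """Number of float32 steps separating two positive values."""
    # For positive floats the bit patterns are ordered like the values,
    # so the difference of the patterns is the distance in ULPs.
    return abs(float32_bits(a) - float32_bits(b))

# The three losses reported above.
losses = [0.2361298800, 0.2361298054, 0.2361298203]

for i in range(len(losses)):
    for j in range(i + 1, len(losses)):
        d = ulp_distance(losses[i], losses[j])
        print(f"{losses[i]:.10f} vs {losses[j]:.10f}: {d} ULP")
```

The values turn out to be only a few float32 steps apart. That would be consistent with rounding differences rather than with different samples being used, although it does not prove where the differences come from.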

testing:
  seed_numpy: 4354
  seed_torch: 2452
  dataloader:
    batch_sampler: null
    drop_last: true
    num_workers: 0
    pin_memory: false
    shuffle: true
    timeout: 0
  problem:
    batch_size: 100
    name: MNIST
    resize:
    - 32
    - 32
    use_train_data: false
  sampler:
    indices:
    - 0
    - 100
    name: SubsetRandomSampler
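The configuration above uses SubsetRandomSampler, which draws the given indices in random order. So even when the same samples end up in the single batch, they may be visited in a different order on each run. Whether that is the cause here is only an assumption, but the sketch below shows the general effect: floating-point addition is not associative, so summing identical values in a different order can change the last digits. The per-sample losses are hypothetical, and plain Python floats (float64) stand in for the float32 tensors a real run would use.

```python
import math

def accumulate(values, order):
    """Sum values sequentially, visiting them in the given index order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

# Hypothetical per-sample losses (not taken from the MNIST run).
per_sample_loss = [0.1, 0.2, 0.3]

run_a = accumulate(per_sample_loss, [0, 1, 2])  # one sampling order
run_b = accumulate(per_sample_loss, [2, 1, 0])  # same samples, reversed

print(run_a == run_b)                            # False
print(math.isclose(run_a, run_b, rel_tol=1e-9))  # True
```

If this is what happens in the tester, comparing aggregated losses with a tolerance (or fixing the sample order between runs) may be a more meaningful check than bitwise equality.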

IBM / mi-prometheus

Investigate why two test runs on the same model return different statistics #76
