carpentries-incubator / intro-image-classification-cnn

new lesson on image classification with convolutional neural networks
https://carpentries-incubator.github.io/intro-image-classification-cnn/

Reduce reproducibility among architectures #51

Open erinmgraham opened 5 months ago

erinmgraham commented 5 months ago

Shern: 3 broad sources of non-reproducibility:

  1. Optimized, parallelized floating-point math doesn’t always return the same answer for long sums (e.g. “0.1 + 0.3 + 0.2” is not the same as “0.2 + 0.3 + 0.1”). Not much can be done about this besides recompiling from source without optimizations.
  2. The model’s weights are initialized randomly, so its subsequent “journey” towards optimization will also be random. This can be overcome (?) by setting the initializer seed.
  3. Training with ADAM is “stochastic” in the order in which mini-batches are selected from the training set. This can be overcome by setting shuffle=False in the fit() function.
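The first point can be checked in plain Python. This is only a minimal sketch: it uses the classic 0.1, 0.2, 0.3 triple rather than the exact ordering quoted above, and the variable names are made up for illustration.

```python
import math

# Floating-point addition is not associative: grouping the same
# three numbers differently changes the rounding of the result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)
print(left_to_right, right_to_left)

# math.fsum tracks the rounding error and returns a correctly rounded
# sum, so it does not depend on the order of the inputs.
values = [0.1, 0.2, 0.3]
print(math.fsum(values) == math.fsum(reversed(values)))
```

A parallel reduction on a GPU or a multi-threaded CPU kernel may pick a different grouping from run to run, which is why the same long sum can come out slightly different each time.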

See the suggested fixes in the Google Doc from the trial.

Andrew: Including keras.utils.set_random_seed(42) immediately prior to the model setup should fix the seeds of the random number generators and increase reproducibility.
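A rough sketch of how the two suggestions in this thread might be combined, assuming a TensorFlow install that provides `keras.utils.set_random_seed`. The toy data, the tiny model, and the `train_once` helper are hypothetical and are not taken from the lesson.

```python
import numpy as np
from tensorflow import keras


def train_once(seed=42):
    """Hypothetical helper: seed, build, and briefly train a tiny model."""
    # Seeds Python's random module, NumPy, and the backend in one call.
    keras.utils.set_random_seed(seed)

    # Made-up toy data, generated after seeding so each call sees the same values.
    x = np.random.rand(64, 4).astype("float32")
    y = (x.sum(axis=1) > 2.0).astype("float32")

    model = keras.Sequential([
        keras.Input(shape=(4,)),
        keras.layers.Dense(8, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

    # shuffle=False keeps the mini-batch order fixed across epochs.
    model.fit(x, y, epochs=2, batch_size=16, shuffle=False, verbose=0)
    return model.get_weights()


first = train_once()
second = train_once()

# Compare the trained weights from two runs that used the same seed.
same = all(np.allclose(a, b) for a, b in zip(first, second))
print(same)
```

Whether the two runs match exactly may still depend on the hardware and on the floating-point effects described in the first point, so this is best treated as a check to run locally rather than a guarantee.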