mit-quest / necstlab-damage-segmentation

MIT License

set all random seeds for each workflow and instantiate image generator seeds #80

Closed rak5216 closed 3 years ago

rak5216 commented 3 years ago

yup, just saw this recently here: https://machinelearningmastery.com/reproducible-results-neural-networks-keras/ (edited)

Reed Kopp 22 hours ago
also, try putting this at the very top of the train script:

```python
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(2)
```

Reed Kopp 22 hours ago
if that doesn't work, then i think we will also need to make the GPU solve in a deterministic way (no issue here if using CPU tho). so then the last thing will be (from the above link):

> **Randomness from Using the GPU**
> All of the above examples assume the code was run on a CPU. It is possible that when using the GPU to train your models, the backend may be configured to use a sophisticated stack of GPU libraries, and that some of these may introduce their own source of randomness that you may or may not be able to account for.
> For example, there is some evidence that if you are using Nvidia cuDNN in your stack, that this may introduce additional sources of randomness and prevent the exact reproducibility of your results.

(edited)
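a minimal sketch of seeding every source at the top of a train script (stdlib `random`, numpy, and tensorflow). the seed values and the TF guard are assumptions for illustration, not the repo's actual script; note that in TF 2.x the call is `tf.random.set_seed` rather than the older `set_random_seed`:

```python
import os
import random

import numpy as np

# note: PYTHONHASHSEED only takes effect if it is set in the environment
# *before* the Python process starts; setting it here is a reminder, not a fix
os.environ.setdefault("PYTHONHASHSEED", "0")

# seed the stdlib and numpy global generators
random.seed(1)
np.random.seed(1)

# seed tensorflow if it's installed; guarded so this sketch runs without TF
try:
    import tensorflow as tf
    tf.random.set_seed(2)  # tf.set_random_seed(2) on TF 1.x
except ImportError:
    pass

# with the globals seeded, repeated runs draw identical values
first = random.random()
random.seed(1)
assert random.random() == first
```

GPU-side cuDNN nondeterminism is a separate knob (e.g. TF's op-determinism settings), which is what the quoted blog passage is warning about.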

Joshua Joseph 22 hours ago for good measure I'd also just set the random module's seed in case too

Reed Kopp 22 hours ago agreed. i'm still confused about the localization of seeded random modules and if/when they're overwritten. maybe we can talk more tomorrow. i guess these seeded modules at the top of the train script shouldn't conflict with those in the dataset generators that are instantiated :+1:

Reed Kopp 22 hours ago
gonna leave this here for discussion tomorrow, involving independent random generators (instances of `Random`) for each generator. trying to reconcile our simple syntax in image_utils.py, which almost seems like a re-seeding of the same global `Random` instance of `random` (common to the train and val generators):

```python
from random import Random
a = Random()
b = Random()
a.seed(0)
b.seed(0)
```

https://stackoverflow.com/questions/39537148/python-maintain-two-different-random-instance (edited)
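the gist of that stackoverflow answer, as a quick sketch: each `Random()` instance carries its own state, so `a` and `b` seeded alike give identical but independent streams, while `random.seed()` only touches the shared module-level instance (seed values here are placeholders):

```python
import random
from random import Random

# two independent generator instances, seeded identically
a = Random()
b = Random()
a.seed(0)
b.seed(0)

# same seed -> identical sequences, drawn from separate states
assert [a.random() for _ in range(3)] == [b.random() for _ in range(3)]

# re-seeding the module-level global instance has no effect on a or b:
# their next draws still match each other
random.seed(999)
assert a.random() == b.random()
```

so passing a dedicated `Random` instance to each dataset generator avoids the train/val generators sharing (and re-seeding) one global stream.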

Joshua Joseph 22 hours ago :thumbsup: (we may also want to migrate this thread into the channel so others can start seeing this stuff too)

Reed Kopp 22 hours ago
will do, and carolina is documenting on the gh issue too. just to add, i think to instantiate a numpy rng, you have to:

```python
import numpy as np
seed = 12345
rng = np.random.default_rng(seed)  # can be called without a seed
```

https://numpy.org/doc/1.18/reference/random/index.html#introduction

so i think we've just been re-seeding the global seeds in both modules
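to illustrate the difference (variable names and seeds here are hypothetical): `np.random.default_rng` returns a `Generator` with its own state, so per-generator rngs stay independent of the legacy global state that `np.random.seed` re-seeds:

```python
import numpy as np

# two independent Generator instances, e.g. one per dataset generator
rng_train = np.random.default_rng(12345)
rng_val = np.random.default_rng(12345)

# identical seeds -> identical draws, from separate streams
assert np.allclose(rng_train.random(4), rng_val.random(4))

# the legacy global state (np.random.seed / np.random.rand) is untouched:
# advancing a Generator doesn't perturb the global stream
np.random.seed(0)
before = np.random.rand()
rng_train.random()
np.random.seed(0)
assert np.random.rand() == before
```

which matches the suspicion above: calling `np.random.seed(...)` in both modules just keeps re-seeding one shared global stream instead of giving each generator its own.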

Reed Kopp 21 hours ago last helpful (for me) link: https://albertcthomas.github.io/good-practices-random-number-generators/ (edited)

rak5216 commented 3 years ago

parallel with #76