yael-vinker opened this issue 3 years ago
The problem arises because of a data race between threads. Due to limited floating-point precision, addition of floats is not associative, i.e. (a + b) + c != a + (b + c), so summing in a different order produces slightly different results. Setting the number of threads in the code to one is a temporary fix for this problem.
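As a minimal illustration of the non-associativity mentioned above (plain Python floats, unrelated to the repository's code), grouping the same three numbers differently changes the result in the last bits:

```python
# Floating-point addition is not associative: the rounding after each
# intermediate sum depends on the order of operations.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # rounds 0.1 + 0.2 first
right = a + (b + c)  # rounds 0.2 + 0.3 first

print(left == right)   # False
print(left, right)     # 0.6000000000000001 0.6
```

This is why a multithreaded reduction, whose summation order varies from run to run, can yield slightly different losses even with identical seeds.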
Hi, thanks a lot for providing this public implementation.
I am trying to achieve deterministic training for reproducibility. On lines 40-41 of "painterly_rendering.py", you defined:
However, when running exactly the same command twice, on the same machine and environment, I get different loss values in the first iterations. Command used:
python painterly_rendering.py imgs/baboon.png --num_iter 3
Output 1:
Output 2:
You can see that in the second iteration the losses differ:
0.272503137588501 and 0.2725030183792114
Is there a way to fix that in order to achieve consistent results during training? Thanks!