The TensorFlow documentation says:
use_locking: If True, updating of the var, ms, and mom tensors is protected by a lock; otherwise the behavior is undefined, but may exhibit less contention.
However, in the code this flag is set to False. Could this cause a problem due to a race condition?
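For reference, here is a minimal sketch of the kind of construction I mean (the hyperparameter values are illustrative, not taken from the actual code):

```python
import tensorflow as tf  # TF 1.x API

# Illustrative only: an RMSProp optimizer with use_locking left at its
# default of False, so updates to the var, ms, and mom tensors are not
# protected by a lock when multiple threads apply gradients concurrently.
optimizer = tf.train.RMSPropOptimizer(
    learning_rate=7e-4,   # hypothetical value
    decay=0.99,           # hypothetical value
    epsilon=0.1,          # hypothetical value
    use_locking=False,
)
```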
Also, I don't understand why the original paper states that it is better to share g across different threads. Is there any justification for this other than empirical evidence?