eaplatanios / tensorflow_scala

TensorFlow API for the Scala Programming Language
http://platanios.org/tensorflow_scala/
Apache License 2.0

Locking Mechanisms (CriticalSection, use_locking=true, ...) #117

Closed. DirkToewe closed this issue 6 years ago.

DirkToewe commented 6 years ago

Accumulating gradients in Variables is a great way of reducing memory overhead, and doing so in parallel greatly speeds things up. It would be great to have access to some more locking mechanisms, like CriticalSection or the use_locking parameter for Variable assignment.

eaplatanios commented 6 years ago

I can look into adding support for CriticalSection, but use_locking should not be necessary: we use resource variables (as opposed to the old-style reference variables that the Python API uses by default), which should be thread-safe by default (i.e., use_locking is not even an available option when manipulating them). Other than that, what else do you think you'd need to accumulate gradients in variables? Currently, the back-propagation code is not really customizable, so I guess you'd have to call tf.gradients multiple times yourself in order to split up the computation. Or I may not be understanding the question well, so sorry about that. :)
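
For illustration, a minimal sketch of splitting the computation this way: build a mini-batch loss, take its gradient with tf.gradients, and add the result into a resource variable. The exact tf.variable, tf.placeholder, and tf.gradients signatures vary across tensorflow_scala versions, so treat the calls below as assumptions rather than the library's exact API, and the shapes/names as purely illustrative:

```scala
import org.platanios.tensorflow.api._

// Hypothetical mini-batch loss; shapes and names are illustrative.
val input   = tf.placeholder[Float](Shape(-1, 10), name = "Input")
val weights = tf.variable[Float]("Weights", Shape(10, 1))
val loss    = tf.sum(tf.square(tf.matmul(input, weights.value)))

// Resource variable used as a gradient accumulator.
val gradAcc = tf.variable[Float]("GradAcc", Shape(10, 1), tf.ZerosInitializer)

// Gradient of this mini-batch's loss w.r.t. the weights.
val grad = tf.gradients(Seq(loss), Seq(weights.value)).head.toOutput

// Op that adds the mini-batch gradient into the accumulator.
val accumulate = gradAcc.assignAdd(grad)
```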

DirkToewe commented 6 years ago

If assignAdd is already thread-safe, then I have what I need! CriticalSection would just be nice to have.

Using assignAdd, it's easy enough to build a graph that adds gradients to a variable and to make parallel Session.run calls on it, assuming, of course, that Session.run() is thread-safe (it is in the C++ API).
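
Continuing the sketch above, the parallel part could look like the following. This again assumes that Session.run accepts named feeds/fetches/targets parameters in this version of the API, and it feeds zero tensors as a stand-in for real mini-batch data:

```scala
val session = Session()
session.run(targets = tf.globalVariablesInitializer())

// Run the accumulation op from several threads in parallel; the resource
// variable is what makes the concurrent assignAdd calls safe.
val threads = (0 until 4).map { _ =>
  new Thread(() => {
    val batch = Tensor.zeros[Float](Shape(32, 10)) // stand-in for real data
    session.run(feeds = Map(input -> batch), targets = accumulate)
  })
}
threads.foreach(_.start())
threads.foreach(_.join())

// Fetch the accumulated gradient once all threads are done.
val total = session.run(fetches = gradAcc.value)
```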

eaplatanios commented 6 years ago

Yes, assignAdd should already be thread-safe, and so should Session.run(). There may be issues with modifying the graph after creating the session, as I haven't tested that thoroughly, but I personally haven't run into any. I'll close this for now since what you need seems to be supported, but I'll keep CriticalSection on my TODO list. :)