I think as a general rule, we should use it wherever possible unless there's a clear reason not to.
Some reasons not to, off the top of my head:
1. The function creates state (e.g. variables) on each call and is expected to be called more than once.
2. The function's behavior depends on Python side effects or on Python values outside the function that are expected to change between calls.
Barring those, function-ify away. And, try to avoid the above scenarios if possible. (For example, in Keras, we have the object store a "built" attr such that we only create vars on the first call, and subsequent calls don't recreate attrs.)
(To clarify the second point there, Python control flow in general is fine, but there are certain edge cases where, once we trace the tf.function, it stops being dynamic each time you call the function. For example, if you make a decision based on time or on some Python object that lives outside the function and isn't passed in as an argument. In those cases, the function would create a bit of graph based on the first call, but would not regenerate with the most up-to-date Python values on each call. That said, different values of inputs cause the function to be re-traced, so if the dynamic value in question were an input to the tf.function, you would be okay.)
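A minimal sketch of that caveat, with illustrative names — a Python value captured from outside is baked in at trace time, while a value passed as an argument triggers a retrace when it changes:

```python
import tensorflow as tf

use_bias = True  # Python value living outside the function

@tf.function
def captured(x):
    # This `if` is evaluated once, when the function is first traced;
    # flipping `use_bias` later will NOT regenerate the graph.
    if use_bias:
        return x + 1.0
    return x

@tf.function
def passed_in(x, use_bias):
    # A Python argument is part of the trace cache key, so a new value
    # causes a retrace and the branch is re-evaluated.
    if use_bias:
        return x + 1.0
    return x

print(captured(tf.constant(1.0)))           # 2.0
use_bias = False
print(captured(tf.constant(1.0)))           # still 2.0 — the stale trace is reused
print(passed_in(tf.constant(1.0), False))   # 1.0 — retraced with the new value
```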
Thanks! Closing issue as it can be re-opened if there are more thoughts/questions
So re-opening this issue for a bit just to continue the discussion. Scott has mentioned that he's hesitant to decorate some seq2seq stuff with tf.function for fear of losing debuggability (see #72 review).
We have noticed a possible issue in #94 that may be related to the liberal decorating, but I wanted to discuss our strategy a bit. It's certainly possible that in #94 we're seeing just a bug or some improper setup of the function for using autograph... but it does seem to have the potential to make troubleshooting more difficult.
@karmel @martinwicke @facaiy Should we consider letting the user decide when to decorate addon functions? That might make it easier in the short term, but I expect we'll field a lot of issues once users try to do just that.
@alextp FYI
Debugging alone shouldn't be grounds for not decorating. For debugging etc. you can use tf.config.experimental_run_functions_eagerly (which temporarily disables running functions as functions).
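For reference, a small sketch of that debugging workflow (in later releases the non-experimental tf.config.run_functions_eagerly is the equivalent):

```python
import tensorflow as tf

@tf.function
def square(x):
    # Normally this print fires only while tracing; with eager execution of
    # functions enabled it fires on every call, and breakpoints/pdb work too.
    print("intermediate value:", x)
    return x * x

tf.config.experimental_run_functions_eagerly(True)   # debug: run function bodies eagerly
square(tf.constant(3.0))
tf.config.experimental_run_functions_eagerly(False)  # back to normal tracing
```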
As the API author, my take on this is that we shouldn't annotate Python functions with tf.function by default; instead, we should let users annotate their own functions and call ours. That way our function will properly inherit the context (eager/graph) from the user's current context.
@alextp, is there any reason to enable tf.function by default from API perspective?
@qlzh727 I think using tf.function in an API implementation, if safe, is a good way to help people get performance by default (both for graph building, as graphs with functions are smaller and quicker to build, and for eager execution).
Ack, one side effect I am thinking of is that for any Python function doing mathematical computation, the user kind of loses the ability to debug tensor values if the function is wrapped with tf.function by default.
Which is why we have the execute_functions_eagerly thing.
Oh, do you mean "tf.experimental.function_executor_type"? I didn't find "tf.execute_functions_eagerly" in the API.
tf.config.experimental_run_functions_eagerly
Ack, then I think it's fine to annotate all the public methods that handle tensors/variables with tf.function.
I think it's fine to annotate all the public methods that handle tensors/variables with tf.function.
+1. And I'm wondering how we should deal with private methods? We once discussed it in https://github.com/tensorflow/addons/pull/35 without reaching a conclusion.
Good question. My take on this is from a performance perspective.
Let's say we have functions a, b, and c, with a -> b -> c (-> here means invoke). If all of them are wrapped with individual tf.function, then each of them will generate its own func_def at the graph level. On the other hand, if only a has tf.function, then a, b, and c will all be wrapped in one single func_def. Performance-wise, I think there is a small penalty for every level of func_def; moreover, we have the Grappler optimizer that traverses the func_def body and does optimization. A lot of small functions might cause it to miss certain cases, since those ops are isolated between func_defs.
Having said all that, if we consider another example where a -> d, b -> d, and c -> d, wrapping d with tf.function might actually be a good idea even if it is a private method. If a, b, and c all call d with the same tensor shape, d will only generate one func_def, which saves a lot of duplication compared to only wrapping a, b, and c with tf.function.
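For illustration, a minimal sketch of that shared-helper case, with made-up names:

```python
import tensorflow as tf

@tf.function
def d(x):
    # Shared private helper: for a given input signature it is traced once,
    # and both callers reference that single traced function.
    return tf.nn.relu(x) * 2.0

@tf.function
def a(x):
    return d(x) + 1.0

@tf.function
def b(x):
    return d(x) - 1.0

x = tf.constant([1.0, -2.0])
a(x)
b(x)
# The graphs of `a` and `b` each contain a call to the same traced `d`
# rather than two duplicated copies of its ops.
```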
So I don't think there is a golden rule here. @alextp do you have any thoughts?
@qlzh727 No, actually. Using nested tf.functions actually helps graph building time if a function is called more than once in your graph.
Plus at execution time we can in principle (and in practice) inline all function calls and not pay any dispatch overhead.
So I'd rather have a graph with functions in it as it's smaller (and hence faster to build, serialize, etc) and more semantically meaningful.
Ah, I see, thanks Alex. Apparently Grappler's function_optimizer does all the heavy lifting for inlining functions, which means my argument about a -> b -> c above is not valid. I think then it is quite safe to annotate all the functions with tf.function if they handle tensors/variables.
@facaiy, you can take a look at https://github.com/tensorflow/tensorflow/blob/e41c382ffc16ed26d10e775e4facd0c641b07234/tensorflow/core/grappler/optimizers/function_optimizer.cc if you are interested.
Thank you for the input, @qlzh727 @alextp :-)
I've read this thread and many other many times but there's something I still don't understand.
Should the call function of custom layers be decorated with the tf.function decorator?
For example:

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer

class CustomLayer(Layer):
    def build(self, input_shape):
        self.w1 = self.add_weight('w1', shape=input_shape)

    def call(self, inputs):
        return tf.matmul(inputs, tf.transpose(self.w1))
```

Should the call function be decorated? Because I looked at examples in the tensorflow/models repo and the call functions aren't decorated (for example: https://github.com/tensorflow/models/blob/master/official/vision/beta/modeling/layers/nn_blocks_3d.py).
But how would users (developers) know to turn that option on? They would be burdened with having to know something about the internal API implementation details.
Are there any best practices for when to decorate functions with tf.function? The TF2 documentation has a nice example where a Keras Model decorates the call method. @karmel @martinwicke Should we use this extensively in our repo anytime there is a method without Python side effects? Is it safer to only use it when there are clear performance gains from parallelism or other optimizations? The latter seems like a vague criterion without specifying what patterns to look for.
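For reference, a minimal sketch of the Model-subclass pattern referenced from the TF2 docs (class and layer names are illustrative):

```python
import tensorflow as tf

class SmallModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(4)

    @tf.function  # the pattern in question: compile call into a graph
    def call(self, inputs):
        return self.dense(inputs)

model = SmallModel()
model(tf.zeros([2, 8]))  # first call traces; later calls with the same shape reuse the graph
```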