tensorflow / addons

Useful extra functionality for TensorFlow 2.x maintained by SIG-addons

[Discussion] When to utilize tf.function #13

Closed: seanpmorgan closed this 5 years ago

seanpmorgan commented 5 years ago

Are there any best practices for when to decorate functions with tf.function? The TF2 documentation has a nice example where a Keras Model's call method is decorated.

@karmel @martinwicke Should we use this extensively in our repo any time there is a method without Python side effects? Or is it safer to only utilize it when there are clear performance gains from exploiting parallelism or other optimizations? The latter seems like a vague criterion without specifying what patterns to look for.
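For reference, a minimal sketch of the decoration pattern in question (the model and names here are illustrative, not taken verbatim from the TF2 docs):

import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(4)

    @tf.function  # compiled into a graph on first call, reused afterwards
    def call(self, inputs):
        return self.dense(inputs)

model = MyModel()
model(tf.ones([2, 3]))  # traces once; later calls with the same shape reuse the graph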

martinwicke commented 5 years ago

I think as a general rule, we should use it wherever possible unless there's a clear reason not to.

karmel commented 5 years ago

Some reasons not to, off the top of my head:

- The function creates state (e.g., variables) on every call, since tf.function expects variables to be created only on the first trace.
- The function relies on Python side effects, or on Python control flow that should stay dynamic across calls, since those get baked in at trace time.

Barring those, function-ify away. And try to avoid the above scenarios if possible. (For example, in Keras, we have the object store a "built" attr such that we only create vars on the first call, and subsequent calls don't recreate attrs; a sketch of that pattern follows.)
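For illustration, a minimal sketch of the "built"-flag pattern using a plain tf.Module (my sketch, not code from Keras itself):

import tensorflow as tf

class MyDense(tf.Module):
    def __init__(self, units):
        super().__init__()
        self.units = units
        self.built = False  # the "built" attr

    @tf.function
    def __call__(self, x):
        if not self.built:
            # Variables are created only on the first trace; subsequent
            # calls reuse them instead of recreating state.
            self.w = tf.Variable(tf.random.normal([x.shape[-1], self.units]))
            self.built = True
        return x @ self.w

layer = MyDense(4)
print(layer(tf.ones([2, 3])).shape)  # (2, 4)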

karmel commented 5 years ago

(To clarify the second point there: Python control flow in general is fine, but there are certain edge cases where, once we trace the tf.function, it stops being dynamic on each call of the function. For example, if you made a decision based on time, or on some Python object that lives outside the function and is not passed into it, the function would create a bit-of-graph based on the first call, but would not regenerate with the most up-to-date Python values on each subsequent call. That said, different values of the inputs cause the function to be re-traced, so if the dynamic value in question were an input to the tf.function, you would be okay.)
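A minimal sketch of both behaviors (names are illustrative; assumes TF 2.x):

import tensorflow as tf

flag = True  # Python state living outside the function

@tf.function
def f(x):
    # This branch is resolved at trace time; flipping `flag` later
    # does NOT change the already-traced graph.
    if flag:
        return x + 1.0
    return x - 1.0

print(f(tf.constant(1.0)))  # traces with flag=True -> 2.0
flag = False
print(f(tf.constant(1.0)))  # same trace is reused -> still 2.0

@tf.function
def g(x, add):
    # Passing the Python value in as an argument makes tf.function
    # re-trace for each new value of `add`, so behavior stays current.
    if add:
        return x + 1.0
    return x - 1.0

print(g(tf.constant(1.0), True))   # 2.0
print(g(tf.constant(1.0), False))  # re-traced -> 0.0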

seanpmorgan commented 5 years ago

Thanks! Closing issue as it can be re-opened if there are more thoughts/questions

seanpmorgan commented 5 years ago

So, re-opening this issue for a bit just to continue the discussion. Scott has mentioned that he's hesitant to decorate some seq2seq stuff with tf.function for fear of losing debuggability (see the #72 review).

We have noticed a possible issue in #94 that may be related to the liberal decorating, but I wanted to discuss our strategy a bit. It's certainly possible that in #94 we're seeing just a bug or some improper setup of the function for using autograph... but it does seem to have the potential to make troubleshooting more difficult.

@karmel @martinwicke @facaiy Should we consider letting the user decide when to decorate addon functions? That might make it easier in the short term, but I expect we'll field a lot of issues once users try to do just that.

martinwicke commented 5 years ago

@alextp FYI

Debugging alone shouldn't be grounds for not decorating. For debugging etc. you can use tf.config.experimental_run_functions_eagerly (which temporarily disables running functions as functions).
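A sketch of that debugging workflow (note that in later TF releases this API became tf.config.run_functions_eagerly):

import tensorflow as tf

@tf.function
def square(x):
    y = x * x
    print("y =", y)  # prints a concrete value only when running eagerly
    return y

# Temporarily run tf.functions eagerly so tensor values can be inspected...
tf.config.experimental_run_functions_eagerly(True)
square(tf.constant(3.0))
# ...then restore normal graph execution.
tf.config.experimental_run_functions_eagerly(False)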

qlzh727 commented 5 years ago

As the API author, my take on this is that we shouldn't annotate Python functions with tf.function by default; instead, we should let users annotate their own function and call ours. That way our function will properly inherit the context (eager/graph) based on the user's current context.
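A minimal sketch of that arrangement (addons_op and user_step are hypothetical names):

import tensorflow as tf

def addons_op(x):
    # Hypothetical library function, deliberately left undecorated: it
    # runs eagerly when called eagerly, and is traced into the caller's
    # graph when called from inside a user's tf.function.
    return x * 2.0 + 1.0

@tf.function
def user_step(x):
    return addons_op(x)  # the user chooses where the graph boundary is

print(addons_op(tf.constant(1.0)))  # eager: 3.0
print(user_step(tf.constant(1.0)))  # graph: 3.0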

@alextp, is there any reason to enable tf.function by default from API perspective?

alextp commented 5 years ago

@qlzh727 I think using tf.function in an API implementation, if safe, is a good way to help people get performance by default (both for graph building, as graphs with functions are smaller and quicker to build, and for eager execution).
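A rough micro-benchmark sketch of the performance-by-default point (illustrative only; actual numbers depend on the op mix and hardware):

import timeit
import tensorflow as tf

def op(x):
    for _ in range(10):       # unrolled at trace time in the graph version
        x = x * 2.0 + 1.0
    return x

graph_op = tf.function(op)    # same computation, traced once

x = tf.constant(1.0)
graph_op(x)                   # trigger tracing outside the timed region
print("eager:", timeit.timeit(lambda: op(x), number=1000))
print("graph:", timeit.timeit(lambda: graph_op(x), number=1000))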

qlzh727 commented 5 years ago

Ack. One side effect I am thinking of is that, for any Python function that does mathematical computation, users kind of lose the ability to debug tensor values if the function is wrapped in tf.function by default.

alextp commented 5 years ago

Which is why we have the execute_functions_eagerly thing.


qlzh727 commented 5 years ago

Oh, do you mean "tf.experimental.function_executor_type"? I didn't find "tf.execute_functions_eagerly" in the API.

alextp commented 5 years ago

tf.config.experimental_run_functions_eagerly


qlzh727 commented 5 years ago

Ack, then I think it's fine to annotate all the public methods that handle tensors/variables with tf.function.

facaiy commented 5 years ago

I think it's fine to annotate all the public methods that handle tensors/variables with tf.function.

+1. And I'm wondering how we should deal with private methods? We once discussed it in https://github.com/tensorflow/addons/pull/35 without reaching a conclusion.

qlzh727 commented 5 years ago

Good question. My take on this is from a performance perspective.

Let's say we have funcs a, b, c and a -> b -> c (-> here means "invokes"). If all of them are wrapped in individual tf.functions, then each of them will generate its own func_def at the graph level. On the other hand, if only a has tf.function, then a, b, and c will be wrapped in one single func_def. Performance-wise, I think there is a small penalty for every level of func_def; moreover, we have the grappler optimizer, which traverses the func_def body and does optimization. A lot of small functions might cause it to fail to detect certain cases, since the ops are isolated between func_defs.

Having said all that, if we consider another example where a -> d, b -> d, and c -> d, wrapping d with tf.function might actually be a good idea even if it is a private method. If a, b, and c all call d with the same tensor shape, d will only generate one func_def, which saves a lot of duplication compared to only wrapping a, b, and c in tf.function.
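A minimal sketch of the shared-d case (two callers shown for brevity):

import tensorflow as tf

@tf.function
def d(x):
    return x + 1.0

@tf.function
def a(x):
    return d(x) * 2.0

@tf.function
def b(x):
    return d(x) * 3.0

# a and b call d with the same input signature, so d is traced once and
# shows up as a single shared function definition instead of being
# duplicated inside each caller.
print(a(tf.constant(1.0)))  # 4.0
print(b(tf.constant(1.0)))  # 6.0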

So I don't think there is a golden rule here. @alextp do you have any thoughts?

alextp commented 5 years ago

@qlzh727 No, actually. Using nested tf.functions actually helps graph building time if a function is called more than once in your graph.

Plus at execution time we can in principle (and in practice) inline all function calls and not pay any dispatch overhead.

So I'd rather have a graph with functions in it as it's smaller (and hence faster to build, serialize, etc) and more semantically meaningful.

qlzh727 commented 5 years ago

Ah, I see, thanks Alex. Apparently the grappler function_optimizer does all the heavy lifting for inlining the functions, which means my a -> b -> c argument above is not valid. I think it is then quite safe to annotate all functions that handle tensors/variables with tf.function.

@facaiy, You can take a look for https://github.com/tensorflow/tensorflow/blob/e41c382ffc16ed26d10e775e4facd0c641b07234/tensorflow/core/grappler/optimizers/function_optimizer.cc if you are interested.

facaiy commented 5 years ago

Thank you for the input, @qlzh727 @alextp :-)

topinfrassi01 commented 3 years ago

I've read this thread and many others many times, but there's something I still don't understand.

Should the call function of custom layers be decorated with the tf.function decorator?

For example:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class CustomLayer(Layer):
    def build(self, input_shape):
        # Create the weight once, when the layer first sees its input shape.
        self.w1 = self.add_weight('w1', shape=input_shape)

    def call(self, inputs):
        return tf.matmul(inputs, tf.transpose(self.w1))

Should the call function be decorated?

Because I looked at examples in the tensorflow/models repo and the call functions aren't decorated (for example: https://github.com/tensorflow/models/blob/master/official/vision/beta/modeling/layers/nn_blocks_3d.py).
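Not an authoritative answer from this thread, but a sketch of why inner call methods are often left undecorated: when the outermost computation is wrapped in tf.function (as Keras itself does for its train/predict steps), inner layer calls are traced into that same graph anyway:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class CustomLayer(Layer):
    def build(self, input_shape):
        self.w1 = self.add_weight('w1', shape=input_shape)

    def call(self, inputs):  # left undecorated
        return tf.matmul(inputs, tf.transpose(self.w1))

layer = CustomLayer()

@tf.function
def forward(x):
    # The layer's call runs inside this trace, so it executes in graph
    # mode without needing its own decorator.
    return layer(x)

print(forward(tf.ones([2, 2])))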

khatchad commented 2 years ago

tf.config.experimental_run_functions_eagerly

But how would users (developers) know to turn that option on? Developers would be burdened with having to know something about the internal implementation details of the API.