Open stephanwlee opened 5 years ago
So in both the cases graph does not appear??
The latter is a TensorFlow bug and the former is working as intended; if your code is purely eager, there is no computational graph at all, so the graph plugin has nothing to display.
@stephanwlee autograph can be used to bridge the gap between eager execution and graphs. It can be used to convert the eager-style python code to graph-generating code.
So to clarify, if you use eager only code, you can not debug with tensorboard graphs? I'm confused as to how the keras writer does this. Maybe that is a tf.function under the hood now.
@cottrell Yup correct. Pure eager builds no graph. Keras always has built some graph underneath (the details escape me at the moment but IIRC, in v1, it built a graph using its own session and, in v2, it constructs graph with FuncGraph) and is able to show some graph unless you pass run_eagerly to its compile method.
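To make the eager/graph distinction concrete, here is a minimal sketch (the function `double` and the `modes` list are hypothetical names used only for illustration). It assumes TensorFlow 2.x and records whether the Python body runs eagerly, then lists the ops of the graph that `tf.function` traces:

```python
import tensorflow as tf

modes = []  # records tf.executing_eagerly() each time the Python body runs


def double(x):
    modes.append(tf.executing_eagerly())
    return x * 2.0


# Pure eager call: ops run one by one and no graph is built.
eager_out = double(tf.constant([1.0, 2.0]))

# Wrapping the same function in tf.function traces the body into a FuncGraph.
traced = tf.function(double)
concrete = traced.get_concrete_function(tf.TensorSpec([2], tf.float32))
op_types = [op.type for op in concrete.graph.get_operations()]

print(modes)     # eager call first, then the trace
print(op_types)  # op types recorded in the traced graph
```

Only the traced version has a graph that the graph plugin could show; the plain eager call leaves nothing behind to visualize.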
I have the same problem
When using a model which inherits from tensorflow.python.keras.Model, there is a graph which can be visualized! This even works with run_eagerly=True. The trick is to save the graph at the right moment. Such a moment occurs when keras/tensorflow builds the model, because it invokes model.call in non-eager mode (not exactly sure why). The resulting tensor holds the model graph (without losses afaik), which can be written using the "old" FileWriter. Example code that works for me with Tensorflow 1.13.1 and enabled eager execution:
import tensorflow
# Only use tensorflow's keras!
from tensorflow.python import keras as tfkeras
from tensorflow.python.training.rmsprop import RMSPropOptimizer
import numpy as np

tensorflow.enable_eager_execution()


class MyModel(tfkeras.Model):
    def __init__(self, tensorboard_folder_path):
        super(MyModel, self).__init__()
        self.dense1 = tfkeras.layers.LSTM(units=6)
        self.dense2 = tfkeras.layers.Dense(units=4)
        self.graph_has_been_written = False
        self.tensorboard_folder_path = tensorboard_folder_path

    def call(self, input, **kwargs):
        print("input shape", input.shape)
        result = self.dense1(input)
        result = self.dense2(result)
        if not tensorflow.executing_eagerly() and not self.graph_has_been_written:
            # In non eager mode and a graph is available which can be written to Tensorboard using the "old" FileWriter:
            model_graph = result.graph
            writer = tensorflow.summary.FileWriter(logdir=self.tensorboard_folder_path, graph=model_graph)
            writer.flush()
            self.graph_has_been_written = True
            print("Wrote eager graph to", self.tensorboard_folder_path)
        return result


if __name__ == "__main__":
    print("Eager execution:", tensorflow.executing_eagerly())
    # Create model and specify tensorboard folder:
    model = MyModel("/home/your_username/tensorboardtest/")
    optimizer = RMSPropOptimizer(learning_rate=0.001)
    model.compile(optimizer, tensorflow.losses.softmax_cross_entropy, run_eagerly=True)
    # Build the model (this will invoke model.call in non-eager mode). If model.build is not called explicitly here, it
    # will be called by model.fit_generator implicitly when the first batch is about to be fed to the network.
    model.build((None, None, 5))
    # Can only be called after the model has been built:
    model.summary()
    # Two arbitrary batches with different batch size and different sequence length:
    x1 = np.array([[[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]],
                  dtype=np.float32)
    y1 = np.array([[1, 0, 0, 0]], dtype=np.float32)
    print("x1 shape", x1.shape)
    print("y1 shape", y1.shape)
    x2 = np.array([[[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]],
                   [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]], dtype=np.float32)
    y2 = np.array([[1, 0, 0, 0], [1, 0, 0, 0]], dtype=np.float32)
    print("x2 shape", x2.shape)
    print("y2 shape", y2.shape)

    # Simply yield the two batches alternately
    def iterator():
        switcher = False
        while 1:
            if switcher:
                yield x1, y1
            else:
                yield x2, y2
            switcher = not switcher

    model.fit_generator(iterator(), steps_per_epoch=10, epochs=1)
@simonschroder Unfortunately, this doesn't seem to work with tf 2.0 anymore. The use of FileWriter seems to have changed. Any ideas?
I have the same issue too. I followed the code shown at https://tensorflow.org/tensorboard/r2/graphs (the last method on that page).
from datetime import datetime

import tensorflow as tf

# The function to be traced.
@tf.function
def my_func(x, y):
    # A simple hand-rolled layer.
    return tf.nn.relu(tf.matmul(x, y))

# Set up logging.
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = 'logs/func/%s' % stamp
writer = tf.summary.create_file_writer(logdir)

# Sample data for your function.
x = tf.random.uniform((3, 3))
y = tf.random.uniform((3, 3))

# Bracket the function call with
# tf.summary.trace_on() and tf.summary.trace_export().
tf.summary.trace_on(graph=True, profiler=True)
# Call only one tf.function when tracing.
z = my_func(x, y)
with writer.as_default():
    tf.summary.trace_export(
        name="my_func_trace",
        step=0,
        profiler_outdir=logdir)
In the tutorial it shows the graph; however, when I ran this code in my environment, it doesn't show the graph in TensorBoard.
I found out that one has to pass an input shape to the first layer of the model, so that keras is able to build the graph in advance. Otherwise, the tensorboard callback is not able to draw the graph because it lacks the predefined graph information from keras. I hope that I could help somebody with this information :)
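As a rough illustration of that tip, the sketch below declares the input shape as the first element of a Sequential model so that Keras can build the model before training starts. The layer sizes and the variable name `with_shape` are made up for this example, and it assumes a TensorFlow 2.x install where `tf.keras.Input` can be the first entry of a `Sequential` list:

```python
import tensorflow as tf

# Declaring the input shape up front lets Keras create the weights
# (and the graph information it needs) before any data is seen.
with_shape = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),  # hypothetical feature size
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])
with_shape.compile(optimizer="sgd", loss="mse")

print(with_shape.built)         # the model is already built at this point
print(len(with_shape.weights))  # kernel and bias for each Dense layer
```

A model built this way can then be passed to `model.fit(...)` together with a `tf.keras.callbacks.TensorBoard` callback; whether the graph shows up may still depend on the TensorFlow and TensorBoard versions in use.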
@luvwinnie I just tried your example and the graph was showing up correctly.
Especially because you put a timestamp onto the name of the logdir, I cannot imagine why you'd face a bug. Can you first try to reproduce this in very small minimal example? Thanks!
I have made some incorrect statements about TF 2 and Keras so I would like to clear it up.
it built a graph using its own session and, in v2, it constructs graph with FuncGraph) and is able to show some graph unless you pass run_eagerly to its compile method.
Keras, in TF 2, even before calling compile or build, since layers are built with TensorFlow ops, can technically display a graph.
import tensorflow as tf
tensor = tf.keras.layers.Dense(32)(tf.keras.Input(32))
# https://github.com/tensorflow/tensorflow/blob/81bb13c6363f15581163976d298165e5bcff0588/tensorflow/python/framework/ops.py#L371-L374
tensor.graph # FuncGraph
tensor.graph.as_graph_def() # GraphDef
However, although I lack deep knowledge of this, it seems that Keras makes a few mutations to the graph, and this is one of the reasons why Keras implements [custom graph execution](https://github.com/tensorflow/tensorflow/blob/81bb13c6363f15581163976d298165e5bcff0588/tensorflow/python/keras/backend.py#L3366).
You may be able to argue that some Keras subclass model can force Keras to spit out a graph but this statement is not generally true. Moreover, when user compiles Keras model to run eagerly, showing a graph will not help anyone as it can misrepresent how Keras will actually execute the code.
Especially because you put a timestamp onto the name of the logdir, I cannot imagine why you'd face a bug. Can you first try to reproduce this in very small minimal example? Thanks!
I tried this example with TF2 from today's master branch and tensorboard 1.13.1. Here, tensorboard cannot find the exported graph...
when following this https://www.tensorflow.org/tensorboard/r2/graphs with @tf.function it shows a "Graphs" -tab in tensorboard, but the loading bar only runs till the half and then stops loading, so there is no graph to see ?
I am using colab with
!pip3 install tensorflow-gpu==2.0.0-alpha0
!pip3 install tb-nightly
Hi everybody, I think I have a different version of this bug:
import tensorflow as tf

writer = tf.summary.create_file_writer('./')

# @tf.function
def foo(v):
    v.assign(tf.math.sigmoid(v))

v = tf.Variable([1, 2], dtype=tf.float32)

tf.summary.trace_on(graph=True)
foo_g = tf.autograph.to_graph(foo)
foo_g(v)
with writer.as_default():
    tf.summary.trace_export(
        name='test',
        step=0
    )
Do you think I should create an issue in the TF repository ?
Hi @morgangiraud, I don't know too much about tf.autograph but, in TF 2.0, it seems like you should be using @tf.function instead. As you can see in the autograph README, autograph.to_graph only gives you graph code but does not really change the TensorFlow execution mode -- the resulting graph code is intended to be used with tf.Graph and tf.Session. In other words, in your code snippet above, you are effectively executing graph code in pure Eager and thus the trace is not useful.
Hi @stephanwlee,
Thanks for your answer and your insights, I've been able to change my code a little bit and make it work. 👍🏻
Just in case other people stumble upon this, the following is a small script that highlights one of the differences between autograph.to_graph and tf.function:
import os
import tensorflow as tf

file_dir = os.path.realpath(os.path.dirname(__file__))
writer = tf.summary.create_file_writer(file_dir)

print('1, executing_eagerly: {}'.format(tf.executing_eagerly()))
with tf.Graph().as_default():
    print('2, executing_eagerly: {}'.format(tf.executing_eagerly()))

# @tf.function
def foo(v, name=''):
    tf.print(name + ', executing_eagerly: {}'.format(tf.executing_eagerly()))
    v.assign(tf.math.sigmoid(v))

v = tf.Variable([1, 2], dtype=tf.float32)
foo(v, 'foo')

tf.summary.trace_on(graph=True)
foo_to_g = tf.autograph.to_graph(foo)
foo_to_g(v, tf.constant('foo_to_g', dtype=tf.string))
foo_tf_func = tf.function(foo)
foo_tf_func(v, tf.constant('foo_tf_func', dtype=tf.string))
with writer.as_default():
    tf.summary.trace_export(
        name='tf2_graph',
        step=0
    )
Which outputs
1, executing_eagerly: True
2, executing_eagerly: False
foo, executing_eagerly: True
foo_to_g, executing_eagerly: True
foo_tf_func, executing_eagerly: False
I customized my own layers by subclassing tf.keras.layers.Layer. Then I used Sequential to define my model, but it couldn't be displayed in TensorBoard. The error is "Graph does not appear on TensorBoard". All other information was displayed properly. I fixed it by implementing a get_config method in my layer, which custom layers need in order to be serializable as part of a Functional model.
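A minimal sketch of what such a get_config method can look like is shown below. The layer `Scale` and its `factor` argument are hypothetical and only serve as an illustration; the poster's actual layer is not shown in the thread:

```python
import tensorflow as tf


class Scale(tf.keras.layers.Layer):
    """Hypothetical custom layer that multiplies its input by a constant."""

    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        # Start from the base config (name, dtype, ...) and add every
        # constructor argument so the layer can be re-created from it.
        config = super().get_config()
        config.update({"factor": self.factor})
        return config


layer = Scale(factor=0.5)
restored = Scale.from_config(layer.get_config())
print(restored.factor)
```

The idea is that everything passed to `__init__` is returned by `get_config`, so `from_config` can rebuild an equivalent layer.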
@luvwinnie I just tried your example and the graph was showing up correctly.
Especially because you put a timestamp onto the name of the logdir, I cannot imagine why you'd face a bug. Can you first try to reproduce this in very small minimal example? Thanks!
import tensorflow as tf

# Under TF 2.x, graph mode is needed so the ops below are added to sess.graph.
tf.compat.v1.disable_eager_execution()

log = './board'
with tf.compat.v1.Session() as sess:
    a = tf.constant(3.0, name='first_var')
    b = tf.constant(4.0, name='sec_var')
    c = a + b
    # writer = tf.summary.FileWriter(log, sess.graph)  # TF 1.x API, not available in TF 2.x
    writer = tf.compat.v1.summary.FileWriter(log, sess.graph)
    print(sess.run(c))
    sess.close()
I recently stumbled upon this issue using the TensorBoard callback with tf 2.2.0rc2 and tboard 2.2.0. I have no idea why it was happening, but I didn't look too far and just reverted back to tf 2.1 and tboard 2.1.1.
Some details are that I am using a subclassed model and complex-valued data.
I am running into a graph not appearing on TensorBoard after wrapping one of the tf.keras.applications models (namely MobileNet) with tf.function and then following the suggested code snippets.
import tensorflow as tf
from datetime import datetime

model = tf.keras.applications.MobileNet()

# The function to be traced.
@tf.function()
def optimized_model(x):
    return model(x)

# Set up logging.
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = 'logs\\func\\%s' % stamp
writer = tf.summary.create_file_writer(logdir)

# Sample data for your function, ImageNet standard size
x = tf.random.uniform((1, 224, 224, 3))

# Bracket the function call with
# tf.summary.trace_on() and tf.summary.trace_export().
tf.summary.trace_on(graph=True, profiler=True)
# Call only one tf.function when tracing.
z = optimized_model(x)
with writer.as_default():
    tf.summary.trace_export(name="my_func_trace", step=0, profiler_outdir=logdir)
Does anyone have an idea why this happens?
FYI, I am running tensorflow==2.1.0 and tensorboard==2.1.1 (latest stable builds installed with pip install tensorflow), and tf.executing_eagerly() is certainly False for the optimized_model function.
When using a model which inherits from tensorflow.python.keras.Model, there is a graph which can be visualized! This even works with run_eagerly=True. The trick is to save the graph at the right moment. Such a moment occurs when keras/tensorflow builds the model, because it invokes model.call in non-eager mode (not exactly sure why). The resulting tensor holds the model graph (without losses afaik), which can be written using the "old" FileWriter. Example code that works for me with Tensorflow 1.13.1 and enabled eager execution: [...]
Your code is not written for TF 2.0, in which summary.FileWriter() does not exist, so there is still no way to export the graph to TensorBoard.
For me, the issue with a Sequential model and custom RNNCell impls was that I hadn't called model.build(shape), assuming this would be done when calling model.fit(...). Once I called model.build() explicitly before training, the graph was visualized:
rnn_cell = MyCell(...)
layer = tf.keras.layers.RNN(rnn_cell, name="rnn_wrapper")
model = tf.keras.Sequential()
model.add(layer)
model.compile(...)
model.build(input_shape=(...)) # when not invoking `model.build()` now, no graph can be loaded
model.fit(...)
/cc @MarkDaoust This bug is very annoying and it is confusing new users, especially because we still have the same example in the official documentation on the TF website and this was submitted in 2019.
/cc @mdanatg Can you help me to find an internal owner for this? It is very annoying that the official doc to profile tf.function is still broken.
Thanks.
The owner for this is the tensorboard team. Neither Dan or I are part of that team. There's a lot going on in this issue spanning years. When I run the notebook, I see graphs where I expect to see graphs. What problem are we trying to fix exactly?
At one point I tried to convince the TensorBoard team that often users have a tf.function that they already traced to make a graph. And in those cases tf.summary.graph was a much clearer/unambiguous UI than trace_on/trace_off/trace_export. But I did not make any progress there.
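For readers who want to try the tf.summary.graph route, here is a minimal sketch. It assumes a TensorFlow release that exports `tf.summary.graph`, and it writes to a temporary directory chosen only for this example:

```python
import os
import tempfile

import tensorflow as tf


@tf.function
def my_func(x, y):
    # A simple hand-rolled layer.
    return tf.nn.relu(tf.matmul(x, y))


logdir = tempfile.mkdtemp()  # hypothetical log directory for this sketch
writer = tf.summary.create_file_writer(logdir)

# Trace the function explicitly instead of bracketing a call with trace_on/trace_export.
spec = tf.TensorSpec((3, 3), tf.float32)
concrete = my_func.get_concrete_function(spec, spec)

with writer.as_default():
    tf.summary.graph(concrete.graph)
writer.flush()

op_types = [op.type for op in concrete.graph.get_operations()]
event_files = os.listdir(logdir)
print(op_types)
print(event_files)
```

Pointing TensorBoard at `logdir` should then show the graph of the already-traced function, without relying on any global tracing state.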
I'll see if I can find someone to take a look.
@bileschi may be able to assign someone.
When I run the notebook, I see graphs where I expect to see graphs. What problem are we trying to fix exactly?
So can you see the graph only with this part of the Notebook?
https://www.tensorflow.org/tensorboard/graphs#graphs_of_tffunctions
# The function to be traced.
@tf.function
def my_func(x, y):
    # A simple hand-rolled layer.
    return tf.nn.relu(tf.matmul(x, y))

# Set up logging.
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = 'logs/func/%s' % stamp
writer = tf.summary.create_file_writer(logdir)

# Sample data for your function.
x = tf.random.uniform((3, 3))
y = tf.random.uniform((3, 3))

# Bracket the function call with
# tf.summary.trace_on() and tf.summary.trace_export().
tf.summary.trace_on(graph=True, profiler=True)
# Call only one tf.function when tracing.
z = my_func(x, y)
with writer.as_default():
    tf.summary.trace_export(
        name="my_func_trace",
        step=0,
        profiler_outdir=logdir)
I just did a Runtime > "disconnect and delete runtime", then I ran the load_ext, the import, that cell, and the %tensorboard and it gives me a graph.
And in those cases tf.summary.graph was a much clearer/unambiguous UI than trace_on/trace_off/trace_export.
What I see in master: https://github.com/tensorflow/tensorflow/blob/v2.9.1/tensorflow/python/ops/summary_ops_v2.py#L1011-L1032
@tf_export("summary.graph", v1=[])
def graph(graph_data):
  """Writes a TensorFlow graph summary.

  Write an instance of `tf.Graph` or `tf.compat.v1.GraphDef` as summary only
  in an eager mode. Please prefer to use the trace APIs (`tf.summary.trace_on`,
  `tf.summary.trace_off`, and `tf.summary.trace_export`) when using
  `tf.function` which can automatically collect and record graphs from
  executions.

  Usage Example:
  ```py
  writer = tf.summary.create_file_writer("/tmp/mylogs")

  @tf.function
  def f():
    x = constant_op.constant(2)
    y = constant_op.constant(3)
    return x**y

  with writer.as_default():
    tf.summary.graph(f.get_concrete_function().graph)

  # Another example: in a very rare use case, when you are dealing with a TF v1
  # graph.
  graph = tf.Graph()
  with graph.as_default():
    c = tf.constant(30.0)
  with writer.as_default():
    tf.summary.graph(graph)
  ```
  """
Using the official Notebook I still see many 2020 deprecations with TF 2.9:
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/summary_ops_v2.py:1307: start (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
use `tf.profiler.experimental.start` instead.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/summary_ops_v2.py:1358: stop (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
use `tf.profiler.experimental.stop` instead.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/summary_ops_v2.py:1358: save (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
`tf.python.eager.profiler` has deprecated, use `tf.profiler` instead.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/profiler.py:150: maybe_create_event_file (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
`tf.python.eager.profiler` has deprecated, use `tf.profiler` instead.
Also I was not able to get the overview page when profiling a tf.function: https://www.tensorflow.org/guide/profiler#overview_page
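The deprecation messages point at `tf.profiler.experimental.start` and `tf.profiler.experimental.stop`. A minimal sketch of bracketing a tf.function call with those two calls is shown below; the temporary log directory is an assumption of this example, and the thread does not confirm whether this alone makes the overview page appear (the Profile tab may also require the separate TensorBoard profiler plugin to be installed):

```python
import os
import tempfile

import tensorflow as tf


@tf.function
def my_func(x, y):
    # A simple hand-rolled layer.
    return tf.nn.relu(tf.matmul(x, y))


logdir = tempfile.mkdtemp()  # hypothetical log directory for this sketch

x = tf.random.uniform((3, 3))
y = tf.random.uniform((3, 3))

# Bracket the call with the non-deprecated profiler API.
tf.profiler.experimental.start(logdir)
try:
    z = my_func(x, y)
finally:
    tf.profiler.experimental.stop()

print(z.shape)
print(os.listdir(logdir))
```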
@MarkDaoust Adapting instead what it is suggested in TF at https://www.tensorflow.org/guide/intro_to_modules#saving_functions
I don't have 2020 deprecation warnings but the profile isn't available as it seems to work only with Model: https://colab.research.google.com/gist/bhack/e26fecfb7d04c3005f3453e2c924a249/untitled126.ipynb
Hi @bhack,
Tried to reproduce in a colab, and I can see the graph despite the same warning "tf.python.eager.profiler has deprecated, use tf.profiler instead":
Check my previous comment/colab.
Can you see the profile menu and the overview page?
Hi @bhack,
Sounds like the issue is with the profiler, which is out of TensorBoard scope. Can you please file the question here: https://github.com/tensorflow/profiler? Thanks!
Is the profiler still there or in the Tensorflow repo? https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/profiler
Hi @bhack,
As far as I know the profiler is still visualized in the TensorBoard profiler plugin; it's just maintained by a different group of people.
Ok, now we have a ticket at https://github.com/tensorflow/profiler/issues/503. I hope that we don't need to do too much back and forth over the ownership between repos :smile_cat:
Hi @bhack,
I took another look at the profiler API doc, I think the confusion here is that the profiler option in the Graph is not the same as the profile plugin in the menu. I can see that the profiler option is enabled in the Graph plugin in your example:
And I think what you actually wanted to use is the profiler plugin/dashboard, but no relevant data that would activate this plugin is currently logged. Please take a look at the example here: https://colab.sandbox.google.com/github/tensorflow/tensorboard/blob/master/docs/tensorboard_profiling_keras.ipynb
And I think what you actually wanted to use is the profiler plugin/dashboard, but no relevant data that would activate this plugin is currently logged. Please take a look at the example here: https://colab.sandbox.google.com/github/tensorflow/tensorboard/blob/master/docs/tensorboard_profiling_keras.ipynb
As you can see in many posts in this thread, it was always working correctly with a Keras model/callback (as in your posted Colab).
The issue here is that we don't see the TensorFlow profiler plugin/dashboard when profiling a tf.function.
So now, who is the right repository owner for this issue?
I would wait for the profiler folks to take a look and provide some pointers about how to properly log the profiling information without the Keras callback.
A graph may not appear if TensorFlow could not trace any graph during the execution. For instance, the code below does not use @tf.function and, because it is pure eager, there is no graph that TensorFlow used.
foo([1,2], [3,4]) # this is not traced.
tf.summary.trace_on()
foo([1,2], [3,4])
with writer.as_default():
    tf.summary.trace_export()
Other known issues