tensorflow / tensorboard

TensorFlow's Visualization Toolkit

Consider re-exporting FileWriter and merge_all from `tensorboard.summary` #569

Closed wchargin closed 5 years ago

wchargin commented 7 years ago

Suggestion to build on top of #561.

Here is a sample TensorBoard program today, on TF==1.3.0 and TB==0.1.7:

import tensorflow as tf
from tensorboard import summary

def main():
  normals = tf.random_normal(shape=[10, 10])
  summary.scalar('mean', tf.reduce_mean(normals))
  summary.histogram('dist', normals)
  summ = tf.summary.merge_all()  # <--

  with tf.Session() as sess:
    writer = tf.summary.FileWriter('/tmp/new_api_demo')  # <--
    writer.add_graph(sess.graph)
    for step in xrange(10):
      writer.add_summary(sess.run(summ), global_step=step)

if __name__ == '__main__':
  main()

On the marked lines, we must use tf.summary instead of tensorboard.summary. This isn't too ugly, but I can imagine it being confusing for beginners. It's easy to explain the difference between tf.summary and tensorboard.summary, but ideally we shouldn't have to.

If we export

FileWriter = tf.summary.FileWriter
merge = tf.summary.merge
merge_all = tf.summary.merge_all

in tensorboard.summary, then users could use the tensorboard.summary module as their one-stop shop for all things summary-related, which seems desirable.
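
For illustration, with those re-exports in place the two marked lines in the example above could come from tensorboard.summary as well (assuming the re-exports land exactly as written above):

  summ = summary.merge_all()                        # instead of tf.summary.merge_all()
  writer = summary.FileWriter('/tmp/new_api_demo')  # instead of tf.summary.FileWriter(...)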

I tentatively suggest that we keep the Summary and Event protos in tf.summary itself, but I could easily be persuaded otherwise.

Open to input: @dandelionmane @jart @chihuahua

teamdandelion commented 7 years ago

I'm wholly in support of this. We need the FileWriter to write out the protos produced by scalar_pb and the other pb endpoints, so TensorBoard can't really be used without the TF API until we export these symbols.

Another thing this gives us is much more flexibility for adding new features. E.g. I think we should add additional run-level metadata, like run descriptions and hyperparameters. To write this metadata, we will probably need to reify an explicit "Run" construct which might need a special run-aware FileWriter, or we might add some APIs to the FileWriter to support Runs. It's easier to do that if the APIs are sourced from our codebase rather than the TensorFlow codebase.
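
To make that concrete, here is a purely hypothetical sketch of what a run-aware writer could look like; the class name, constructor arguments, and run-metadata handling are all invented for illustration, not an existing or planned API:

import tensorflow as tf

class RunAwareFileWriter(object):
  """Hypothetical writer that attaches run-level metadata to a log directory."""

  def __init__(self, logdir, description=None, hparams=None):
    # Delegate event writing to the existing FileWriter.
    self._writer = tf.summary.FileWriter(logdir)
    # Run-level metadata (description, hyperparameters) would be recorded
    # once per run, alongside the event file.
    self._run_metadata = {'description': description, 'hparams': hparams or {}}

  def add_summary(self, summary_pb, global_step=None):
    self._writer.add_summary(summary_pb, global_step=global_step)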

I think we probably should have Summary and Event exposed by the TB API as well, but while we're redesigning the backend, we might want to leave ourselves the option of deciding later what to export, since it's easy to add things to the API and hard to take them out. (What if we wanted to define our own Summary and Event protos that extend or diverge from the TF ones? Would that even be feasible?)

wchargin commented 7 years ago

TensorBoard can't really be used w/o the TF api until we export these symbols.

Good point.

I think we probably should have Summary and Event exposed by the TB API as well […]

Absolutely; this is the right call.

(What if we wanted to define our own Summary and Event protos which extend or diverge from the TF ones? Would that even be feasible?)

It probably would be feasible. Extending would be easy. If we wanted to diverge, we could add a function translate_event to our existing data_compat module.
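
A rough sketch of what such a translation hook could look like, assuming a hypothetical TensorBoard-owned event proto (tb_event_pb2 and translate_summary below are placeholders, not real modules or functions):

def translate_event(tf_event):
  """Hypothetical: convert a TF Event proto into a TB-owned Event proto."""
  tb_event = tb_event_pb2.Event()  # tb_event_pb2 stands in for a diverged proto definition
  tb_event.wall_time = tf_event.wall_time
  tb_event.step = tf_event.step
  if tf_event.HasField('summary'):
    tb_event.summary.CopyFrom(translate_summary(tf_event.summary))  # also hypothetical
  return tb_event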

(Still happy to hear input from others; if there’s no dissent, I’ll implement this when I have some time.)

chihuahua commented 7 years ago

That sounds exciting. I support this as well, for those reasons.

Here's the FileWriter, which depends on pywrap_tensorflow.EventsWriter: https://github.com/tensorflow/tensorflow/blob/18f36927160d05b941c056f10dc7f9aecaa05e23/tensorflow/python/summary/writer/writer.py

We can either keep pywrap_tensorflow.EventsWriter within TensorFlow but expose it, or move it (and its C++ code) into TensorBoard. We could also just expose tf.EventFileWriter.

wchargin commented 7 years ago

What are the advantages of exposing pywrap_tensorflow.EventsWriter or tf.EventFileWriter as opposed to just using either

FileWriter = tf.summary.FileWriter

or

class FileWriter(tf.summary.FileWriter):
  pass

?

See also https://github.com/tensorflow/tensorboard/pull/192/files#diff-f99798d446c4151e6968aa8d58ed9dd8R23.

teamdandelion commented 7 years ago

The approach I took in the linked PR is nice, I think, in that it ensures API documentation is available in this repository (someone using TB without TF should not need to go to the tf project to get documentation on TB's apis).

It also gives us good flexibility if we want to extend / modify behavior later.
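
Roughly the shape of that approach (a sketch of the idea, not the exact code in the linked PR):

import tensorflow as tf

class FileWriter(tf.summary.FileWriter):
  """Writes Summary protocol buffers to event files.

  Re-exported here so that the docstring, and any future run-aware
  extensions, live in the TensorBoard repository rather than in TensorFlow.
  """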

chihuahua commented 7 years ago

@jart - curious about your thoughts on FileWriter.

jart commented 7 years ago

I wouldn't be opposed to doing this.

We haven't set up a documentation website yet, but it would be nice to know whether the documentation tooling would be smart enough to pull in the docstrings from TensorFlow's codebase and pretend that the class is defined in this codebase, or whether we'd have to write a shell method whose docstring basically gives readers a link to https://www.tensorflow.org/api_docs/python/tf/summary/FileWriter to learn more.

Also note that we're probably going to have a TensorBaseWriter or something similar in the future, to send the events straight to the DB. That will probably go in this tensorboard.summary module.

wchargin commented 5 years ago

No longer needed as of TF 2.0.