Video summary support - Githubissues

teamdandelion commented 7 years ago

Migrated from https://github.com/tensorflow/tensorflow/issues/3936

Just wondering if there are any plans to add video summaries to TensorBoard?

@falcondai offered to implement it as a plugin. @falcondai, I think we are almost at the point where we're ready to accept a plugin like that. Give us a few more weeks to clean up the plugin API and get some good examples you can work off of.

falcondai commented 7 years ago

sounds good. i look forward to that 😄

chrisranderson commented 7 years ago

@dandelionmane @falcondai @jart I'd be willing to build it as part of #130, if @falcondai doesn't mind. I think my project should be split into two pieces - the piece that generates the frames, and the summary piece that streams frames to TensorBoard.

falcondai commented 7 years ago

@chrisranderson i originally planned to implement the video summaries as GIFs; I have experience doing that from a related project, https://github.com/tensorflow/models/issues/553. I thought of the alternative of showing the videos as full-blown HTML5-standard videos (i assume that is what you have in mind) but decided it is less desirable. My reasons:

currently, much of the relevant research (video prediction, etc.) focuses on generating very short clips, so a looping GIF should suffice
GIF playback is universally supported
probably little load on the tensorboard server (it only compiles a GIF and sends it)

That said, i can see the merit of full-blown video playback (in video segmentation and long video prediction?), and the playback controls associated with video formats are useful in these cases. I wonder what other people think.

It might be worth implementing both and letting end users choose whichever is more appropriate for their usage. what do you think @dandelionmane?

in any case, help is more than welcome!

teamdandelion commented 7 years ago

I have no objection to having a "gif dashboard" plugin and a "streaming video dashboard" plugin. They do seem to satisfy potentially different use cases, and the more plugins the merrier, imo.

falcondai commented 7 years ago

out of curiosity, how would the tensorboard UI change to accommodate many plugins? i imagine that only showing the ones being used would be great (up to now a few tabs on mine are always empty).

teamdandelion commented 7 years ago

Check out https://github.com/tensorflow/tensorboard/pull/181 from @wchargin! It makes it only display active plugins. We'll still need to change it if people start to have many active plugins at the same time, but we haven't reached that point yet.

In the long term, I've thought about moving the plugin list to the left side, and having it expand/collapse using a hamburger pullout style menu. That way, if the number of plugins is large, you can scroll in the list.

(When the pullout menu is retracted, each plugin would show a small icon representation, so if you remember the icons, you can switch plugins without pulling out the menu, and save horizontal space.)

falcondai commented 7 years ago

sounds good! look forward to these UI changes.

Sorry to digress (further) from the original issue: is mobile support on the feature timeline? i sometimes find myself using TB on my phone on the go. Honestly it works okay as is for inspecting a chart or two, but i think just a little CSS-fu for responsive design would significantly improve it.

teamdandelion commented 7 years ago

Mobile support isn't on the feature timeline, but if you want to take a stab at CSS-fu, we'll be happy to review the pull request :)

miriaford commented 6 years ago

Is the video summary feature still on the table? I think an important use case is reinforcement learning. For example, being able to see a video of the current Cartpole policy would be very helpful. The video is typically at most 1 minute for many Gym environments.

chrisranderson commented 6 years ago

@miriaford You can, as of https://github.com/tensorflow/tensorboard/pull/613. There's an example on the README here of passing in an arrays parameter, which can be an image: https://github.com/chrisranderson/beholder/. There's also stuff available for recording the video.

dabana commented 6 years ago

I would like to add to @miriaford's comment. I also think it would be VERY useful for reinforcement learning (RL), just to check whether the agent is actually learning something that makes sense. As @miriaford says, these little GIFs are really short: in my case (playing VizDoom), at most a couple hundred stacks. Most often they are also low resolution (84x84) and single channel.

@chrisranderson your tool looks awesome, I will definitely try it out some day. But it looks like a little overkill for the "quick and dirty" diagnosis of RL agents. I was wondering: instead of an additional plugin, wouldn't it be more straightforward to just add GIF support to the already existing and excellent Image tab? What would be the main constraints then?

Many thanks to anyone working on this (these) feature(s)!

danijar commented 6 years ago

@chrisranderson This looks nice. Is there a way to write these summaries from within the graph, something like a tf.summary.video(name, tensor_of_frames)?

alexlee-gk commented 6 years ago

I'm not sure why this hasn't been mentioned before, but I just noticed that TensorBoard's image plugin has supported GIFs all along. What's missing is an out-of-the-box way to add GIF summaries with TF. Currently, it's not possible to do that because it lacks a GIF encoder and tf.summary.image can't take in encoded strings (it only takes in image tensors). One option is to encode the tensor with a third-party library, manually construct a protobuf image summary, and then add that to the summary writer. Here is a self-contained example of how to do that:

import tempfile
import moviepy.editor as mpy
import numpy as np
import tensorflow as tf

def convert_tensor_to_gif_summary(summ):
    if isinstance(summ, bytes):
        summary_proto = tf.Summary()
        summary_proto.ParseFromString(summ)
        summ = summary_proto

    summary = tf.Summary()
    for value in summ.value:
        tag = value.tag
        images_arr = tf.make_ndarray(value.tensor)

        if len(images_arr.shape) == 5:
            # concatenate batch dimension horizontally
            images_arr = np.concatenate(list(images_arr), axis=-2)
        if len(images_arr.shape) != 4:
            raise ValueError('Tensors must be 4-D or 5-D for gif summary.')
        if images_arr.shape[-1] != 3:
            raise ValueError('Tensors must have 3 channels.')

        # encode sequence of images into gif string
        clip = mpy.ImageSequenceClip(list(images_arr), fps=4)
        with tempfile.NamedTemporaryFile() as f:
            filename = f.name + '.gif'
        clip.write_gif(filename, verbose=False)
        with open(filename, 'rb') as f:
            encoded_image_string = f.read()

        image = tf.Summary.Image()
        image.height = images_arr.shape[-3]
        image.width = images_arr.shape[-2]
        image.colorspace = 3  # code for 'RGB'
        image.encoded_image_string = encoded_image_string
        summary.value.add(tag=tag, image=image)
    return summary

sess = tf.Session()
summary_writer = tf.summary.FileWriter('logs/image_summary', graph=tf.get_default_graph())

images_shape = (16, 12, 64, 64, 3)  # batch, time, height, width, channels
images = np.random.randint(256, size=images_shape).astype(np.uint8)
images = tf.convert_to_tensor(images)

tensor_summ = tf.summary.tensor_summary('images_gif', images)
tensor_value = sess.run(tensor_summ)
summary_writer.add_summary(convert_tensor_to_gif_summary(tensor_value), 0)

summ = tf.summary.image("images", images[:, 0])  # first time-step only
value = sess.run(summ)
summary_writer.add_summary(value, 0)

summary_writer.flush()
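
The batch handling in the snippet above (the 5-D branch that concatenates along axis=-2) can be checked in isolation. This is a minimal NumPy-only sketch, assuming the same (batch, time, height, width, channels) layout; the helper name tile_batch_horizontally is hypothetical and not part of the original code.

```python
import numpy as np

def tile_batch_horizontally(videos):
    """Turn (batch, time, height, width, channels) into
    (time, height, batch * width, channels)."""
    videos = np.asarray(videos)
    if videos.ndim != 5:
        raise ValueError('expected a 5-D array')
    # list(videos) yields one (time, height, width, channels) clip per batch
    # entry; axis=-2 is the width axis, so the clips end up side by side.
    return np.concatenate(list(videos), axis=-2)

videos = np.random.randint(256, size=(16, 12, 64, 64, 3)).astype(np.uint8)
tiled = tile_batch_horizontally(videos)
print(tiled.shape)  # (12, 64, 1024, 3)
```

Each GIF frame then shows all 16 clips next to each other, which is why image.width is read from shape[-2] after the concatenation.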

ankush-me commented 6 years ago

Thanks @alexlee-gk! I have simplified the interface to match the standard summary ops below. It still needs to be extended to take in a batch of GIFs, which should be straightforward (I hope)!

import tempfile
import moviepy.editor as mpy
import os
import tensorflow as tf
import numpy as np

from tensorflow.python.framework import constant_op 
from tensorflow.python.ops import summary_op_util

def py_encode_gif(im_thwc, tag, fps=4):
  """
  Given a 4D numpy tensor of images, encodes as a gif.
  """
  with tempfile.NamedTemporaryFile() as f: fname = f.name + '.gif'
  clip = mpy.ImageSequenceClip(list(im_thwc), fps=fps)
  clip.write_gif(fname, verbose=False, progress_bar=False)
  with open(fname, 'rb') as f: enc_gif = f.read()
  os.remove(fname)
  # create a tensorflow image summary protobuf:
  thwc = im_thwc.shape
  im_summ = tf.Summary.Image()
  im_summ.height = thwc[1]
  im_summ.width = thwc[2]
  im_summ.colorspace = 3 # fix to 3 == RGB
  im_summ.encoded_image_string = enc_gif
  # create a summary obj:
  summ = tf.Summary()
  summ.value.add(tag=tag, image=im_summ)
  summ_str = summ.SerializeToString()
  return summ_str

def add_gif_summary(name, im_thwc, fps=4, collections=None, family=None):
  """
  IM_THWC: 4D tensor (TxHxWxC) for which GIF is to be generated.
  COLLECTION: collections to which the summary op is to be added.
  """
  if summary_op_util.skip_summary(): return constant_op.constant('')
  with summary_op_util.summary_scope(name, family, values=[im_thwc]) as (tag, scope):
    gif_summ = tf.py_func(py_encode_gif, [im_thwc, tag, fps], tf.string, stateful=False)
    summary_op_util.collect(gif_summ, collections, [tf.GraphKeys.SUMMARIES])
  return gif_summ

alexlee-gk commented 6 years ago

That's neat! For the encoding of the GIF, I'd suggest using ffmpeg directly instead of moviepy, via the encode_gif function I wrote here. It avoids writing to a temporary file by piping the output directly into an encoded string. More importantly, it uses ffmpeg's palette generation, which I have found to work better (in terms of artifacts and time) than the color optimizations available in moviepy.

PHarshali commented 6 years ago

Still waiting for the tensorboard video and very excited. Till then if there is anyone wanting to read about tensorboard can follow this link - https://data-flair.training/blogs/tensorboard-tutorial/

alexlee-gk commented 5 years ago

I also wrote gif summaries for the new summary API. Here are self-contained colabs that use gif summaries with the original summary API and the summary API v2.

Original summary API: https://colab.research.google.com/drive/1vgD2HML7Cea_z5c3kPBcsHUIxaEVDiIc

Summary API v2: https://colab.research.google.com/drive/1CSOrCK8-iQCZfs3CVchLE42C52M_3Sej

alexlee-gk commented 5 years ago

The original ffmpeg command was dropping some frames for gifs that had more than a certain number of frames. This only happened under certain circumstances (e.g. only on some of the machines I tried). The dropping issue is fixed by adding [x]fifo[x] to the filtering part of the ffmpeg command. I have updated the colabs with this fix. Thanks to @kpertsch for pointing this out and figuring out the fix!

Reference: https://superuser.com/questions/1323429/how-to-efficiently-create-a-best-palette-gif-from-a-video-portion-straight-from
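
To make the placement of [x]fifo[x] concrete, here is a sketch of how the ffmpeg arguments could be assembled. The filter string is the one that appears in a later comment in this thread; the helper name build_gif_command is hypothetical, and the snippet only builds the argument list without running ffmpeg.

```python
def build_gif_command(width, height, channels, fps):
    """Return an ffmpeg argv list that reads raw frames on stdin
    and writes an encoded GIF to stdout."""
    pix_fmt = {1: 'gray', 3: 'rgb24'}[channels]
    # split feeds one branch to palettegen and the other to paletteuse;
    # [x]fifo[x] is the addition reported above to stop frames being dropped.
    filter_graph = (
        '[0:v]split[x][z];[z]palettegen[y];[x]fifo[x];[x][y]paletteuse')
    return [
        'ffmpeg', '-y',
        '-f', 'rawvideo', '-vcodec', 'rawvideo',
        '-r', f'{fps:.02f}', '-s', f'{width}x{height}',
        '-pix_fmt', pix_fmt,
        '-i', '-',
        '-filter_complex', filter_graph,
        '-r', f'{fps:.02f}', '-f', 'gif', '-',
    ]

cmd = build_gif_command(width=64, height=48, channels=3, fps=20)
print(' '.join(cmd))
```

Passing the list straight to subprocess.Popen avoids splitting a command string on spaces.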

htung0101 commented 5 years ago

@alexlee-gk Is there a way to use this with TF2, which runs in eager mode? I copy-pasted your code, but nothing shows up. Can we add something like tf.summary at the very end to write the data out?

PeterMitrano commented 5 years ago

The summary api v2 works for me in 1.14 in eager mode

kdbanman commented 4 years ago

This is a simple adaptation of @alexlee-gk's original code, but for TF2.x. Demo colab:

https://colab.research.google.com/drive/1ut0eJJ3pLjJYrgqVfz2QG69JnKA_2wQO

Code

As earlier in the comments, this uses moviepy to encode the gif as bytes, then builds a v1 summary protobuf with it.

import tempfile
import moviepy.editor as mpy
import os
import tensorflow as tf

def encode_gif_bytes(im_thwc, fps=4):
  with tempfile.NamedTemporaryFile() as f: fname = f.name + '.gif'
  clip = mpy.ImageSequenceClip(list(im_thwc), fps=fps)
  clip.write_gif(fname, verbose=False, progress_bar=False)

  with open(fname, 'rb') as f: enc_gif = f.read()
  os.remove(fname)

  return enc_gif

def gif_summary(im_thwc, fps=4):
  """
  Given a 4D numpy tensor of images (TxHxWxC), encode a gif into a Summary protobuf.
  NOTE: Tensor must be in the range [0, 255] as opposed to the usual small float values.
  """
  # create a tensorflow image summary protobuf:
  thwc = im_thwc.shape
  im_summ = tf.compat.v1.Summary.Image()
  im_summ.height = thwc[1]
  im_summ.width = thwc[2]
  im_summ.colorspace = 3 # fix to 3 for RGB
  im_summ.encoded_image_string = encode_gif_bytes(im_thwc, fps)

  # create a serialized summary obj:
  summ = tf.compat.v1.Summary()
  summ.value.add(image=im_summ)
  return summ.SerializeToString()

Usage

The summary object can be used with the new summary writer as follows:

# Tensorboard boilerplate
from tensorflow import summary

import numpy as np
import datetime

current_time = str(datetime.datetime.now().timestamp())
log_dir = 'logs/tensorboard/' + current_time
summary_writer = summary.create_file_writer(log_dir)

# Here I'll make T=48 frames of greyscale noise.  Any tensor of
# shape TxHxWxC with pixel values in range [0, 255] will work.
gif_tensor = np.random.random((48, 100, 100, 1))
gif_tensor = gif_tensor * 255

# Just call gif_summary and pass it to the summary you imported 
gif = gif_summary(gif_tensor, fps=24)

with summary_writer.as_default():
  # Optionally pass step and name
  summary.experimental.write_raw_pb(gif, step=1, name='wow gifs')
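
Since the note in gif_summary asks for values in [0, 255], a small conversion step may help when frames come out of a model as floats in [0, 1]. This is a sketch under that assumption; the helper name to_uint8_frames is hypothetical, and the scaling mirrors the `gif_tensor * 255` line above with clipping and a cast added.

```python
import numpy as np

def to_uint8_frames(frames):
    """Convert float frames in [0, 1] to uint8 in [0, 255].
    Integer input is returned unchanged."""
    frames = np.asarray(frames)
    if np.issubdtype(frames.dtype, np.floating):
        # Scale, clip out-of-range values, then cast to 8-bit integers.
        frames = np.clip(255 * frames, 0, 255).astype(np.uint8)
    return frames

float_frames = np.random.random((48, 100, 100, 1))
uint8_frames = to_uint8_frames(float_frames)
print(uint8_frames.dtype, uint8_frames.shape)
```

The result can be passed to gif_summary in place of gif_tensor.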

danijar commented 4 years ago

I also wrote a cleaned-up TF 2 version of Alex's GIF summary a few days ago. It works for eager tensors, NumPy arrays, and, with the note below, graph tensors inside tf.function:

def video_summary(name, video, step=None, fps=20):
  name = tf.constant(name).numpy().decode('utf-8')
  video = np.array(video)
  if video.dtype in (np.float32, np.float64):
    video = np.clip(255 * video, 0, 255).astype(np.uint8)
  B, T, H, W, C = video.shape
  try:
    frames = video.transpose((1, 2, 0, 3, 4)).reshape((T, H, B * W, C))
    summary = tf.compat.v1.Summary()
    # Each GIF frame has shape (H, B * W, C), so record that size.
    image = tf.compat.v1.Summary.Image(height=H, width=B * W, colorspace=C)
    image.encoded_image_string = encode_gif(frames, fps)
    summary.value.add(tag=name + '/gif', image=image)
    tf.summary.experimental.write_raw_pb(summary.SerializeToString(), step)
  except (IOError, OSError) as e:
    print('GIF summaries require ffmpeg in $PATH.', e)
    frames = video.transpose((0, 2, 1, 3, 4)).reshape((1, B * H, T * W, C))
    tf.summary.image(name + '/grid', frames, step)

def encode_gif(frames, fps):
  from subprocess import Popen, PIPE
  h, w, c = frames[0].shape
  pxfmt = {1: 'gray', 3: 'rgb24'}[c]
  cmd = ' '.join([
      f'ffmpeg -y -f rawvideo -vcodec rawvideo',
      f'-r {fps:.02f} -s {w}x{h} -pix_fmt {pxfmt} -i - -filter_complex',
      f'[0:v]split[x][z];[z]palettegen[y];[x]fifo[x];[x][y]paletteuse',
      f'-r {fps:.02f} -f gif -'])
  proc = Popen(cmd.split(' '), stdin=PIPE, stdout=PIPE, stderr=PIPE)
  for image in frames:
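    # tobytes() replaces the deprecated ndarray.tostring().
    proc.stdin.write(image.tobytes())
  out, err = proc.communicate()
  if proc.returncode:
    # cmd is already a single string, so include it as-is in the message.
    raise IOError('\n'.join([cmd, err.decode('utf8')]))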
  del proc
  return out

If you want to use it inside tf.function, you need to give it access to the summary writer because it may be executed in a different thread for which your summary writer is not set as default:

@tf.function
def foo():
  # ...
  robust_video_summary(writer, 'name', video)
  # ...

def robust_video_summary(writer, name, video):
  step = tf.summary.experimental.get_step()
  def inner(name, video):
    if step is not None:
      tf.summary.experimental.set_step(step)
    with writer.as_default():
      video_summary(name, video)
  return tf.py_function(inner, [name, video], [])

@alextp What do you think about adding something like this to tf.summary? The above solution falls back to a static image that shows all frames left-to-right in case ffmpeg is not available, so it wouldn't add any required dependencies to TF. I'm not sure if that's good enough.
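
The two layouts used in video_summary (GIF frames with the batch side by side, and the fallback grid with time running left to right) can be verified on a tiny array. This is a NumPy-only sketch that reuses the transpose and reshape calls from the code above; the sizes are arbitrary.

```python
import numpy as np

B, T, H, W, C = 2, 3, 4, 5, 1
video = np.arange(B * T * H * W * C, dtype=np.uint8)
video = video.reshape((B, T, H, W, C))

# GIF layout: one frame per time step, batch entries side by side.
frames = video.transpose((1, 2, 0, 3, 4)).reshape((T, H, B * W, C))

# Fallback layout: one image, batch entries as rows, time steps as columns.
grid = video.transpose((0, 2, 1, 3, 4)).reshape((1, B * H, T * W, C))

b, t = 1, 2
# Clip b at time t occupies columns b*W to (b+1)*W of GIF frame t ...
assert np.array_equal(frames[t, :, b * W:(b + 1) * W], video[b, t])
# ... and rows b*H to (b+1)*H, columns t*W to (t+1)*W of the grid.
assert np.array_equal(
    grid[0, b * H:(b + 1) * H, t * W:(t + 1) * W], video[b, t])
print(frames.shape, grid.shape)  # (3, 4, 10, 1) (1, 8, 15, 1)
```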

alextp commented 4 years ago

@nfelt I think we should add this; WDYT?

danijar commented 4 years ago

Any update on this?

wchargin commented 4 years ago

No; we’ll post here if there’s an update.

danijar commented 4 years ago

Okay, thanks! Looking forward to this.

rsandler00 commented 2 years ago

@wchargin @danijar any updates? Would be great to have this integrated!

PawelFaron commented 2 years ago

For reinforcement learning, it would be great to have this and be able to see video from the last evaluation in TensorBoard.

AnimeshSinha1309 commented 1 week ago

I tried encoding to a GIF, following code from tensorboardX: https://github.com/pytorch/pytorch/blob/1152726febfb034b12235fd044345a417a0f1607/torch/utils/tensorboard/summary.py#L652.

However, this uses the raw summary proto from TensorFlow 1 and SummaryWriter functions that are no longer supported in TF2 or in libraries that import it, such as Flax. Is there a new recommended way to do this?

danijar commented 1 week ago

After 8 years of waiting, I've given up on TensorBoard and wrote my own with video support: github.com/danijar/scope

tensorflow / tensorboard

Video summary support #39
