tensorflow / tensorboard

TensorFlow's Visualization Toolkit
Apache License 2.0

Agent - Reinforcement Learning plugin #1659

Open andrewschreiber opened 5 years ago

andrewschreiber commented 5 years ago

Hello! A friend and I prototyped a Tensorboard plugin called Agent for visualizing deep reinforcement learning algorithms. Agent is focused on the time-step level - enabling you to step frame-by-frame through an episode with supporting visualizations.

Chris Anderson of Beholder recommended I post here for advice because the Tensorboard team are “super nice folks in my experience and happy to help.” 😄

[Demo GIF]

Code/details: https://github.com/andrewschreiber/agent

Agent received constructive feedback from a few Deep RL researchers on Agent’s usefulness for interpretability/explainability work and I was given a grant to work full-time on developing v1 over the next three months. I’m really curious for your feedback on the project, whether it fits in the Tensorboard vision, and ideas for improvements.

A few additional questions:

  1. Currently Agent writes metadata to JSON files. Should I move to protobufs?
  2. Is there an upper bound on how much disk space a plugin can/should use within the plugin folder? Users of Agent may save gigabytes of rollout images and model files, which in theory I could write to a .agentlogs root directory folder.
  3. Agent moves images from the python runtime to javascript runtime by encoding the Numpy array as a PNG, converting to base64, and inserting the string into a JSON blob. The JSON is received on a route and the image string is deserialized and passed into an element. I think encoding as binaries would be more efficient, though it seems the internal web server doesn’t support binaries. Would you recommend I implement that or take a different approach?
  4. One requested feature is user-provided visualizations as cards. This seems tricky and I’m unsure if it’s possible. Is it feasible (given Bazel) to load a new python file at runtime? How might additional dependencies be defined and installed? My sketch architecture is to define a superclass with mandatory overridable methods like visualizationImage(input, timestep, model, ops). One could make a custom visualization by subclassing in a foo_vis.py file, adding the filepath to a section on the sidebar, and seeing the new visualization as another cell. Agent would handle importing the file, passing the relevant parameters, and rendering the image output.
  5. Should I move Agent to its own repo instead of living as a fork?
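
For question 4, loading a user-supplied file at runtime could look roughly like the sketch below, which relies only on the standard library's `importlib`. Everything here is an assumption for illustration: the `Visualization` base class, the `load_visualizations` helper, and the choice to inject the base class into the user's module are hypothetical and not part of Agent or TensorBoard. The sketch does not address Bazel packaging or installing extra dependencies.

```python
import importlib.util
import inspect
from pathlib import Path


class Visualization:
    """Hypothetical base class; subclasses must override visualizationImage."""

    def visualizationImage(self, input, timestep, model, ops):
        raise NotImplementedError


def load_visualizations(filepath):
    """Import a Python file by path and instantiate each Visualization subclass in it."""
    path = Path(filepath)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    # Assumption: expose the base class as a global so the user's file
    # can subclass it without knowing where it is defined.
    module.Visualization = Visualization
    spec.loader.exec_module(module)
    return [
        cls()
        for _, cls in inspect.getmembers(module, inspect.isclass)
        if issubclass(cls, Visualization) and cls is not Visualization
    ]
```

A file such as foo_vis.py would then only need to define a subclass that overrides `visualizationImage`, and the returned instances could each be rendered as a card. Note that executing an arbitrary file this way runs the user's code with the server's permissions.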

stephanwlee commented 5 years ago

Hi @andrewschreiber, thanks for sharing the demo and preview of your work! It looks amazing!

Our team will discuss some of these questions in detail and will get back to you promptly.

In the meantime, off the top of my head:

  2. TensorBoard does not impose a limit, but the machine/VM/environment a user is running in may. For instance, we cannot always assume a disk-full environment. How much disk did you use for the live demo? What order of magnitude do you expect?
  3. I believe some browsers limit the length of a base64-serialized string. @wchargin would know more about that. Setting the length limitation aside, although I did not measure it, I would also guess that PNGs gzip better than base64-serialized strings. I will look into it more, but I cannot imagine why werkzeug would not support binary responses.

4-5: Our team is contemplating what externally contributed plugins should look like. More specifically, we would like to have dynamically loaded plugins, but the details are not hashed out yet. Depending on the design, I imagine the answers to 4-5 would change.
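
To make the base64 question concrete, here is a minimal sketch comparing the two transport options using only the standard library. It is not Agent's or TensorBoard's code: `as_json_payload` and `png_app` are hypothetical names, the random 4096-byte payload is only a stand-in for an encoded PNG frame, and the plain WSGI callable stands in for whatever route handler the plugin would actually use (werkzeug is a WSGI library, and WSGI response bodies are bytes).

```python
import base64
import json
import os

# Stand-in for an encoded PNG frame. PNG data is already compressed, so
# random bytes are a rough proxy. The 4096-byte size is an assumption.
png_bytes = os.urandom(4096)


def as_json_payload(data):
    """Approach described in the thread: base64 text inside a JSON blob."""
    encoded = base64.b64encode(data).decode("ascii")
    return json.dumps({"image": encoded}).encode("utf-8")


def png_app(environ, start_response):
    """Plain WSGI app that returns the raw bytes with an image content type."""
    start_response("200 OK", [
        ("Content-Type", "image/png"),
        ("Content-Length", str(len(png_bytes))),
    ])
    return [png_bytes]


json_body = as_json_payload(png_bytes)
binary_body = b"".join(png_app({}, lambda status, headers: None))

# base64 emits 4 characters for every 3 input bytes, so the JSON body is
# roughly a third larger than the binary body before any compression.
print(len(json_body), len(binary_body))
```

On the browser side, a binary route like this can be used directly as an image source, which avoids decoding a long string in JavaScript.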

andrewschreiber commented 5 years ago

Thank you!

On 2: Okay, if Tensorboard doesn't have a limit, I think it'll be okay then. It turns out each rendered frame is 4 KB in Atari when saved as an individual PNG. I think with optimization (e.g. saving an entire episode as a compressed .npz file), the size could be further reduced. The demo used 75 MB of disk for 4 episodes.

On 3: Let me know what you find on binary responses. I may have missed something.

On 4-5: Understood, staying tuned.
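
The .npz idea could be sketched as follows with NumPy's `savez_compressed`. The episode here is synthetic (100 frames at the Atari resolution of 210x160, with a mostly static background), so the resulting size says nothing about real rollouts or about how this compares with per-frame PNGs. The array name `frames` and the in-memory buffer are assumptions for illustration.

```python
import io

import numpy as np

# Hypothetical episode: 100 RGB frames, black background with one small
# region that changes every frame.
rng = np.random.default_rng(0)
frames = np.zeros((100, 210, 160, 3), dtype=np.uint8)
frames[:, 100:110, 70:80] = rng.integers(0, 256, size=(100, 10, 10, 3)).astype(np.uint8)

# Write the whole episode as one compressed archive. A BytesIO buffer
# stands in for a file on disk.
buffer = io.BytesIO()
np.savez_compressed(buffer, frames=frames)
compressed_size = buffer.getbuffer().nbytes

# Read it back and index a single time step.
buffer.seek(0)
restored = np.load(buffer)["frames"]
print(frames.nbytes, compressed_size, restored[42].shape)
```

One trade-off of a single archive per episode is that reading any frame means decompressing the whole array, which may matter when stepping frame-by-frame.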