chrisranderson / beholder

A TensorBoard plugin for visualizing arbitrary tensors in a video as your network trains.

Change Python to Python communication to use pretty boss protocol buffers #29

Open chrisranderson opened 7 years ago

jart commented 7 years ago

Consider using protobuf? That's what TensorFlow is standardized on.

chrisranderson commented 7 years ago

Huh. I didn't even consider it. I guess protobuf isn't part of my dev vocab yet. Any other reason besides consistency with TensorFlow? If it's faster/smaller (I'm guessing yes, but I don't know), that could be nice. Since I'm writing stuff to disk so often, I have issues with files being half-written (or something) when they're read. I also wonder if I'm going to have worse problems on machines that don't use an SSD.
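For the half-written-file problem, one common approach (independent of the serialization format) is to write to a temporary file and rename it over the target, so a reader sees either the old file or the new one. Below is a minimal sketch, assuming the writer and reader share a filesystem; `atomic_write_bytes` is a hypothetical helper name, not something in beholder.

```python
import os
import pickle
import tempfile


def atomic_write_bytes(path, data):
    """Write data to path so readers see either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as tmp_file:
            tmp_file.write(data)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        # os.replace swaps the file in one step, overwriting any existing target.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Demo in a throwaway directory; the dict keys mirror the config discussed here.
with tempfile.TemporaryDirectory() as demo_dir:
    config_path = os.path.join(demo_dir, 'config.pkl')
    config = {'window_size': 10, 'FPS': 20, 'is_recording': False}
    atomic_write_bytes(config_path, pickle.dumps(config))
    with open(config_path, 'rb') as config_file:
        print(pickle.load(config_file))
```

The `fsync` call makes each write slower, so whether it is worth keeping depends on how often the file is written.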

jart commented 7 years ago

It's pretty boss, trust me. Google uses it at all layers of the infrastructure. It's the best thing since ASN.1.

chrisranderson commented 7 years ago

Haha. Cool. Well I wouldn't mind learning it, so maybe after things settle down a little bit I'll check it out. Thanks for the recommendation (and for looking through issues or however you found this!).

jart commented 7 years ago

We're subscribed to notifications on your repository, because we want to support your efforts. If you're working hard to make TensorBoard better, then we want to work hard to help you be successful.

chrisranderson commented 7 years ago

@jart, it looks like pickles are faster for what I need. Did I design my experiment poorly? I got:

protobuf time: 0.6017861366271973
pickle time: 0.5259068012237549

protoc --version: libprotoc 2.6.1

Here's config.proto:

syntax = "proto2";

package beholder;

message Config {
  enum Values {
    trainable_variables = 0;
    arrays = 1;
    frames = 2;
  }

  enum Mode {
    current = 0;
    variance = 1;
  }

  enum Scaling {
    layer = 0;
    network = 1;
  }

  required Values values = 1;
  required Mode mode = 2;
  required Scaling scaling = 3;
  required int32 window_size = 4;
  required int32 FPS = 5;
  required bool is_recording = 6;
}
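
On the "smaller" question, every field in this message is a varint on the wire (enums, int32, and bool all use wire type 0), so the encoded size can be estimated by hand. The sketch below is an illustration written from the documented wire format, not the protobuf library itself; `encode_varint` and `encode_varint_field` are hypothetical helpers, and the bytes have not been checked here against `SerializeToString`.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError('this sketch only handles non-negative values')
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_varint_field(field_number, value):
    """Encode one varint field: key (field_number << 3 | wire type 0), then value."""
    return encode_varint(field_number << 3) + encode_varint(int(value))


# Field numbers come from config.proto; values match those set in the experiment.
fields = [
    (1, 0),      # values = trainable_variables
    (2, 0),      # mode = current
    (3, 1),      # scaling = network
    (4, 10),     # window_size
    (5, 20),     # FPS
    (6, False),  # is_recording
]
payload = b''.join(encode_varint_field(number, value) for number, value in fields)
print(len(payload), payload.hex())  # 12 080010001801200a28143000
```

Each field costs two bytes here (one key byte, one value byte), which is far less than a pickled dict that also stores the key strings.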

And my experiment:

import pickle
import time

import config_pb2

#######################################

protobuf_start = time.time()

for i in range(10000):
  write_config = config_pb2.Config()

  with open('config.protobuf', 'wb') as file:
    write_config.values = write_config.trainable_variables
    write_config.mode = write_config.current
    write_config.scaling = write_config.network
    write_config.window_size = 10
    write_config.FPS = 20
    write_config.is_recording = False
    file.write(write_config.SerializeToString())

  read_config = config_pb2.Config()
  with open('config.protobuf', 'rb') as file:
    # ParseFromString fills read_config in place; its return value is not the message.
    read_config.ParseFromString(file.read())

protobuf_end = time.time()

#######################################

pickle_start = time.time()

for i in range(10000):
  write_config = {
    'values': 'trainable_variables',
    'mode': 'current',
    'scaling': 'network',
    'window_size': 10,
    'FPS': 20,
    'is_recording': False,
  }

  with open('config.pkl', 'wb') as file:
    pickle.dump(write_config, file)

  with open('config.pkl', 'rb') as file:
    x = pickle.load(file)

pickle_end = time.time()

#######################################

print('protobuf time:', protobuf_end - protobuf_start)
print('pickle time:', pickle_end - pickle_start)

# protobuf time: 0.6017861366271973
# pickle time: 0.5259068012237549
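
One possible factor, not confirmed in this thread: both loops open, write, and read a file on every iteration, so filesystem overhead may account for much of each total and hide the difference between the serializers. Here is a minimal sketch that times the pickle round trip in memory and via disk separately; the same harness could wrap `SerializeToString` and `ParseFromString`, which are left out so the block runs without the generated `config_pb2` module. The function names are hypothetical.

```python
import os
import pickle
import tempfile
import timeit

config = {
    'values': 'trainable_variables',
    'mode': 'current',
    'scaling': 'network',
    'window_size': 10,
    'FPS': 20,
    'is_recording': False,
}


def roundtrip_memory():
    """Serialize and deserialize without touching the filesystem."""
    return pickle.loads(pickle.dumps(config))


def roundtrip_disk(path):
    """Serialize to a file, then read it back, as the experiment does."""
    with open(path, 'wb') as config_file:
        pickle.dump(config, config_file)
    with open(path, 'rb') as config_file:
        return pickle.load(config_file)


ITERATIONS = 10000  # same count as the experiment

with tempfile.TemporaryDirectory() as tmp_dir:
    pickle_path = os.path.join(tmp_dir, 'config.pkl')
    memory_time = timeit.timeit(roundtrip_memory, number=ITERATIONS)
    disk_time = timeit.timeit(lambda: roundtrip_disk(pickle_path), number=ITERATIONS)

print('pickle in memory:', memory_time)
print('pickle via disk:', disk_time)
print('pickled size in bytes:', len(pickle.dumps(config)))
```

If the in-memory time is a small fraction of the disk time, the comparison is mostly measuring file I/O rather than the serialization format.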