cerndb / dist-keras

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
http://joerihermans.com/work/distributed-keras/

How can I visualise performance change over time? #33

Closed: lnicalo closed this issue 7 years ago

lnicalo commented 7 years ago

I do not see any way to show the performance per batch / epoch. I would like to use the callback functions that are available in Keras. Is that possible with the current version?
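For example, in plain Keras one can record the loss after every batch with something along these lines (a minimal sketch using the standard Callback API):

from keras.callbacks import Callback

class BatchLossLogger(Callback):
    # Minimal sketch: record the loss after every batch.
    def on_train_begin(self, logs=None):
        self.losses = []

    def on_batch_end(self, batch, logs=None):
        logs = logs or {}
        self.losses.append(logs.get('loss'))

# Passed to a regular Keras model, e.g.:
# model.fit(x_train, y_train, callbacks=[BatchLossLogger()])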

JoeriHermans commented 7 years ago

Hi @lnicalo,

In this package you don't use Keras callbacks, since it relies on a custom parameter server to monitor the central variable. Furthermore, the parameter server doesn't hold a Keras model; it only keeps track of the parameters (a NumPy array). However, you can obtain the training accuracy of the central variable over time with the following code.
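For reference, each entry returned by optimizer.get_history() is a dictionary holding a timestamp (in seconds) and a 'history' list containing the loss and the accuracy of the central variable; the binning code below relies on this structure (illustrative values):

# A single history entry (illustrative values only):
# {'timestamp': 1496312300.5, 'history': [0.42, 0.87]}  # [loss, accuracy]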

import numpy as np
import matplotlib.pyplot as pl


def compute_plot_metrics(history):
    # Width (in seconds) of the sliding window used to bin the metrics.
    bin_delta = 5.0
    # Find the earliest timestamp.
    origin_time = float("inf")
    for h in history:
        t = h['timestamp']
        if t < origin_time:
            origin_time = t
    # Normalize wrt origin time.
    for h in history:
        h['timestamp'] -= origin_time
    # Compute the maximum (normalized) time.
    max_time = float("-inf")
    for h in history:
        t = h['timestamp']
        if t > max_time:
            max_time = t
    # Compute the binned data. A window starts at every second and spans
    # bin_delta seconds, so consecutive windows overlap, which smooths the curve.
    x = []
    y = []
    error = []
    for i in range(0, int(max_time + 1)):
        start = float(i)
        d = [h for h in history if start <= h['timestamp'] < start + bin_delta]
        if len(d) > 0:
            x.append(i)
            # The loss statistics are computed as well, in case you
            # prefer to plot the loss instead of the accuracy.
            avg_loss = average_loss(d)
            std_l = std_loss(d)
            y.append(average_accuracy(d))
            error.append(std_accuracy(d))
    # Convert the lists to Numpy arrays.
    x = np.asarray(x)
    y = np.asarray(y)
    error = np.asarray(error)

    return x, y, error

def average_loss(x):
    # Mean loss over a list of history entries.
    return np.mean([h['history'][0] for h in x])


def average_accuracy(x):
    # Mean accuracy over a list of history entries.
    return np.mean([h['history'][1] for h in x])


def std_accuracy(x):
    # Standard deviation of the accuracy over a list of history entries.
    return np.std([h['history'][1] for h in x])


def std_loss(x):
    # Standard deviation of the loss over a list of history entries.
    return np.std([h['history'][0] for h in x])

from distkeras.trainers import ADAG

# Assume the following optimizer (can be different).
optimizer = ADAG(keras_model=model, worker_optimizer='adam', loss='categorical_crossentropy',
                 num_workers=num_workers, batch_size=128,
                 communication_window=communication_frequency, num_epoch=40,
                 features_col="features_normalized_dense", label_col="label_encoded")
# Train the model, and collect the training history afterwards.
trained_model = optimizer.train(training_set)
history = optimizer.get_history()
x, y, error = compute_plot_metrics(history)
# Do the plot.
title = "Optimizer Training Accuracy\n"
handles = []
p, = pl.plot(x, y, label='Your Optimizer')
pl.fill_between(x, y - error, y + error, alpha=0.5)
handles.append(p)
fig = pl.gcf()
fig.set_dpi(200)
pl.grid(True)
pl.xlim([-.1, 1200])
pl.ylim([.7, 1])
pl.xlabel("Seconds")
pl.ylabel("Training Accuracy")
pl.title(title)
pl.legend(handles=handles)
pl.show()
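
If you want to sanity-check the binning logic without running a full training job, you can first feed compute_plot_metrics some synthetic history entries (illustrative values only, not real training output):

import random
import time

# Minimal sketch: two minutes of fake history entries, one per second.
now = time.time()
fake_history = [{'timestamp': now + t,
                 'history': [1.0 / (1.0 + t),  # decaying fake loss
                             min(1.0, 0.5 + 0.01 * t) + random.uniform(-0.02, 0.02)]}  # rising fake accuracy
                for t in range(120)]
x, y, error = compute_plot_metrics(fake_history)
print(x[:5], y[:5], error[:5])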

EDIT: Added the missing utility methods.

After this, the resulting plot should look something like this:

[Plot: training accuracy over time, AGN / ADAG]

I hope this helps.

Joeri

lnicalo commented 7 years ago

Thank you for your quick answer. I see you are using a set of functions such as average_loss:

avg_loss = average_loss(d)
avg_accuracy = average_accuracy(d)
std_a = std_accuracy(d)
std_l = std_loss(d) 

Where can I import these functions from?

JoeriHermans commented 7 years ago

Hi @lnicalo

Sorry, I forgot to include the utility methods; I've added them to the code above.

Joeri