tensorflow / tensorboard

TensorFlow's Visualization Toolkit
Apache License 2.0

Discretization and smoothing #6829

Open ivanstepanovftw opened 3 months ago

ivanstepanovftw commented 3 months ago

When a lot of steps are recorded, TensorBoard uses discretization as an optimization. This is an unexpected loss of information, but it's fine. However, when EMA smoothing is used, differently discretized time series produce different smoothing results.
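A minimal sketch of the effect, under stated assumptions: the `ema` helper is a plain per-sample EMA standing in for the smoothing (it is not TensorBoard's code), the step-shaped signal is made up, and keeping every 10th point is only a stand-in for whatever discretization TensorBoard actually applies.

```python
import numpy as np

def ema(values, alpha):
    """Plain per-sample EMA; alpha is the weight given to each new sample."""
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return np.array(out)

# Hypothetical signal: jumps from 0 to 1 at step 500, logged for 1000 steps.
steps = np.arange(1000)
values = (steps >= 500).astype(float)

# Keep every 10th point to imitate a series that was discretized for display.
stride = 10
steps_ds, values_ds = steps[::stride], values[::stride]

alpha = 0.05
full = ema(values, alpha)
down = ema(values_ds, alpha)

# Read both smoothed curves at the same step (600).
full_at_600 = full[steps == 600][0]
down_at_600 = down[steps_ds == 600][0]
print(f"full resolution: {full_at_600:.3f}, discretized: {down_at_600:.3f}")
```

The same `alpha` is applied per sample, so the series with fewer samples per unit of time reacts more slowly in terms of steps, and the two curves disagree at the same step.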

screenshot of tensorboard showing irregular smoothing for different scalar time series

Related #5870.

arcra commented 3 months ago

Thank you for the feedback, Ivan.

I'd like to ask for a few clarifications:

ivanstepanovftw commented 3 months ago
  • When you refer to "EMA smoothing", are you referring to the existing smoothing feature in TB? Or a different one?

Existing smoothing.

  • If you're referring to a different feature, is the screenshot you provided a screenshot of the current behavior, or the desired behavior? How did you produce that screenshot if it's the desired behavior?

It is current behavior.

  • Can you elaborate on what the issue is? Are you saying that results are not good (as in, not representative of the real data)? Is it that smoothing is not useful/trustworthy because it produces variable results? Or else, can you describe what the issue is more explicitly?

When you are comparing a lot of different runs, their lines overlap each other. What you can do is zoom in or use smoothing. When you use smoothing, you expect it to work similarly for each run. But runs differ in terms of their time length, because some runs are not yet finished, while other runs have finished (because of early stopping criteria, etc.).

This issue can be solved by either a. removing discretization for long runs, which is simpler but probably affects performance, or b. using a time-weighted EMA instead of the current smoothing implementation in TB, which I guess is an EMA.
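For option b, one common way to write a time-weighted EMA is sketched below. The symbols are assumptions made for illustration: \lambda is a decay rate per unit of time, t_i is the time (or step) of sample x_i, and s_i is the smoothed value. This is not taken from TensorBoard's source.

```latex
\begin{aligned}
\alpha_i &= 1 - e^{-\lambda\,\Delta t_i}, \qquad \Delta t_i = t_i - t_{i-1},\\
s_i &= \alpha_i\, x_i + (1 - \alpha_i)\, s_{i-1},\\
(1-\alpha_i)(1-\alpha_{i+1}) &= e^{-\lambda(\Delta t_i + \Delta t_{i+1})}.
\end{aligned}
```

The last line is the reason this helps with discretization: the weight that the older history keeps after two short gaps equals the weight it keeps after one gap of the combined length, so dropping an intermediate point does not change how fast old values are forgotten.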

ivanstepanovftw commented 3 months ago

Here is a simple implementation of a time-weighted EMA, compared to a plain EMA:

import numpy as np
import matplotlib.pyplot as plt

# Example dataset
times = np.array([0, 1, 2, 2.001, 2.002, 2.003, 2.004, 2.005, 2.006, 3, 4])
values = np.array([1, 2, 3, 3, 3, 3, 3, 3, 3, 4, 5])

# Function to calculate traditional EMA
def calculate_ema(values, alpha):
    ema = [values[0]]
    for i in range(1, len(values)):
        ema.append(alpha * values[i] + (1 - alpha) * ema[-1])
    return np.array(ema)

# Function to calculate time-weighted EMA
def calculate_time_weighted_ema(times, values, alpha):
    ema = [values[0]]
    for i in range(1, len(values)):
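        dt = times[i] - times[i - 1]
        # Here alpha acts as a decay rate per unit time: as dt -> 0 the new
        # sample gets almost no weight, and a larger gap gives it more weight.
        alpha_adjusted = 1 - np.exp(-alpha * dt)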
        ema.append(alpha_adjusted * values[i] + (1 - alpha_adjusted) * ema[-1])
    return np.array(ema)

# Base smoothing factor
alpha = 0.1

# Calculate EMAs
ema_basic = calculate_ema(values, alpha)
ema_time_weighted = calculate_time_weighted_ema(times, values, alpha)

# Plot results
plt.figure(figsize=(10, 6))
plt.plot(times, values, 'o-', label='Original data')
plt.plot(times, ema_basic, 's-', label='Basic EMA')
plt.plot(times, ema_time_weighted, 'x-', label='Time-weighted EMA')
plt.title('Comparison of Basic EMA and Time-weighted EMA')
plt.xlabel('Time')
plt.ylabel('Value')
plt.legend()
plt.grid(True)
plt.show()

image: plot comparing the original data, the basic EMA, and the time-weighted EMA
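As a rough check of how the two functions above behave under discretization, the sketch below smooths one series at full resolution and again after keeping every 10th point, then measures how far the two smoothed curves drift apart. The decaying curve, the stride of 10, and the names `gap_plain` and `gap_tw` are assumptions chosen for illustration; TensorBoard may select points differently.

```python
import numpy as np

def calculate_ema(values, alpha):
    # Same per-sample EMA as in the script above.
    ema = [values[0]]
    for i in range(1, len(values)):
        ema.append(alpha * values[i] + (1 - alpha) * ema[-1])
    return np.array(ema)

def calculate_time_weighted_ema(times, values, alpha):
    # Same time-weighted EMA as in the script above.
    ema = [values[0]]
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        alpha_adjusted = 1 - np.exp(-alpha * dt)
        ema.append(alpha_adjusted * values[i] + (1 - alpha_adjusted) * ema[-1])
    return np.array(ema)

# Hypothetical loss-like curve over 1000 steps.
times = np.arange(1000, dtype=float)
values = np.exp(-times / 300.0)

# Discretized copy: every 10th point only.
stride = 10
times_ds, values_ds = times[::stride], values[::stride]

alpha = 0.05

# Largest disagreement between the full-resolution and discretized curves,
# measured at the steps both series share.
gap_plain = np.max(np.abs(
    calculate_ema(values, alpha)[::stride] - calculate_ema(values_ds, alpha)
))
gap_tw = np.max(np.abs(
    calculate_time_weighted_ema(times, values, alpha)[::stride]
    - calculate_time_weighted_ema(times_ds, values_ds, alpha)
))
print(f"plain EMA gap: {gap_plain:.4f}, time-weighted EMA gap: {gap_tw:.4f}")
```

On this smooth curve the time-weighted version should stay close to its full-resolution result, while the plain EMA shifts noticeably. The match is approximate rather than exact, because the discretized series still lacks the values between the kept points.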