cathywu / rllab

rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.

Note: Understanding explained variance #11

Open cathywu opened 7 years ago

cathywu commented 7 years ago

Summary: Explained variance (EV) measures the fraction of the variance in the returns that is explained by the baseline. If there is little variance in the sampled returns (this can happen in later phases of training, for example), then EV = 1 indicates a constant baseline, whereas EV = 0 indicates a non-constant baseline.
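As a small sketch of the definition above (computed directly with NumPy on toy numbers, rather than the rllab helper), EV = 1 - Var(y - ypred) / Var(y):

```python
import numpy as np

# Sampled returns (toy numbers, for illustration only).
y = np.array([1.0, 2.0, 3.0, 4.0])

# A perfect baseline explains all the variance: EV = 1.
ypred_perfect = y.copy()
ev_perfect = 1 - np.var(y - ypred_perfect) / np.var(y)

# A constant baseline (the mean) explains none of it: EV = 0.
ypred_mean = np.full_like(y, y.mean())
ev_mean = 1 - np.var(y - ypred_mean) / np.var(y)

print(ev_perfect, ev_mean)  # 1.0 0.0
```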

We illustrate explained variance using the odd-looking curves generated by a 1-step MDP.

OneStepNoStateEnv, k=6, no whitening. (Figure: explained-variance plot, 2017-04-29-onestepnostateenv-nowhitening-k6-explainedvariance)

EV code (with slight modifications) from rllab/misc/special.py:

import numpy as np

def explained_variance_1d(ypred, y, epsilon=1e-8):
    # EV = 1 - Var(y - ypred) / Var(y), with special handling of constant y.
    assert y.ndim == 1 and ypred.ndim == 1
    vary = np.var(y)
    if np.isclose(vary, 0):
        # TODO(cathywu) why this distinction?
        # Degenerate case: the returns are (near-)constant, so the ratio
        # below is ill-defined. A varying baseline gets EV = 0; a constant
        # baseline matches the constant target trivially and gets EV = 1.
        if np.var(ypred) > 0:
            return 0
        else:
            return 1
    return 1 - np.var(y - ypred) / (vary + epsilon)
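A quick sanity check of both branches (the function is repeated here so the snippet runs standalone; the inputs are made-up examples):

```python
import numpy as np

def explained_variance_1d(ypred, y, epsilon=1e-8):
    # Same logic as the rllab helper above.
    assert y.ndim == 1 and ypred.ndim == 1
    vary = np.var(y)
    if np.isclose(vary, 0):
        return 0 if np.var(ypred) > 0 else 1
    return 1 - np.var(y - ypred) / (vary + epsilon)

# Degenerate branch: the sampled returns are constant.
y_const = np.ones(5)
print(explained_variance_1d(np.ones(5), y_const))      # 1: constant baseline
print(explained_variance_1d(np.arange(5.0), y_const))  # 0: varying baseline

# Ordinary branch: a baseline that tracks the returns closely.
y = np.array([1.0, 2.0, 3.0])
print(explained_variance_1d(1.1 * y, y))  # ~0.99
```

Note that the degenerate branch returns exactly 0 or 1 and ignores how close the baseline is to the constant target, which is why the EV curves can jump discontinuously when the return variance collapses.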

Understanding the (odd) curves above: