Closed danjenson closed 5 months ago
Currently, the book states that we can estimate the generalization gap with Equation 5.84:
\mathbb{E}_{p^*}[R(f^*_N)-R(f^*)]\approx\mathbb{E}_{p_\text{tr}}[\ell(\mathbf{y},f^*_N(\mathbf{x}))]-\mathbb{E}_{p_\text{te}}[\ell(\mathbf{y},f^*_N(\mathbf{x}))]
However, I would expect the test loss to be higher than the training loss, which would make the RHS negative. Should the terms on the RHS be swapped?
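A quick numerical check of the sign argument, as a minimal sketch rather than anything taken from the book: the noisy-sine data, the degree-9 polynomial fit, and all variable names here are assumptions chosen only to produce an overfit model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(x):
    # Hypothetical data-generating process: noisy sine curve.
    return np.sin(3 * x) + 0.3 * rng.standard_normal(x.shape)

# Small training set, larger held-out test set.
x_train = np.linspace(-1, 1, 10)
y_train = sample(x_train)
x_test = rng.uniform(-1, 1, 1000)
y_test = sample(x_test)

# A degree-9 polynomial through 10 points interpolates the training data,
# so the training loss is close to zero (a deliberately overfit model).
coeffs = np.polyfit(x_train, y_train, deg=9)

def mse(x, y):
    # Mean squared error of the fitted polynomial on (x, y).
    return np.mean((y - np.polyval(coeffs, x)) ** 2)

train_loss = mse(x_train, y_train)
test_loss = mse(x_test, y_test)

rhs_as_printed = train_loss - test_loss  # order used in Equation 5.84
rhs_swapped = test_loss - train_loss     # order proposed in this issue

print(f"train loss:        {train_loss:.3g}")
print(f"test loss:         {test_loss:.3g}")
print(f"train - test loss: {rhs_as_printed:.3g}")
print(f"test - train loss: {rhs_swapped:.3g}")
```

With this setup the train-minus-test difference comes out negative, which is the behaviour the question is about.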
You are right. In fact there is a more serious issue in how I defined the generalization gap. This is now fixed - see below.