Hello! Thank you for the great work! There seems to be a mistake in the function j_hat_kde in Chapter 21 - Nonparametric Curve Estimation. As mentioned in the book and in your text, K^(2) is the N(0, 2) density. The docstring also seems to contain a mistake: h is a bandwidth, but the docstring says the dataset is rescaled to [0, 1]. I did not find any evidence of such a transformation in the book, and the code of this function does not perform it. I think the function should look like the following; could you please double check?
from itertools import product

import numpy as np
from scipy.stats import norm


def j_hat_kde(X, h):
    r"""
    Calculate the approximated estimated KDE risk J_hat for a N(0, 1) Gaussian kernel
    \hat{J}(h) = \frac{1}{hn^2}\sum_{i, j} K^* \left( \frac{X_i - X_j}{h} \right) + \frac{2}{nh} K(0)
    where:
    n is the dataset size
    h is the bandwidth for the rescaled [0, 1] dataset
    K^* is K^{(2)}(x) - 2 K(x), and K^{(2)} is the convolved kernel, K^{(2)}(z) = \int K(z - y) K(y) dy
    K is the original kernel
    """
    n = len(X)
    # Pairwise scaled differences (X_i - X_j) / h; X is indexed with .iloc, e.g. a pandas Series
    Kstar_args = np.array([X.iloc[i] - X.iloc[j] for i, j in product(range(n), range(n))]) / h
    # K^{(2)} is the N(0, 2) density (variance 2); scipy's `scale` is a standard deviation, hence sqrt(2)
    sum_value = np.sum(norm.pdf(Kstar_args, loc=0, scale=np.sqrt(2)) - 2 * norm.pdf(Kstar_args, loc=0, scale=1))
    return sum_value / (h * n * n) + 2 * norm.pdf(0) / (n * h)
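
To sanity-check the N(0, 2) claim, here is a minimal sketch of my own (not from the book or the notebook) that evaluates the convolution K^{(2)}(z) = \int K(z - y) K(y) dy numerically for the N(0, 1) kernel and compares it with two candidate densities. The helper name `convolved_kernel` and the grid of z values are just assumptions for illustration. Since `scale` in `scipy.stats.norm` is a standard deviation, a variance of 2 corresponds to `scale=np.sqrt(2)`.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm


def convolved_kernel(z):
    r"""Numerically evaluate K^{(2)}(z) = \int K(z - y) K(y) dy for K = N(0, 1)."""
    value, _ = quad(lambda y: norm.pdf(z - y) * norm.pdf(y), -np.inf, np.inf)
    return value


# Hypothetical evaluation grid, chosen only for this check
z_values = np.linspace(-3.0, 3.0, 7)
numeric = np.array([convolved_kernel(z) for z in z_values])

# Compare against a standard deviation of sqrt(2) (variance 2) and of 2 (variance 4)
matches_sqrt2 = np.allclose(numeric, norm.pdf(z_values, scale=np.sqrt(2)), atol=1e-6)
matches_2 = np.allclose(numeric, norm.pdf(z_values, scale=2), atol=1e-6)
print(matches_sqrt2, matches_2)
```

If the first value is True and the second is False, the convolved kernel has variance 2, not standard deviation 2.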