asadoughi / stat-learning

Notes and exercise attempts for "An Introduction to Statistical Learning"
http://asadoughi.github.io/stat-learning

Chapter 4 - Exercise 2 #79

Closed eugene123tw closed 2 years ago

eugene123tw commented 7 years ago

I think the answer you gave is confusing and doesn't explain the purpose of the transformation. The reason we look for the class $k$ that maximizes $\delta_{k}(x)$ comes from Bayes' theorem. From Bayes' theorem (4.12) we know that, for every class $k$, the denominator $\sum\limits_{l=1}^{K} \pi_{l} f_{l}(x)$ in $p_{k}(x)$ is the same, while the prior probability $\pi_{k}$ and the density $f_{k}(x)$ vary with $k$. So the objective is to find the largest $\pi_{k}f_{k}(x)$ among $\big(\pi_{1}f_{1}(x),\dots,\pi_{k}f_{k}(x),\dots,\pi_{K}f_{K}(x)\big)$, which is the same as finding the largest $p_{k}(x)$. Applying the logarithm gives $\delta_{k}(x) = \log \big(\pi_{k}f_{k}(x)\big)$. Because the logarithm is monotonically increasing, finding the largest $\delta_{k}(x)$ among the $K$ classes is equivalent to finding the largest $p_{k}(x)$ among the $K$ classes, but $\delta_{k}(x)$ is much easier to compute than $p_{k}(x)$.
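The equivalence above is easy to check numerically. The sketch below (not part of the original answer; the priors, means, and shared variance are made-up illustrative values for the one-dimensional LDA setting) compares the class that maximizes the full posterior $p_{k}(x)$ with the class that maximizes $\delta_{k}(x) = \log(\pi_{k} f_{k}(x))$:

```python
# Sketch: checking that argmax_k p_k(x) == argmax_k delta_k(x),
# where delta_k(x) = log(pi_k * f_k(x)) drops the denominator of
# Bayes' theorem (it is identical for every class). The numbers
# below are made-up illustrative values, not from the exercise.
import math

priors = [0.3, 0.5, 0.2]   # pi_k; must sum to 1
means = [-1.0, 0.0, 2.0]   # mu_k for each class
sigma = 1.0                # shared standard deviation (LDA assumption)

def f(x, mu):
    """Gaussian density f_k(x) with mean mu and shared sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def p(x, k):
    """Posterior p_k(x) from Bayes' theorem (4.12)."""
    total = sum(pi_l * f(x, mu_l) for pi_l, mu_l in zip(priors, means))
    return priors[k] * f(x, means[k]) / total

def delta(x, k):
    """delta_k(x) = log(pi_k f_k(x)): same argmax, cheaper to compute."""
    return math.log(priors[k] * f(x, means[k]))

ks = range(len(priors))
for x in [-2.0, 0.1, 1.5, 3.0]:
    assert max(ks, key=lambda k: p(x, k)) == max(ks, key=lambda k: delta(x, k))
```

The asserts pass because the log is monotonic and the shared denominator cannot change which class wins.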