Some suspected errors/typos to report:

p.131. In example 6.3, "Here a of run normalized gradient descent ..." should be "Here a run of normalized gradient descent ...".
p.136. Equation 6.18 is not consistent with how tanh is usually defined. For consistency it should read "tanh(x) = 2 sigma(2x) - 1".
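(For what it's worth, the proposed identity can be checked numerically; a minimal sketch, assuming NumPy is available and sigma denotes the logistic sigmoid:)

```python
import numpy as np

def sigma(x):
    """Logistic sigmoid: sigma(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

# Verify tanh(x) = 2*sigma(2x) - 1 on a grid of points.
x = np.linspace(-5.0, 5.0, 101)
assert np.allclose(np.tanh(x), 2.0 * sigma(2.0 * x) - 1.0)
```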
p.145. The inequality "exp(-yxCw) = exp(-C)exp(-yxw) < exp(-yxw)" is incorrect. It should be "exp(-yxCw) = (exp(-yxw))^C < exp(-yxw)", since exp(aC) = (exp(a))^C; the strict inequality then follows for C > 1 whenever exp(-yxw) < 1, i.e. yxw > 0.
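(A quick numerical sanity check of the corrected identity; a minimal sketch, with C > 1 and a correctly classified point, yxw > 0, assumed as in the passage:)

```python
import numpy as np

# Illustrative values: C > 1 and y*x*w > 0 (correctly classified point).
C, y, x, w = 3.0, 1.0, 0.8, 0.5

lhs = np.exp(-y * x * C * w)
# The corrected identity: exp(-yxCw) = (exp(-yxw))^C.
assert np.isclose(lhs, np.exp(-y * x * w) ** C)
# Strict inequality: exp(-y*x*w) < 1 here, so raising it to C > 1 shrinks it.
assert lhs < np.exp(-y * x * w)
```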
p.158. The encoding in equation 6.62 does not make equation 6.66 and equation 6.67 equivalent to each other. Instead, the encoding should be done the other way around: encode the label 1 as the vector [1 0]. (If consistency with section 7.5.4 is intended, then some other contents, instead of the encoding, need modification.)
p.168. "Section 6.24" is mentioned, but there is no such section.
I'm not fully confident that all of these are genuine errors/typos, but I hope they help.
On a separate note, this is a great machine learning book for beginners; I can follow what's going on throughout the text. Deep appreciation for the effort.