Hi! Thank you for making and maintaining this tutorial! I'm reading through your Softmax Regression Tutorial, and I have a question about the following excerpt from the section entitled "Properties of softmax regression parameterization":
Indeed, rather than optimizing over the K⋅n parameters (θ(1),θ(2),…,θ(K)) (where θ(k)∈ℜ^n), one can instead set θ(K)=0⃗ and optimize only with respect to the K⋅n remaining parameters.
Should the second part of this sentence instead read "optimize only with respect to the (K - 1)⋅n remaining parameters"?
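To check my reading, here is a minimal sketch in plain Python. The sizes K = 4 and n = 3, the random values, and the helper name softmax_probs are my own assumptions for illustration, not taken from the tutorial. It subtracts θ(K) from every parameter vector, which makes θ(K) the zero vector, and compares the class probabilities before and after.

```python
import math
import random

def softmax_probs(thetas, x):
    """Class probabilities for input x, given one parameter vector per class."""
    scores = [sum(t_i * x_i for t_i, x_i in zip(theta, x)) for theta in thetas]
    m = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

K, n = 4, 3  # hypothetical sizes, chosen only for this illustration
rng = random.Random(0)
thetas = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(K)]
x = [rng.gauss(0, 1) for _ in range(n)]

# Subtract theta^(K) from every parameter vector, so the last one becomes zero.
shifted = [[a - b for a, b in zip(theta, thetas[-1])] for theta in thetas]

p_original = softmax_probs(thetas, x)
p_shifted = softmax_probs(shifted, x)

# The predicted probabilities should agree up to floating-point error.
same_probs = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for a, b in zip(p_original, p_shifted)
)

# theta^(K) is pinned at zero, so only the other K - 1 vectors are optimized.
free_params = (K - 1) * n
print(same_probs, free_params)
```

If I have this right, the probabilities are unchanged, and only the first K - 1 vectors are left to optimize, which gives (K - 1)⋅n free parameters rather than K⋅n.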
Thank you!