macss-modeling / General-Questions

A repo to post questions about code, data, etc.

Question on Support Vector Classifier 'C' Parameter #21

Closed mikepackard415 closed 3 years ago

mikepackard415 commented 3 years ago

Hi there,

In Thursday's lecture notes, on slide 30, the cost hyperparameter C is described as "a cost or penalty to having cases inside the margin, which is, in effect the budget of errors allowed." This description is confusing, because intuitively a cost and a budget should be inversely related: if violating the margin imposes a low cost, you can budget for many violations, but if it imposes a high cost, you can budget for only a few. The mathematical definition aligns more closely with C being the "budget."

I am further confused by the description of how C controls the bias-variance tradeoff.

In the notes (slide 30):

But in the ISL reading, page 347:

I think what this amounts to is that on the slides, when you say "low cost," this amounts to "large C" and "high cost" amounts to "small C". This seems kind of backwards.

I hope I have explained this confusion well enough! Please let me know if I'm missing something here.
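For reference, the two formulations being compared can be written side by side. This is a sketch for the linear case only: the first block follows the ISL definition with slack variables \epsilon_i and budget C, and the second is the penalized form commonly used by software. The symbols w, b, \xi_i and the name C_{cost} are notation assumed here to keep the two parameters apart; they do not come from the slides.

```latex
% ISL (budget) form: C is an upper bound on the total slack
\begin{aligned}
\max_{\beta_0,\dots,\beta_p,\;\epsilon_1,\dots,\epsilon_n,\;M} \quad & M \\
\text{subject to} \quad & \sum_{j=1}^{p} \beta_j^2 = 1, \\
& y_i\,(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}) \ge M(1 - \epsilon_i), \\
& \epsilon_i \ge 0, \qquad \sum_{i=1}^{n} \epsilon_i \le C .
\end{aligned}

% Common software (penalty) form: C_cost multiplies the total slack
\begin{aligned}
\min_{w,\;b,\;\xi_1,\dots,\xi_n} \quad & \tfrac{1}{2}\lVert w \rVert^2 + C_{cost} \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad & y_i\,(w^\top x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0 .
\end{aligned}
```

Read this way, a large budget C in the first form and a small C_{cost} in the second both tolerate more margin violations, which matches the "low cost amounts to large C" reading above.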

pdwaggoner commented 3 years ago

Yes, we discussed this just after class on Thursday. Basically, the confusion comes from how most (all?) packages have programmed cost in typical SVM implementations. For intuition, the idea there is that higher cost means a larger penalty and fewer cases in the margin, while lower cost means less penalty and more cases in the margin. This is precisely what we observed when running the code and tuning cost in class. But the book is speaking from a more formal (and correct) approach, defining cost on the basis of unique \epsilon values, which record the position of each point relative to its true class margin. Cost in this sense is the total budget that controls how many mistakes are allowed by the classifier. So the ISL text is very much correct, but it presents the problem in a different way than most software implementations of SVM. I hope this clarifies.
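To make the budget reading concrete, here is a minimal sketch in plain Python that computes ISL-style slack values for one fixed hyperplane and checks them against a budget. The hyperplane, margin width, toy points, and function names are all invented for illustration; nothing here comes from the class code, and no model is actually fitted.

```python
# Hypothetical fixed classifier: f(x) = beta0 + beta . x, with ||beta|| = 1
# as in the ISL formulation, and an assumed margin width M.
BETA = (1.0, 0.0)
BETA0 = 0.0
MARGIN = 1.0

# Toy observations as (features, label), with labels in {-1, +1}.
POINTS = [
    ((2.0, 0.0), 1),     # correct side, outside the margin
    ((0.5, 0.0), 1),     # correct side, but inside the margin
    ((-0.5, 1.0), 1),    # wrong side of the hyperplane
    ((-3.0, 0.0), -1),   # correct side, outside the margin
    ((-0.25, 0.0), -1),  # correct side, but inside the margin
]


def slack(x, y, beta=BETA, beta0=BETA0, margin=MARGIN):
    """Smallest epsilon >= 0 satisfying y * f(x) >= margin * (1 - epsilon)."""
    signed_distance = y * (beta0 + sum(b * xi for b, xi in zip(beta, x)))
    return max(0.0, 1.0 - signed_distance / margin)


def within_budget(slacks, budget):
    """ISL constraint: the total slack may not exceed the budget C."""
    return sum(slacks) <= budget


slacks = [slack(x, y) for x, y in POINTS]
total_slack = sum(slacks)
margin_violations = sum(1 for e in slacks if e > 0)  # epsilon > 0
misclassified = sum(1 for e in slacks if e > 1)      # epsilon > 1

print("slacks:", slacks)
print("total slack:", total_slack)
print("fits budget C=3:", within_budget(slacks, 3.0))
print("fits budget C=2:", within_budget(slacks, 2.0))
```

In the ISL convention, a slack above 0 marks a margin violation and a slack above 1 marks a point on the wrong side of the hyperplane. Shrinking the budget makes this particular margin infeasible, so a fitted classifier would have to settle for a narrower one, which is the same direction of effect as raising the software cost.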

mikepackard415 commented 3 years ago

Yes thank you! Much appreciated.

Raychanan commented 3 years ago

Hi Professor Waggoner, so I believe we're supposed to refer to the ISL text when answering relevant questions (if any) in the final exam, right? Thanks!

pdwaggoner commented 3 years ago

If asked, it will be clear (e.g., referring to tuning a hyperparameter when fitting a model vs. a theoretical definition). But in general, yes, defer to the text in this type of situation.