Open PetrToman opened 12 years ago
Yes I agree, they very much can overfit. I will email Jeff about fixing that page. And yes a more advanced SVM Search would be a nice improvement.
In my work I was able to get good results by combining rectangular search with genetic optimization. But I agree, rectangular search by itself is not very good and requires a lot of trial and error.
Currently, SVMSearch performs something like two nested for loops and tries to find a combination of parameters that satisfies the required maximum error. Because it uses constant steps for both C and Gamma, this can be a very lengthy process (especially when the ranges are big), and finding optimal values is not guaranteed.
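To make the cost concrete, here is a minimal Python sketch of that kind of constant-step grid search. It is not the actual SVMSearch code: `grid_search`, `frange`, and `toy_error` are hypothetical names, and `toy_error` is only a stand-in for "train an SVM with these parameters and measure its error".

```python
def frange(start, stop, step):
    """Yield start, start + step, ... without going past stop."""
    count = int((stop - start) / step + 1e-9)  # small epsilon absorbs float rounding
    for i in range(count + 1):
        yield start + i * step


def grid_search(error_fn, c_range, gamma_range, max_error):
    """Try every (C, gamma) pair on a constant-step grid.

    Each range is a (start, stop, step) tuple. Returns
    (c, gamma, error, evaluations) for the first pair whose error is at or
    below max_error, or for the best pair seen if none qualifies.
    """
    best = None
    evaluations = 0
    for c in frange(*c_range):
        for gamma in frange(*gamma_range):
            err = error_fn(c, gamma)
            evaluations += 1
            if best is None or err < best[2]:
                best = (c, gamma, err)
            if err <= max_error:
                return c, gamma, err, evaluations
    return best + (evaluations,)


# Hypothetical stand-in for training and evaluating an SVM.
def toy_error(c, gamma):
    return (c - 7.0) ** 2 + (gamma - 0.3) ** 2


result = grid_search(toy_error, (1.0, 10.0, 1.0), (0.1, 1.0, 0.1), 0.001)
print(result)
```

The number of evaluations grows with the product of the two range sizes divided by the step sizes, and if the true optimum falls between grid points the required error may never be reached at all.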
A better algorithm should be implemented, for example: 'A pattern search (also known as a “compass search” or a “line search”) starts at the center of the search range and makes trial steps in each direction for each parameter. If the fit of the model improves, the search center moves to the new point and the process is repeated. If no improvement is found, the step size is reduced and the search is tried again. The pattern search stops when the search step size is reduced to a specified tolerance.' (http://www.dtreg.com/svm.htm)
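The quoted description can be sketched in a few lines of Python. This is only an illustration under assumptions, not DTREG's or Encog's implementation: `pattern_search` and `toy_error` are hypothetical names, the initial step (a quarter of each range) and the halving factor are arbitrary choices, and the tolerance is taken relative to each parameter's range.

```python
def pattern_search(error_fn, lower, upper, tolerance=1e-3):
    """Minimise error_fn(point) inside a box with a compass/pattern search.

    lower and upper are lists giving the search range of each parameter.
    Returns (best_point, best_error, evaluations).
    """
    ranges = [hi - lo for lo, hi in zip(lower, upper)]
    center = [(lo + hi) / 2.0 for lo, hi in zip(lower, upper)]  # start at the center
    steps = [r / 4.0 for r in ranges]                           # assumed initial step
    best_err = error_fn(center)
    evaluations = 1

    # Stop once every step, relative to its range, is at or below the tolerance.
    while max(s / r for s, r in zip(steps, ranges)) > tolerance:
        improved = False
        for i in range(len(center)):
            for direction in (1.0, -1.0):
                trial = list(center)
                # Take a trial step along one parameter, clamped to the range.
                trial[i] = min(max(trial[i] + direction * steps[i], lower[i]), upper[i])
                err = error_fn(trial)
                evaluations += 1
                if err < best_err:
                    # The fit improved: move the search center to the new point.
                    center, best_err = trial, err
                    improved = True
        if not improved:
            # No direction helped: reduce the step size and try again.
            steps = [s / 2.0 for s in steps]
    return center, best_err, evaluations


# Hypothetical stand-in for training and evaluating an SVM at (C, gamma).
def toy_error(point):
    c, gamma = point
    return (c - 7.0) ** 2 + (gamma - 0.3) ** 2


best_point, best_error, evaluations = pattern_search(
    toy_error, lower=[1.0, 0.1], upper=[10.0, 1.0]
)
print(best_point, best_error, evaluations)
```

Note that a search like this only follows local improvements, so on a real, noisy error surface it can settle in a local minimum; restarting from several centers is one possible mitigation.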
Another link for inspiration: http://stackoverflow.com/questions/2761240/how-to-figure-out-optimal-c-gamma-parameters-in-libsvm (includes link to "The Entire Regularization Path for the Support Vector Machine" paper).
[ Footnote: Claiming "Support vector machines do not suffer from overfitting" (http://www.heatonresearch.com/wiki/Support_Vector_Machine) is not quite correct, as whether an SVM overfits depends on its parameters. ]