TimotheeMathieu opened 3 years ago
Having a Huber loss available as a metric makes sense for models fitted with the Huber loss.
Be aware that the Huber loss elicits something in between the median and the expectation, so it is not entirely clear what you get/estimate. The omnipresent point that the MSE is not robust has at least 2 important aspects:
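To illustrate the point about the Huber loss eliciting something between the median and the expectation, here is a small sketch on synthetic skewed data (the data and the `delta` threshold are illustrative, not taken from this PR):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def huber_loss(r, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

rng = np.random.default_rng(0)
x = rng.exponential(size=1000)  # skewed sample: mean > median

# Location that minimizes the total Huber loss over the sample
est = minimize_scalar(lambda m: huber_loss(x - m).sum()).x
print(np.median(x), est, np.mean(x))  # the Huber minimizer sits in between
```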
Last but not least, my all-time favorite reference: https://arxiv.org/abs/0912.0902
Thanks for the comments.
@lorentzenchr what I did is not the Huber loss. It is a robust estimator of the mean applied to the squared errors.
I used the MSE only as an example; I can also build a robust version of the mean absolute error with make_huber_metric(mean_absolute_error, c=9). This is very different from the Huber loss, because the aim is still to estimate the MSE or the mean absolute error, only while ignoring the outliers. I don't use a different loss function; I use a different way to estimate the mean in MEAN squared error and MEAN absolute error, because the empirical mean is not robust while the Huber estimator is robust.
This can be confusing for people used to the Huber loss, but it is in fact very different. That said, the estimator is also due to Huber, so I can't really change the name.
If you want references, there is for instance Robust estimation of a location parameter by Huber or, more recently, Challenging the empirical mean and empirical variance: a deviation study by Catoni.
EDIT: I added an explanation in the user guide, with some equations, to make this clear.
@TimotheeMathieu Thanks for the explanation, now I get it. Something that could be mentioned in the example is the trimmed mean, as a simpler entry point to robust estimation.
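For reference, the trimmed mean mentioned above is already available in SciPy; a quick sketch on synthetic data (the outlier values are made up for illustration):

```python
import numpy as np
from scipy.stats import trim_mean

rng = np.random.default_rng(0)
errors = rng.normal(size=1000) ** 2  # squared errors, mean roughly 1
errors[:5] = 1e6                     # a handful of corrupted points

print(errors.mean())           # dominated by the 5 outliers
print(trim_mean(errors, 0.1))  # mean after dropping the extreme 10% per side
```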
This PR uses the Huber robust mean estimator to make a robust metric.
Description: one of the big challenges of robust machine learning is that the usual scoring scheme (cross-validation with MSE, for instance) is not robust. Indeed, if the dataset has some outliers, then the test sets in cross-validation may contain outliers, and the cross-validation MSE will then report a huge error for our robust algorithm on any corrupted data. This is, for example, why robust methods cannot be competitive in Kaggle regression challenges: the error computation itself is not robust. This PR proposes a robust metric that would allow us to compute, for instance, a robust cross-validated MSE.
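The failure mode described above is easy to reproduce; this is a hedged sketch with synthetic data, not code from the PR:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.1, size=200)
y[0] = 1e4  # a single corrupted target value

# Cross-validated MSE: the fold whose test set contains the outlier
# reports an enormous error, even though the model fits well elsewhere.
scores = -cross_val_score(LinearRegression(), X, y,
                          scoring="neg_mean_squared_error", cv=5)
print(scores)
```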
Example:
This returns
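Since the snippet and its output did not survive in this copy of the thread, here is a hedged reconstruction of the idea: a make_huber_metric-style factory (the name follows the discussion above, but the implementation and signature here are a sketch, not the PR's actual code) that aggregates per-sample losses with a Huber mean estimator instead of the empirical mean:

```python
import numpy as np

def huber_mean(x, c=1.35, n_iter=100):
    """Huber location M-estimate of the mean of x, via iteratively
    reweighted averaging; outlying points get down-weighted."""
    mu = np.median(x)
    for _ in range(n_iter):
        scale = np.median(np.abs(x - mu)) + 1e-12  # robust scale (MAD)
        r = (x - mu) / scale
        w = np.where(np.abs(r) <= c, 1.0, c / np.abs(r))
        mu = np.average(x, weights=w)
    return mu

def make_huber_metric(per_sample_loss, c=1.35):
    """Turn a per-sample loss into a metric aggregated with huber_mean."""
    def metric(y_true, y_pred):
        return huber_mean(per_sample_loss(y_true, y_pred), c=c)
    return metric

robust_mse = make_huber_metric(lambda yt, yp: (yt - yp) ** 2)

rng = np.random.default_rng(0)
y_true = rng.normal(size=1000)
y_pred = y_true + rng.normal(scale=0.1, size=1000)  # a good model
y_true[:5] += 1e3  # five corrupted test labels

print(np.mean((y_true - y_pred) ** 2))  # huge: the plain MSE is not robust
print(robust_mse(y_true, y_pred))       # stays small, near the clean MSE
```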