neurospin / pylearn-parsimony_history

Sparse and Structured Machine Learning in Python
BSD 3-Clause "New" or "Revised" License

Loss functions: mean (square) loss instead of sum (square) loss #16

Closed duchesnay closed 10 years ago

duchesnay commented 10 years ago

Parsimony losses are sum (square) losses. See pylearn-parsimony / parsimony / functions / losses.py

Such losses depend on the number of samples, which makes it difficult to tune the penalty coefficients. Using "mean" losses will make life easier when it comes to tuning the penalty parameters. Indeed, the contribution of the loss to the global objective function will remain on the same scale whatever the size of the dataset.
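To see why (a minimal NumPy sketch with made-up data, not parsimony code): with a sum-of-squares loss the data term grows with n, so a penalty weight tuned on 100 samples no longer balances the objective on 1000 samples, while the mean version keeps the data term on the same scale.

```python
import numpy as np

rng = np.random.RandomState(42)
p = 10
beta = rng.randn(p)

for n in (100, 1000):
    X = rng.randn(n, p)
    y = X.dot(beta) + 0.1 * rng.randn(n)

    sum_loss = 0.5 * np.sum((X.dot(beta) - y) ** 2)  # grows roughly linearly with n
    mean_loss = sum_loss / n                         # stays on the same scale
    print(n, sum_loss, mean_loss)
```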

Such a choice has been made in the R "glmnet" package (Friedman, Hastie, Tibshirani), where, in the Gaussian case, they minimize:

    (1 / (2 n)) * sum_i (y_i - beta_0 - x_i' beta)^2 + lambda * [ (1 - alpha) * ||beta||_2^2 / 2 + alpha * ||beta||_1 ]

Consequences:

  1. RidgeRegression should only be divided by n.
  2. RidgeLogisticRegression should use 1/n as default weights.
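A rough sketch of the second point (plain NumPy with a hypothetical function name, not the parsimony API): with default sample weights of 1/n the logistic data term becomes an average, so the ridge weight kappa keeps the same meaning whatever the number of samples.

```python
import numpy as np

def ridge_logistic_loss(beta, X, y, kappa, weights=None):
    """Weighted logistic loss + L2 penalty (sketch only).

    With weights = 1/n (the proposed default) the data term is a mean,
    so kappa keeps the same meaning whatever the number of samples.
    """
    n = X.shape[0]
    if weights is None:
        weights = np.full(n, 1.0 / n)   # proposed default: 1/n
    p = 1.0 / (1.0 + np.exp(-X.dot(beta)))  # predicted probabilities
    eps = 1e-12                              # avoid log(0)
    data_term = -np.sum(weights * (y * np.log(p + eps)
                                   + (1 - y) * np.log(1 - p + eps)))
    return data_term + 0.5 * kappa * np.dot(beta, beta)
```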

tomlof commented 10 years ago

I agree. But I had a discussion with Vincent and Vincent about this, and I was voted down. It seems both definitions are used interchangeably. Note that this applies to the ridge, or L2 penalty as well.

Two possible changes we could make:

  1. We add a loss, MeanSquaresLoss, that implements what you propose.
  2. We split RidgeRegression into a SumOfSquaresLoss and an L2 penalty, to make it more explicit what it does.

Personally I think we should do both. What do you think?
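A rough sketch of what the two options could look like (the class names follow the proposals above; this is not the existing parsimony interface):

```python
import numpy as np

class MeanSquaresLoss:
    """Option 1: f(beta) = (1 / (2 n)) * ||X beta - y||^2."""
    def __init__(self, X, y):
        self.X, self.y = X, y

    def f(self, beta):
        n = self.X.shape[0]
        r = self.X.dot(beta) - self.y
        return np.sum(r ** 2) / (2.0 * n)

class SumOfSquaresLoss:
    """Option 2, loss part: f(beta) = (1 / 2) * ||X beta - y||^2."""
    def __init__(self, X, y):
        self.X, self.y = X, y

    def f(self, beta):
        r = self.X.dot(beta) - self.y
        return 0.5 * np.sum(r ** 2)

class L2Penalty:
    """Option 2, penalty part: f(beta) = (kappa / 2) * ||beta||^2."""
    def __init__(self, kappa):
        self.kappa = kappa

    def f(self, beta):
        return 0.5 * self.kappa * np.dot(beta, beta)

# "RidgeRegression" then becomes an explicit sum of the two parts:
# f(beta) = SumOfSquaresLoss.f(beta) + L2Penalty.f(beta)
```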

tomlof commented 10 years ago

Another possibility is that we add an argument, mean=True, that tells us whether to compute the mean square loss or just the squared loss.

Thoughts on that?
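Something like this (only a sketch, assuming the usual f/grad interface; not the actual parsimony classes):

```python
import numpy as np

class LinearRegressionLoss:
    """Squared loss with an optional 1/n scaling, selected at construction."""
    def __init__(self, X, y, mean=True):
        self.X, self.y = X, y
        # Scale factor: 1/n for the mean loss, 1 for the plain sum of squares.
        self.scale = 1.0 / X.shape[0] if mean else 1.0

    def f(self, beta):
        r = self.X.dot(beta) - self.y
        return 0.5 * self.scale * np.sum(r ** 2)

    def grad(self, beta):
        r = self.X.dot(beta) - self.y
        return self.scale * self.X.T.dot(r)
```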

tomlof commented 10 years ago

I have started to add the last option (where mean=True is a default argument to the constructor) to many functions. I'll add it to the others as I need them, so all will have it in time.

Let me know if you don't want us to do it this way, so that I don't spend time on something we won't use.