Closed MarcCote closed 9 years ago
As this breaks every model written up to now, what would you think of keeping `_compute_loss` and adding another method that computes each individual loss?

Also, why change the name? `_compute_batch_losses` is misleading, as it could be read as "compute the batch loss", meaning the loss over every example in the dataset. Personally, I'd revert `_compute_loss` to what it was and add `_compute_losses` to compute the individual losses. This would have the added benefit of not breaking every model written up to now.
I agree with changing the name `_compute_batch_losses` to `_compute_losses`. It makes more sense that way. I'll make the changes.

Regarding the break of compatibility, `_compute_loss` still exists and can be overridden by subclasses. But you are right, existing `Loss` subclasses will fail since they do not implement `_compute_losses`.
So now, `Loss` subclasses only have to implement one or both of these methods: `_compute_loss` or `_compute_losses`. If one decides to implement only the former, the `Loss` subclass cannot be used by `LossView`. If one decides to implement only the latter, the default behaviour of `_compute_loss` will be to take the mean of all individual losses.
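To make the contract concrete, here is a minimal sketch of the two-method design described above. The class and method names mirror the discussion, but the exact signatures and the NumPy-based example subclass are assumptions, not the PR's actual code:

```python
import numpy as np

class Loss:
    def _compute_losses(self, y_true, y_pred):
        # Per-example losses: one scalar per example in the batch.
        # Required if the loss is to be used by a LossView-style consumer.
        raise NotImplementedError("Subclass must implement _compute_losses.")

    def _compute_loss(self, y_true, y_pred):
        # Default behaviour: the mean of all individual losses.
        # A subclass that only implements _compute_losses gets this for free.
        return np.mean(self._compute_losses(y_true, y_pred))

class SquaredError(Loss):
    def _compute_losses(self, y_true, y_pred):
        # Sum of squared errors over the output dimensions, per example.
        return np.sum((y_true - y_pred) ** 2, axis=1)

loss = SquaredError()
y_true = np.array([[0.0], [1.0]])
y_pred = np.array([[0.5], [1.0]])
print(loss._compute_losses(y_true, y_pred))  # one loss per example
print(loss._compute_loss(y_true, y_pred))    # 0.125
```

A subclass overriding only `_compute_loss` still works everywhere a single batch scalar is expected; it simply cannot provide the per-example view.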
Right now, `Loss` subclasses have to override `_compute_loss`, which returns the loss of the batch. I think it would be more useful if it returned the loss of every example in the batch. This PR changes every loss class so that it returns an array of scalars instead of a single scalar. It also changes the name of the method to override: `_compute_batch_losses`.

To show why this modification is useful, I added a new view, `LossView`, that provides an easy way of computing any loss on a particular dataset using any batch scheduler. Basically, it allows replacing both the `views.RegressionError` and `views.ClassificationError` views respectively by
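The per-example losses are what make a `LossView`-style computation straightforward: they can be concatenated across batches and averaged once at the end, which stays correct even when the last batch is smaller. A rough sketch of that idea (the function name, signatures, and batch format here are assumptions, not the PR's actual API):

```python
import numpy as np

def dataset_loss(compute_losses, batches):
    """Average per-example losses over all batches from a scheduler.

    compute_losses(y_true, y_pred) -> 1-D array, one loss per example.
    Averaging the concatenated per-example losses is correct even with
    unequal batch sizes, unlike averaging per-batch means.
    """
    per_example = np.concatenate([compute_losses(y, p) for y, p in batches])
    return per_example.mean()

# Hypothetical per-example squared error.
def sq_losses(y_true, y_pred):
    return np.sum((y_true - y_pred) ** 2, axis=1)

batches = [
    (np.zeros((2, 1)), np.ones((2, 1))),   # two examples, loss 1.0 each
    (np.zeros((1, 1)), np.zeros((1, 1))),  # one example, loss 0.0
]
print(dataset_loss(sq_losses, batches))  # 2/3, not the 0.5 a mean-of-means gives
```

Any loss exposing per-example values can be plugged into such a view, which is why a single `LossView` can subsume both the regression-error and classification-error views.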