regzhuce opened this issue 7 years ago
The strategy I've used is to build a custom loss function with the MakeLoss operator and feed it the weights, for example: loss = MakeLoss(weight * mx.symbol.square(label - pred))
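To make the weighting concrete, here is a minimal NumPy sketch (illustrative only, not MXNet code; the function name is made up) of the per-instance weighted squared loss that the MakeLoss expression above builds symbolically:

```python
import numpy as np

def weighted_square_loss(label, pred, weight):
    """Per-sample loss w_i * (y_i - p_i)^2, plus its mean over the batch."""
    per_sample = weight * (label - pred) ** 2
    return per_sample, per_sample.mean()

label  = np.array([1.0, 0.0, 1.0])
pred   = np.array([0.8, 0.3, 0.5])
weight = np.array([1.0, 2.0, 0.5])  # per-instance weights

per_sample, mean_loss = weighted_square_loss(label, pred, weight)
```

A weight of 2.0 doubles a sample's contribution to the loss; a weight of 0.5 halves it.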
The strange part is that when predicting, I have to feed a constant weight to the model, and I get the loss value rather than the prediction.
The default behavior of the predict function on a model with MakeLoss is to return the output of the last layer, which is the one where the loss function is defined. You can either recover the actual predictions from the loss values, knowing the labels and weights, or, more simply, fetch the output of the layer just before MakeLoss, where the predictions are defined.
That feels like it should be trivial. Hopefully a more graceful approach can be encapsulated. That would be very nice.
@regzhuce The output of MakeLoss is the gradient. See https://github.com/apache/incubator-mxnet/blob/master/src/operator/make_loss.cc#L35
@thirdwing Thanks. Any proposals for my problem?
Can you give more details on what you mean by "set instance weight"?
I am sorry that I don't understand your problem.
Say I have lots of samples, but not all samples are of the same importance. I want to give every sample an importance, i.e. an instance weight.
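As a side note, class weights (discussed elsewhere in this thread) can be seen as a special case of instance weights: each sample's weight is looked up from its class label. A small NumPy sketch (all names illustrative):

```python
import numpy as np

# Class weights as a special case of instance weights:
# each sample inherits the weight assigned to its class.
class_weight = np.array([0.3, 0.7])    # weight for class 0 and class 1
labels = np.array([0, 1, 1, 0, 1])     # per-sample class labels
instance_weight = class_weight[labels] # one weight per sample
```

True instance weighting is more general: the weights need not be a function of the label at all.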
I am also interested in how I can weight the samples. I have a binary classification problem and I want to give the samples or classes different weights. Is there no other way than using a custom loss function?
Unfortunately, the tutorial for the custom loss function is not sufficient to see how to use this function in a different setup. Is there a way to fully replace mx.symbol.<...>Output without needing additional steps afterwards to get the predictions? I would like to evaluate model performance on a validation data set during training, so I need the predictions during training, and I do not know how to get them if I use MakeLoss.
Any help is highly appreciated!
@regzhuce: Probably this could help you:
I finally figured out how to use class weights for a (binary) classification problem, though I still do not know how to reproduce the behavior of an mx.symbol.<...>Output layer, i.e. returning both the loss gradient and the prediction. However, here is my code for a weighted version of cross-entropy with two classes:
```r
# ... other layers, last layer's name is 'last_layer'
# Fully connected layer with 2 nodes
fc_last <- mx.symbol.FullyConnected(data=last_layer, num_hidden=2, name='lastfullyconnected')
# Label variable
label <- mx.symbol.Variable(name='label')
# Softmax
softmax <- mx.symbol.softmax(data=fc_last, name='softmax', axis=1)
# Weighted cross-entropy
# label_weight in (0, 1), 1e-6 is added to avoid log(0)
nn_out <- mx.symbol.MakeLoss(
  -1 * (1 - label_weight) * (1 - label) * mx.symbol.log(mx.symbol.Reshape(mx.symbol.slice_axis(softmax, axis=1, begin=0, end=1), shape=0) + 1e-6) -
  label_weight * label * mx.symbol.log(mx.symbol.Reshape(mx.symbol.slice_axis(softmax, axis=1, begin=1, end=2), shape=0) + 1e-6),
  name='weightedcrossentropy'
)
```
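To sanity-check the arithmetic of the R expression above, here is an equivalent NumPy version (illustrative only, not a drop-in for MXNet) that computes the same per-sample weighted cross-entropy from the softmax probabilities:

```python
import numpy as np

# NumPy re-implementation of the weighted binary cross-entropy above:
# label_weight in (0, 1); eps=1e-6 is added to avoid log(0).
def weighted_binary_ce(softmax_out, label, label_weight, eps=1e-6):
    p0, p1 = softmax_out[:, 0], softmax_out[:, 1]  # class probabilities
    return (-(1.0 - label_weight) * (1.0 - label) * np.log(p0 + eps)
            - label_weight * label * np.log(p1 + eps))

probs = np.array([[0.9, 0.1],   # sample with label 0
                  [0.2, 0.8]])  # sample with label 1
label = np.array([0.0, 1.0])
loss = weighted_binary_ce(probs, label, label_weight=0.7)
```

With label_weight above 0.5, errors on class 1 are penalized more heavily than errors on class 0.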
After training, the same approach can be used to obtain predictions as described in this example for a regression task.
@thirdwing: It would be nice to get a confirmation whether this is a valid example of using class weights on a softmax output, as, unfortunately, there is no tutorial for this case.
@regzhuce Hope your question was answered by the above comment.
@sandeep-krishnamurthy Can you please close this issue ?
I face the same problem. I think the situation @regzhuce mentioned can be abstracted as: manually assign weights to the losses of different samples.
In the mxnet.sym API documentation for SoftmaxOutput (http://beta.mxnet.io/r/api/mx.symbol.SoftmaxOutput.html), I cannot find a proper solution.
I have to implement this idea myself in the symbolic API.
Same problem here. I am looking for a drop-in replacement for mx.sym.SoftmaxOutput that somehow allows weighting examples in a batch individually. Something like:

```python
mx.sym.WeightedSoftmaxOutput(data=logits,
                             label=labels,
                             weights=weights,
                             ignore_label=ignore_label,
                             use_ignore=True,
                             normalization=normalization,
                             smooth_alpha=smooth_alpha,
                             name=name)
```
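No such operator exists in MXNet, but here is a NumPy sketch of what its forward pass could compute, with per-instance weights and ignore_label support (all names are hypothetical; normalization and smooth_alpha are omitted for brevity):

```python
import numpy as np

def weighted_softmax_ce(logits, labels, weights, ignore_label=-1):
    """Per-sample weighted softmax cross-entropy.
    logits: (N, C); labels: (N,) int class ids; weights: (N,) per-instance."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    n = np.arange(labels.shape[0])
    nll = -np.log(probs[n, labels.clip(min=0)])           # per-sample NLL
    mask = (labels != ignore_label).astype(float)         # ignore_label support
    return weights * mask * nll, probs

logits = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
labels = np.array([0, 1, -1])          # last sample is ignored
weights = np.array([1.0, 2.0, 5.0])
loss, probs = weighted_softmax_ce(logits, labels, weights)
```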
@thirdwing Why did you tag this issue with R?
@piyushghai the example given by VGalata is not exactly what the issue is about: it uses class weights instead of instance weights.
Here is a gist with an actual implementation of batch-weighted cross-entropy loss that I believe can replace the default SoftmaxOutput, though it will be less efficient in some cases, for instance if label smoothing is used:
https://gist.github.com/bricksdont/812b4d6a21ab045da771560ec9af8c11
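For reference, the efficiency point comes from the closed-form gradient of weighted softmax cross-entropy, w_i * (softmax(logits)_i - onehot_i), which a fused SoftmaxOutput-style operator can apply directly instead of differentiating through a MakeLoss graph. A NumPy sketch (illustrative, not taken from the gist):

```python
import numpy as np

def weighted_softmax_ce_grad(logits, labels, weights):
    """Closed-form gradient w.r.t. logits: w_i * (softmax(logits)_i - onehot_i)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    onehot = np.eye(logits.shape[1])[labels]              # one-hot targets
    return weights[:, None] * (probs - onehot)

logits = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 1])
weights = np.array([3.0, 0.5])
grad = weighted_softmax_ce_grad(logits, labels, weights)
```

Each row of the gradient sums to zero, and the entry at the true class is negative whenever its probability is below 1, pushing that logit up.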
Is there any way that I can set a weight for every instance when I train the model? I just cannot find any documentation about this.