PaddlePaddle / models

Officially maintained models supported by PaddlePaddle, covering CV, NLP, Speech, Rec, TS, large models, and more.
Apache License 2.0

Cross-entropy loss implemented in train.py does not decrease #3417

Open SnowWhite11235 opened 4 years ago

SnowWhite11235 commented 4 years ago

Hello, I implemented a cross-entropy function with ignore_labels myself in train.py, but the loss does not converge during training. Is there something wrong in how I wrote it, or does backpropagation need any special handling? The cross entropy from the API works fine. (Here, mask is a template that filters out positions whose label is negative, i.e. the ignored positions.)

def cross_entropy_loss(fc, labels_int32, class_num):
    # labels: int32 -> int64 and float64
    labels_double = fluid.layers.cast(x=labels_int32, dtype='float64')
    labels_int64 = fluid.layers.cast(x=labels_int32, dtype='int64')
    # generate mask: keep positions whose label >= 0
    ignore_labels = fluid.layers.fill_constant(shape=[1], value=0, dtype='int64')  # mask threshold
    mask_bool = fluid.layers.greater_equal(x=labels_int64, y=ignore_labels)
    mask_int = fluid.layers.cast(x=mask_bool, dtype='int64')
    mask_float = fluid.layers.cast(x=mask_bool, dtype='float32')
    # softmax the input fc
    fc_softmax = fluid.layers.softmax(input=fc)
    # change -1 to 1 in the label so one_hot receives a valid class index
    labels_double_abs = fluid.layers.abs(x=labels_double)
    labels_int64_abs = fluid.layers.cast(x=labels_double_abs, dtype='int64')
    labels_int64_abs.stop_gradient = True  # must
    # calculate loss
    labels = fluid.layers.one_hot(labels_int64_abs, depth=class_num)
    loss = -1.0 * fluid.layers.log(fc_softmax) * labels
    # mask out the loss at ignored positions
    loss = fluid.layers.elementwise_mul(loss, mask_float)
    loss_mean = fluid.layers.reduce_sum(loss)  # note: this is a sum over kept positions, not a mean
    return loss_mean
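
For comparison, here is a minimal sketch of the same masked loss built on fluid.layers.softmax_with_cross_entropy, which accepts an ignore_index argument. The ignore value -1, the [N, 1] label shape, and the helper name cross_entropy_loss_api are assumptions for illustration, not code from this issue.

def cross_entropy_loss_api(fc, labels_int32):
    # cast to int64 as the op requires; ignored positions keep the value -1
    labels_int64 = fluid.layers.cast(x=labels_int32, dtype='int64')
    labels_int64.stop_gradient = True
    # per-position loss; rows whose label equals ignore_index contribute zero
    loss = fluid.layers.softmax_with_cross_entropy(
        logits=fc, label=labels_int64, ignore_index=-1)
    # average over the non-ignored positions only
    zeros = fluid.layers.zeros_like(labels_int64)
    mask = fluid.layers.cast(fluid.layers.greater_equal(x=labels_int64, y=zeros), dtype='float32')
    valid = fluid.layers.reduce_sum(mask)
    valid = fluid.layers.clip(x=valid, min=1.0, max=1e9)  # guard against dividing by zero when everything is ignored
    return fluid.layers.elementwise_div(fluid.layers.reduce_sum(loss), valid)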
FDInSky commented 4 years ago

After some investigation, the problem is likely in the logic of this function; you will need to verify it further.
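
One way to verify it further is a small standalone check, sketched below under a few assumptions (fluid 1.x static graph, batch size N=8, C=5 classes, -1 as the ignore value): feed the same random logits and labels through the manual loss above and through the built-in softmax_with_cross_entropy, then compare the two sums.

import numpy as np
import paddle.fluid as fluid

N, C = 8, 5  # assumed batch size and class count for the check
fc = fluid.layers.data(name='fc', shape=[N, C], dtype='float32', append_batch_size=False)
labels = fluid.layers.data(name='labels', shape=[N, 1], dtype='int32', append_batch_size=False)

manual_loss = cross_entropy_loss(fc, labels, C)  # the function from this issue
api_loss = fluid.layers.reduce_sum(
    fluid.layers.softmax_with_cross_entropy(
        logits=fc,
        label=fluid.layers.cast(labels, dtype='int64'),
        ignore_index=-1))

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
np_fc = np.random.randn(N, C).astype('float32')
np_labels = np.random.randint(-1, C, size=(N, 1)).astype('int32')  # -1 marks ignored rows
manual_v, api_v = exe.run(fluid.default_main_program(),
                          feed={'fc': np_fc, 'labels': np_labels},
                          fetch_list=[manual_loss, api_loss])
print(manual_v, api_v)  # the two sums should agree closely if the manual logic is correct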