xv44586 / toolkit4nlp

transformers implement (architecture, task example, serving and more)
Apache License 2.0

InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?] [[{{node mlm_loss_sample_weights_3}}]] [[Mean_3/_7863]] (1) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?] [[{{node mlm_loss_sample_weights_3}}]] #7

Open uRENu opened 3 years ago

uRENu commented 3 years ago

Hello, the environment I am running the code in is: tf-gpu 1.15, keras 2.3.1, toolkit4nlp 0.5.0. I keep getting an error when the loss is computed. Is this a version issue or a problem with the loss computation?

uRENu commented 3 years ago

I rechecked and found that the problem was the format of the input data. But then a new problem appeared with train_model = Model(bert.model.inputs + [token_ids, is_masked], [mlm_loss, mlm_acc]):

ValueError: Output tensors to a Model must be the output of a Keras Layer (thus holding past layer metadata). Found: <function mlm_loss at 0x7fa2f519e8c0>
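This ValueError typically means a plain Python function was passed as a model output instead of a Keras tensor. A minimal sketch of the pattern that avoids it, assuming proba, token_ids and is_masked are tensors already defined in the pretraining script (the name mlm_loss_tensor is only for illustration):

    # Wrapping the loss computation in a Lambda layer turns it into a Keras
    # layer output, which is what Model() expects as an output tensor.
    mlm_loss_tensor = Lambda(mlm_loss, name='mlm_loss')([token_ids, proba, is_masked])
    train_model = Model(bert.model.inputs + [token_ids, is_masked], [mlm_loss_tensor])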

uRENu commented 3 years ago

In the end I solved it by changing the output of the data generator:

    class data_generator(DataGenerator):
        """Data generator"""
        def __iter__(self, shuffle=False):
            batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked = [], [], [], []
            y = []
            for is_end, (text1, text2) in self.get_sample(shuffle):
                token_ids, segment_ids, output_ids = sample_convert(text1, text2)
                is_masked = [0 if i == 0 else 1 for i in output_ids]
                batch_token_ids.append(token_ids)
                batch_segment_ids.append(segment_ids)
                batch_output_ids.append(output_ids)
                batch_is_masked.append(is_masked)
                y.append([0.])  # dummy target; the real loss is computed inside the model

                if is_end or len(batch_token_ids) == self.batch_size:
                    batch_token_ids = pad_sequences(batch_token_ids, maxlen=maxlen)
                    batch_segment_ids = pad_sequences(batch_segment_ids, maxlen=maxlen)
                    batch_output_ids = pad_sequences(batch_output_ids, maxlen=maxlen)
                    batch_is_masked = pad_sequences(batch_is_masked, maxlen=maxlen)

                    yield [batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked], np.array(y)
                    batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked = [], [], [], []
                    y = []

As long as the np.array(y) part is not None, the error no longer occurs. My understanding is that any value can be passed here, because mlm_loss always returns the computed loss:

    def mlm_loss(inputs):
        """Loss computation; it needs to be wrapped as a layer"""
        y_true, y_pred, mask = inputs
        y_true = K.cast(y_true, K.floatx())
        mask = K.cast(mask, K.floatx())
        loss = K.sparse_categorical_crossentropy(
            y_true, y_pred, from_logits=True
        )
        loss = K.sum(loss * mask) / (K.sum(mask) + K.epsilon())
        return loss

    mlm_loss = Lambda(mlm_loss, output_shape=(None,), name='mlm_loss')([token_ids, proba, is_masked])

    train_model = Model(bert.model.inputs + [token_ids, is_masked], [mlm_loss])

    loss = {
        'mlm_loss': lambda y_true, y_pred: y_pred,  # just return y_pred; y_pred is the mlm_loss
    }
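A minimal compile sketch using this dict loss; the optimizer and learning rate here are assumptions, not values from the original script:

    from keras.optimizers import Adam

    # Keras matches the key 'mlm_loss' to the output of the Lambda layer named
    # 'mlm_loss'; the lambda just passes y_pred through, which is already the
    # masked-LM loss computed inside the model.
    train_model.compile(loss=loss, optimizer=Adam(5e-5))

With this setup, the targets yielded by the data generator only serve to satisfy Keras's placeholder feeding, which is why the dummy y above works.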

xv44586 commented 3 years ago

Which script are you running? What exactly is the error message? With no context like this it's hard for me to guess.

uRENu commented 3 years ago

> Which script are you running? What exactly is the error message? With no context like this it's hard for me to guess.

Sorry, I wrote my own pretraining code with data processing on top of your pretraining.py. Because the data_generator did not yield preset values for mlm_loss and mlm_acc, and just output None for that part, I ran into this error:

InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?] [[{{node mlm_loss_sample_weights_3}}]] [[Mean_3/_7863]] (1) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?] [[{{node mlm_loss_sample_weights_3}}]]

Later I changed the None here to a fixed value (0.0) and the error went away.
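Keras creates a target (and sample-weight) placeholder for every named model output, which is exactly the 'mlm_loss_sample_weights_3' tensor named in the error, so the generator has to feed something there even though the dict loss ignores it. A minimal sketch of the yield line inside __iter__, using the batch lists from the generator above:

    import numpy as np

    # Any concrete float array of batch size works as the dummy target; it is
    # never used, because the dict loss simply returns y_pred (the mlm_loss).
    dummy_y = np.zeros((len(batch_token_ids), 1), dtype='float32')
    yield [batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked], dummy_y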

uRENu commented 3 years ago

Another problem I ran into is with toolkit4nlp.optimizers: warmup does not increase the learning rate according to the values I set. I set

    lr_schedule = {int(len(train_generator) * epochs * 0.1): 1.0, len(train_generator) * epochs: 0.1}

and my epochs is 200. My understanding is that the learning rate should ramp up to the value I set (5e-5) over the first 20 epochs, and afterwards training should run at 0.1 times that value. But when I checked the learning rate during training, it was already 5e-5 from the very start.
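A minimal sketch for checking the learning rate during training, assuming plain keras 2.3.1; note that optimizer wrappers that apply a piecewise schedule as an internal multiplier may keep optimizer.lr at the base value, so what this prints is not necessarily the effective per-step rate:

    import keras.backend as K
    from keras.callbacks import Callback

    class LrLogger(Callback):
        """Print the optimizer's base learning rate at the start of each epoch."""
        def on_epoch_begin(self, epoch, logs=None):
            lr = K.get_value(self.model.optimizer.lr)
            print('epoch %d: optimizer.lr = %.2e' % (epoch, lr))

    # usage: train_model.fit_generator(..., callbacks=[LrLogger()])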