xv44586 / toolkit4nlp

transformers implement (architecture, task example, serving and more)
Apache License 2.0

InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?] [[{{node mlm_loss_sample_weights_3}}]] [[Mean_3/_7863]] (1) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?] [[{{node mlm_loss_sample_weights_3}}]] #7

Open uRENu opened 3 years ago

uRENu commented 3 years ago

Hello, the environment I am running the code in is: tf-gpu 1.15, keras 2.3.1, toolkit4nlp 0.5.0. I keep getting an error when the loss is computed. Is this a version issue or a problem with the loss computation?

uRENu commented 3 years ago

I rechecked and found that the problem was the format of the input data. But then a new problem appeared with train_model = Model(bert.model.inputs + [token_ids, is_masked], [mlm_loss, mlm_acc]):

ValueError: Output tensors to a Model must be the output of a Keras Layer (thus holding past layer metadata). Found: <function mlm_loss at 0x7fa2f519e8c0>
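This ValueError typically means a plain Python function was passed as a model output instead of a Keras tensor. A minimal sketch of the pattern that avoids it, assuming proba, token_ids and is_masked are tensors already defined in the pretraining script (the name mlm_loss_tensor is only for illustration):

    # Wrapping the loss computation in a Lambda layer turns it into a Keras
    # layer output, which is what Model() expects as an output tensor.
    mlm_loss_tensor = Lambda(mlm_loss, name='mlm_loss')([token_ids, proba, is_masked])
    train_model = Model(bert.model.inputs + [token_ids, is_masked], [mlm_loss_tensor])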

uRENu commented 3 years ago

In the end I solved it by changing the output of the data generator:

    class data_generator(DataGenerator):
        """Data generator"""
        def __iter__(self, shuffle=False):
            batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked = [], [], [], []
            y = []
            for is_end, (text1, text2) in self.get_sample(shuffle):
                token_ids, segment_ids, output_ids = sample_convert(text1, text2)
                is_masked = [0 if i == 0 else 1 for i in output_ids]
                batch_token_ids.append(token_ids)
                batch_segment_ids.append(segment_ids)
                batch_output_ids.append(output_ids)
                batch_is_masked.append(is_masked)
                y.append([0.])  # dummy target; the real loss is computed inside the model

                if is_end or len(batch_token_ids) == self.batch_size:
                    batch_token_ids = pad_sequences(batch_token_ids, maxlen=maxlen)
                    batch_segment_ids = pad_sequences(batch_segment_ids, maxlen=maxlen)
                    batch_output_ids = pad_sequences(batch_output_ids, maxlen=maxlen)
                    batch_is_masked = pad_sequences(batch_is_masked, maxlen=maxlen)

                    yield [batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked], np.array(y)
                    batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked = [], [], [], []
                    y = []

As long as the np.array(y) part is not None, the error no longer occurs. My understanding is that any value can be passed here, because mlm_loss always returns the computed loss:

    def mlm_loss(inputs):
        """Loss computation; it needs to be wrapped as a layer"""
        y_true, y_pred, mask = inputs
        y_true = K.cast(y_true, K.floatx())
        mask = K.cast(mask, K.floatx())
        loss = K.sparse_categorical_crossentropy(
            y_true, y_pred, from_logits=True
        )
        loss = K.sum(loss * mask) / (K.sum(mask) + K.epsilon())
        return loss

    mlm_loss = Lambda(mlm_loss, output_shape=(None,), name='mlm_loss')([token_ids, proba, is_masked])

    train_model = Model(bert.model.inputs + [token_ids, is_masked], [mlm_loss])

    loss = {
        'mlm_loss': lambda y_true, y_pred: y_pred,  # just return y_pred; y_pred is the mlm_loss
    }
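A minimal compile sketch using this dict loss; the optimizer and learning rate here are assumptions, not values from the original script:

    from keras.optimizers import Adam

    # Keras matches the key 'mlm_loss' to the output of the Lambda layer named
    # 'mlm_loss'; the lambda just passes y_pred through, which is already the
    # masked-LM loss computed inside the model.
    train_model.compile(loss=loss, optimizer=Adam(5e-5))

With this setup, the targets yielded by the data generator only serve to satisfy Keras's placeholder feeding, which is why the dummy y above works.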

xv44586 commented 3 years ago

Which script are you running? What exactly is the error message? With no context like this it's hard for me to guess.

uRENu commented 3 years ago

> Which script are you running? What exactly is the error message? With no context like this it's hard for me to guess.

Sorry, I wrote my own pretraining code with data processing on top of your pretraining.py. Because the data_generator did not yield preset values for mlm_loss and mlm_acc, and just output None for that part, I ran into this error:

InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?] [[{{node mlm_loss_sample_weights_3}}]] [[Mean_3/_7863]] (1) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?] [[{{node mlm_loss_sample_weights_3}}]]

Later I changed the None here to a fixed value (0.0) and the error went away.
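Keras creates a target (and sample-weight) placeholder for every named model output, which is exactly the 'mlm_loss_sample_weights_3' tensor named in the error, so the generator has to feed something there even though the dict loss ignores it. A minimal sketch of the yield line inside __iter__, using the batch lists from the generator above:

    import numpy as np

    # Any concrete float array of batch size works as the dummy target; it is
    # never used, because the dict loss simply returns y_pred (the mlm_loss).
    dummy_y = np.zeros((len(batch_token_ids), 1), dtype='float32')
    yield [batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked], dummy_y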

uRENu commented 3 years ago

Another problem I ran into is with toolkit4nlp.optimizers: warmup does not increase the learning rate according to the values I set. I set

    lr_schedule = {int(len(train_generator) * epochs * 0.1): 1.0, len(train_generator) * epochs: 0.1}

and my epochs is 200. My understanding is that the learning rate should ramp up to the value I set (5e-5) over the first 20 epochs, and afterwards training should run at 0.1 times that value. But when I checked the learning rate during training, it was already 5e-5 from the very start.
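A minimal sketch for checking the learning rate during training, assuming plain keras 2.3.1; note that optimizer wrappers that apply a piecewise schedule as an internal multiplier may keep optimizer.lr at the base value, so what this prints is not necessarily the effective per-step rate:

    import keras.backend as K
    from keras.callbacks import Callback

    class LrLogger(Callback):
        """Print the optimizer's base learning rate at the start of each epoch."""
        def on_epoch_begin(self, epoch, logs=None):
            lr = K.get_value(self.model.optimizer.lr)
            print('epoch %d: optimizer.lr = %.2e' % (epoch, lr))

    # usage: train_model.fit_generator(..., callbacks=[LrLogger()])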