brightmart / roberta_zh

RoBERTa pre-trained models for Chinese: RoBERTa for Chinese

Some parameters cannot be reloaded #65

Open · WBY1993 opened this issue 4 years ago

WBY1993 commented 4 years ago

"""Get loss and log probs for the masked LM.""" input_tensor = gather_indexes(input_tensor, positions)

with tf.variable_scope("cls/predictions"):

We apply one more non-linear transformation before the output layer.

# This matrix is not used after pre-training.
with tf.variable_scope("transform"):
  input_tensor = tf.layers.dense(
      input_tensor,
      units=bert_config.hidden_size,
      activation=modeling.get_activation(bert_config.hidden_act),
      kernel_initializer=modeling.create_initializer(
          bert_config.initializer_range))
  input_tensor = modeling.layer_norm(input_tensor)

# The output weights are the same as the input embeddings, but there is
# an output-only bias for each token.
output_bias = tf.get_variable(
    "output_bias",
    shape=[bert_config.vocab_size],
    initializer=tf.zeros_initializer())
logits = tf.matmul(input_tensor, output_weights, transpose_b=True)
logits = tf.nn.bias_add(logits, output_bias)

Are the parameters from this part not saved in the 12-layer RoBERTa checkpoint? They are missing when I try to reload them.
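One way to narrow this down is to list the variables actually stored in the checkpoint and see whether anything under the `cls/predictions` scope (the transform dense/LayerNorm weights and `output_bias` created by the code above) was exported. A minimal sketch; the checkpoint path below is a placeholder, not the repo's actual filename:

```python
import tensorflow as tf  # TF 1.x, matching the BERT/roberta_zh pre-training code

# Placeholder path: point this at the unpacked 12-layer checkpoint prefix.
ckpt_path = "./roberta_zh_l12/bert_model.ckpt"

# List every variable stored in the checkpoint and keep only the MLM-head ones.
all_vars = tf.train.list_variables(ckpt_path)
mlm_vars = [(name, shape) for name, shape in all_vars
            if name.startswith("cls/predictions")]

if not mlm_vars:
    print("No cls/predictions/* variables found: the transform dense/LayerNorm "
          "weights and output_bias are not in this checkpoint.")
else:
    for name, shape in mlm_vars:
        print(name, shape)
```

For what it's worth, in the upstream BERT code the restore path goes through `modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)`, which only maps variables that also exist in the checkpoint; any graph variable missing from the checkpoint is silently left at its fresh random initialization rather than raising an error, which would match what you see on reload.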