Louis-udm / NER-BERT-CRF

accuracy to 0 when CRF_only=False #1

Closed. shawei3000 closed this issue 5 years ago.

shawei3000 commented 5 years ago

Hi, I borrowed your functions, as below, for training on the CoNLL-2003 data only. When crf_only=True, the F1 score is ~95%, but when I change it to crf_only=False, the accuracy drops to almost 0. I'm wondering if I am doing something wrong:

Code:

def create_model(bert_config, is_training, input_ids, input_mask, segment_ids,
                 labels, num_labels, use_one_hot_embeddings,
                 dropout_rate=1.0, lstm_size=1, cell='lstm', num_layers=1):
    # modeling, BLSTM_CRF and initializers are assumed to be imported from the
    # borrowed BERT / BiLSTM-CRF code.
    model = modeling.BertModel(
        config=bert_config,
        is_training=is_training,
        input_ids=input_ids,
        input_mask=input_mask,
        token_type_ids=segment_ids,
        use_one_hot_embeddings=use_one_hot_embeddings)

    # Token-level BERT output: [batch_size, max_seq_length, hidden_size].
    embedding = model.get_sequence_output()
    max_seq_length = embedding.shape[1].value

    # Real sequence lengths, counted from the non-zero (non-padding) input ids.
    used = tf.sign(tf.abs(input_ids))
    lengths = tf.reduce_sum(used, axis=1)

    blstm_crf = BLSTM_CRF(embedded_chars=embedding, hidden_unit=lstm_size,
                          cell_type=cell, num_layers=num_layers,
                          dropout_rate=dropout_rate, initializers=initializers,
                          num_labels=num_labels, seq_length=max_seq_length,
                          labels=labels, lengths=lengths, is_training=is_training)

    loss, logits, trans, pred_ids = blstm_crf.add_blstm_crf_layer(crf_only=False)

    with tf.variable_scope("loss"):
        # Token-level softmax cross-entropy on the same logits; note this
        # reassigns `loss`, discarding the loss returned by the CRF layer above.
        log_probs = tf.nn.log_softmax(logits, axis=-1)
        one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
        per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
        loss = tf.reduce_sum(per_example_loss)
        probabilities = tf.nn.softmax(logits, axis=-1)
        predict = tf.argmax(probabilities, axis=-1)
        return (loss, per_example_loss, logits, predict)
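
For reference, a minimal sketch of how the tail of create_model could look if the loss and decoded pred_ids that add_blstm_crf_layer already returns were used directly, instead of recomputing a softmax loss over the logits. This is only an illustration based on the interface shown above, not code from the repository:

    loss, logits, trans, pred_ids = blstm_crf.add_blstm_crf_layer(crf_only=False)
    # The BLSTM-CRF layer already returns its own loss and decoded label ids,
    # so they can be returned as-is without an extra softmax/argmax block.
    return (loss, logits, trans, pred_ids)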
Louis-udm commented 5 years ago

Sorry for the late reply; I think you have figured it out by now. I have updated a few things (a new hyperparameter, an uncased version, the F1 score, etc.), so you may want to try the new version again. @shawei3000