flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
https://flairnlp.github.io/flair/

[Bug]: inf-loss for SpanClassifier during training with a custom candidates-dictionary #3521

Open shigapov opened 3 months ago

shigapov commented 3 months ago

Describe the bug

I am training a joint NER and NEL model following the tutorial at https://flairnlp.github.io/flair/master/tutorial/tutorial-training/how-to-train-span-classifier.html. However, when SpanClassifier() is given a custom candidate generator, "candidates=CandidateGenerator(candidates=candidates)", built from a custom candidates dictionary, the SpanClassifier loss is always "inf" during training. I assume the cause is https://github.com/flairNLP/flair/blob/e17ab1234fcfed2b089d8ef02b99949d520382d2/flair/models/entity_linker_model.py#L230. If I replace that line with

masked_scores = -torch.ones(scores.size(), requires_grad=True, device=flair.device)

the SpanClassifier loss values during training are no longer inf.

Is there a better fix for this?
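To illustrate how a candidate mask could produce an infinite loss, here is a minimal sketch in plain Python. It is not flair's code. It assumes (as the replacement line above suggests, but the thread does not confirm) that scores of non-candidate labels are filled with -inf before the cross-entropy loss is computed. The helpers cross_entropy and mask_scores and all numbers are hypothetical.

```python
import math


def cross_entropy(scores, gold_index):
    """Softmax cross-entropy for a single example, via log-sum-exp."""
    # Skip -inf entries: they contribute exp(-inf) = 0 to the sum.
    finite = [s for s in scores if s != -math.inf]
    max_score = max(finite)
    log_sum_exp = max_score + math.log(sum(math.exp(s - max_score) for s in finite))
    # If the gold score is -inf, this evaluates to +inf.
    return log_sum_exp - scores[gold_index]


def mask_scores(scores, candidate_indices, fill_value):
    """Keep scores of candidate labels, overwrite all others with fill_value."""
    return [s if i in candidate_indices else fill_value for i, s in enumerate(scores)]


scores = [2.0, 0.5, 3.0]   # made-up scores for three labels
candidates = {0, 1}        # label 2 is not a candidate for this span

# Gold label is among the candidates: finite loss.
loss_gold_allowed = cross_entropy(mask_scores(scores, candidates, -math.inf), gold_index=0)

# Gold label was masked to -inf: the loss becomes inf.
loss_gold_masked = cross_entropy(mask_scores(scores, candidates, -math.inf), gold_index=2)

# Same case with a finite fill value of -1.0: the loss stays finite.
loss_finite_fill = cross_entropy(mask_scores(scores, candidates, -1.0), gold_index=2)

print(loss_gold_allowed, loss_gold_masked, loss_finite_fill)
```

Under these assumptions, a single span whose gold label falls outside its candidate set is enough to make the averaged batch loss inf. A finite fill value such as -1 keeps the loss finite, but it no longer strictly excludes non-candidate labels, since a masked label can still receive the highest score.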

To Reproduce

from flair.data import Sentence
from flair.datasets import ColumnCorpus
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger, SpanClassifier
from flair.trainers import ModelTrainer
from flair.models.entity_linker_model import CandidateGenerator
from flair.nn import PrototypicalDecoder
from flair.nn.multitask import make_multitask_model_and_corpus

# Define the columns in the dataset
columns_ner = {0: 'text', 1: 'ner'}
columns_nel = {0: 'text', 2: 'nel'}

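# Create the corpora
# (training_data_folder is not defined in this report; it must point to the
# folder that contains train.txt, dev.txt and test.txt)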
corpus_ner = ColumnCorpus(training_data_folder, columns_ner, name='ner',
                      train_file='train.txt',
                      test_file='test.txt',
                      dev_file='dev.txt')
corpus_nel = ColumnCorpus(training_data_folder, columns_nel, name='nel',
                      train_file='train.txt',
                      test_file='test.txt',
                      dev_file='dev.txt')

corpus_ner.obtain_statistics()
corpus_nel.obtain_statistics()

# Create the label dictionary
ner_label_dict = corpus_ner.make_label_dictionary("ner", add_unk=False)
nel_label_dict = corpus_nel.make_label_dictionary("nel", add_unk=True)

shared_embeddings = TransformerWordEmbeddings("dbmdz/bert-tiny-historic-multilingual-cased", fine_tune=True, layers='-1', use_context=True)

ner_model = SequenceTagger(
    embeddings=shared_embeddings,
    tag_dictionary=ner_label_dict,
    tag_type="ner",
    use_rnn=False,
    use_crf=False,
    reproject_embeddings=False,
)

nel_model = SpanClassifier(
    embeddings=shared_embeddings,
    label_dictionary=nel_label_dict,
    label_type="nel",
    span_label_type="ner",
    decoder=PrototypicalDecoder(
        num_prototypes=len(nel_label_dict),
        embeddings_size=shared_embeddings.embedding_length * 2, # we use "first_last" encoding for spans
        distance_function="dot_product",
    ),
    candidates=CandidateGenerator({'Amerika':['Q828']}),
#    candidates=CandidateGenerator(candidates=candidates),
)

# -- Define mapping (which tagger should train on which model) -- #
multitask_model, multicorpus = make_multitask_model_and_corpus(
    [
        (ner_model, corpus_ner),
        (nel_model, corpus_nel),
    ]
)

# Initialize trainer
trainer = ModelTrainer(multitask_model, multicorpus)

# Train the model
trainer.fine_tune("resources/taggers/bert-tiny-german-ra",
                  learning_rate=5e-4,
                  mini_batch_size=32,
                  max_epochs=1)
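The reproduction uses a candidates dictionary with a single entry, so most annotated spans will have no candidate matching their gold label. As a rough diagnostic, the sketch below counts such cases. It is plain Python and does not call flair: uncovered_mentions is a hypothetical helper, the (mention, gold ID) pairs are made up for illustration, and exact string lookup is an assumption that may differ from how CandidateGenerator matches mentions.

```python
def uncovered_mentions(candidates, annotated_mentions):
    """Return (mention, gold_id) pairs whose gold ID is not among the mention's candidates."""
    missing = []
    for mention, gold_id in annotated_mentions:
        # Mentions absent from the dictionary are treated as having no candidates.
        if gold_id not in candidates.get(mention, []):
            missing.append((mention, gold_id))
    return missing


# Candidates dictionary from the reproduction above.
candidates = {"Amerika": ["Q828"]}

# Made-up gold annotations; in practice these would be read from train.txt.
annotated_mentions = [("Amerika", "Q828"), ("Europa", "Q46"), ("Amerika", "Q30")]

missing = uncovered_mentions(candidates, annotated_mentions)
print(f"{len(missing)} of {len(annotated_mentions)} annotations are not covered: {missing}")
```

If many annotations turn up here, the candidate mask and the gold labels disagree for those spans, which may be worth ruling out before changing the masking line itself.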

Expected behavior

2024-08-06 16:21:10,682 Task_1 - SpanClassifier - loss: 7.897546768188477 - f1-score (micro avg)  0.0008
2024-08-06 16:21:10,683 DEV : loss 4.1483540534973145 - f1-score (micro avg)  0.2253

Logs and Stack traces

2024-08-06 16:18:22,411 Task_1 - SpanClassifier - loss: inf - f1-score (micro avg)  0.0008
2024-08-06 16:18:22,411 DEV : loss inf - f1-score (micro avg)  0.2268

Environment

Versions:

Flair: 0.14.0
PyTorch: 2.3.1+cu121
Transformers: 4.42.4
GPU: True

shigapov commented 3 months ago

One more problem: the F1-score for SpanClassifier does not change during training.