Open shigapov opened 3 months ago
Describe the bug
I am training a joint NER and NEL model following the tutorial https://flairnlp.github.io/flair/master/tutorial/tutorial-training/how-to-train-span-classifier.html. However, when SpanClassifier() is given a custom "candidates=CandidateGenerator(candidates=candidates)" built from a custom candidates dictionary, the SpanClassifier loss is always "inf" during training. I assume that it comes from https://github.com/flairNLP/flair/blob/e17ab1234fcfed2b089d8ef02b99949d520382d2/flair/models/entity_linker_model.py#L230. If I replace that line with
masked_scores = -torch.ones(scores.size(), requires_grad=True, device=flair.device)
the SpanClassifier loss values during training are no longer inf.
Is there a better fix for that?
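The replacement line above suggests that the linked line fills the scores of non-candidate labels with negative infinity. That is an inference from this report, not something verified against the flair source. Under that assumption, the following plain-Python sketch (a stand-in for torch's cross-entropy, with made-up scores and label indices) shows one way an inf loss can arise: whenever the gold label of a span is itself masked out.

```python
import math

NEG_INF = float("-inf")


def cross_entropy(scores, gold):
    """Softmax cross-entropy for one example: logsumexp(scores) - scores[gold]."""
    top = max(scores)
    log_sum_exp = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_sum_exp - scores[gold]


def mask_scores(scores, allowed, fill):
    """Keep the scores at allowed indices, replace all others with `fill`."""
    return [s if i in allowed else fill for i, s in enumerate(scores)]


# Toy values (hypothetical): four labels, of which 0 and 3 are candidates.
scores = [2.0, 0.5, -1.0, 0.3]
candidate_ids = {0, 3}
gold_inside, gold_outside = 3, 1

hard = mask_scores(scores, candidate_ids, NEG_INF)
loss_inside = cross_entropy(hard, gold_inside)    # gold is a candidate
loss_outside = cross_entropy(hard, gold_outside)  # gold was masked out

# Fill value from the workaround in this report, and a large finite one.
loss_minus_one = cross_entropy(mask_scores(scores, candidate_ids, -1.0), gold_outside)
loss_large_neg = cross_entropy(mask_scores(scores, candidate_ids, -1e4), gold_outside)

# A single infinite term makes the averaged batch loss infinite as well.
batch_mean = (loss_inside + loss_outside) / 2

print(loss_inside, loss_outside, loss_minus_one, loss_large_neg, batch_mean)
```

Note that a fill value of -1 keeps the loss finite but no longer excludes non-candidates, since a real candidate score can fall below -1. A large finite negative fill keeps the exclusion while avoiding inf, at the cost of a very large loss for such spans. Whether either is the right fix inside flair is an open question here.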
To Reproduce
```python
from flair.data import Sentence
from flair.datasets import ColumnCorpus
from flair.embeddings import TransformerWordEmbeddings  # import missing from the original snippet
from flair.models import SequenceTagger, SpanClassifier
from flair.trainers import ModelTrainer
from flair.models.entity_linker_model import CandidateGenerator
from flair.nn import PrototypicalDecoder
from flair.nn.multitask import make_multitask_model_and_corpus

# Placeholder: the original report does not define this path
training_data_folder = "path/to/data"

# Define the columns in the dataset
columns_ner = {0: 'text', 1: 'ner'}
columns_nel = {0: 'text', 2: 'nel'}

# Create the Corpus
corpus_ner = ColumnCorpus(training_data_folder, columns_ner, name='ner',
                          train_file='train.txt', test_file='test.txt', dev_file='dev.txt')
corpus_nel = ColumnCorpus(training_data_folder, columns_nel, name='nel',
                          train_file='train.txt', test_file='test.txt', dev_file='dev.txt')

corpus_ner.obtain_statistics()
corpus_nel.obtain_statistics()

# Create the label dictionary
ner_label_dict = corpus_ner.make_label_dictionary("ner", add_unk=False)
nel_label_dict = corpus_nel.make_label_dictionary("nel", add_unk=True)

shared_embeddings = TransformerWordEmbeddings("dbmdz/bert-tiny-historic-multilingual-cased",
                                              fine_tune=True, layers='-1', use_context=True)

ner_model = SequenceTagger(
    embeddings=shared_embeddings,
    tag_dictionary=ner_label_dict,
    tag_type="ner",
    use_rnn=False,
    use_crf=False,
    reproject_embeddings=False,
)

nel_model = SpanClassifier(
    embeddings=shared_embeddings,
    label_dictionary=nel_label_dict,
    label_type="nel",
    span_label_type="ner",
    decoder=PrototypicalDecoder(
        num_prototypes=len(nel_label_dict),
        embeddings_size=shared_embeddings.embedding_length * 2,  # we use "first_last" encoding for spans
        distance_function="dot_product",
    ),
    candidates=CandidateGenerator({'Amerika': ['Q828']}),
    # candidates=CandidateGenerator(candidates=candidates),
)

# -- Define mapping (which tagger should train on which model) -- #
multitask_model, multicorpus = make_multitask_model_and_corpus(
    [
        (ner_model, corpus_ner),
        (nel_model, corpus_nel),
    ]
)

# Initialize trainer
trainer = ModelTrainer(multitask_model, multicorpus)

# Train the model
trainer.fine_tune("resources/taggers/bert-tiny-german-ra",
                  learning_rate=5e-4, mini_batch_size=32, max_epochs=1)
```
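Since the problem only appears with a custom candidates dictionary, it may help to check how many annotated mentions have a gold label that the dictionary does not offer. The sketch below is a diagnostic in plain Python only: `uncovered_mentions` is a hypothetical helper, the mention list is toy data, and extracting (mention text, gold id) pairs from the NEL corpus is left out.

```python
def uncovered_mentions(mentions, candidates):
    """Return (mention text, gold id) pairs whose gold id is not a listed candidate."""
    missing = []
    for text, gold_id in mentions:
        if gold_id not in candidates.get(text, []):
            missing.append((text, gold_id))
    return missing


# Same shape as the dictionary passed to CandidateGenerator in the snippet above.
candidates = {"Amerika": ["Q828"]}

# Hypothetical (mention text, gold id) pairs standing in for the NEL training data.
mentions = [("Amerika", "Q828"), ("Amerika", "Q30"), ("Europa", "Q46")]

missing = uncovered_mentions(mentions, candidates)
print(f"{len(missing)} of {len(mentions)} mentions have a gold id outside their candidates")
```

If many mentions show up here, the gold labels of those spans depend entirely on how the model treats labels outside the candidate set.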
Expected behavior
2024-08-06 16:21:10,682 Task_1 - SpanClassifier - loss: 7.897546768188477 - f1-score (micro avg) 0.0008
2024-08-06 16:21:10,683 DEV : loss 4.1483540534973145 - f1-score (micro avg) 0.2253
Logs and Stack traces
2024-08-06 16:18:22,411 Task_1 - SpanClassifier - loss: inf - f1-score (micro avg) 0.0008
2024-08-06 16:18:22,411 DEV : loss inf - f1-score (micro avg) 0.2268
One more problem here: the F1-score for SpanClassifier does not change during training.
Screenshots
No response
Additional Context
No response
Environment
Versions:
Flair: 0.14.0
PyTorch: 2.3.1+cu121
Transformers: 4.42.4
GPU: True