shon-otmazgin / fastcoref

MIT License

Not understanding trainer.evaluate() results #40

Closed AtillaKaanAlkan closed 1 year ago

AtillaKaanAlkan commented 1 year ago

Hi @shon-otmazgin!

First of all, thanks for this package!

I followed the instructions in the "Distil your own coref model" section to produce annotations for my corpus.

Here are the steps I followed:

from fastcoref import LingMessCoref

model = LingMessCoref(device='cuda:0')
preds = model.predict(texts=texts, output_file='LingMess_annotations.jsonlines')

This generated a jsonlines file with the following keys for each document: ['text', 'clusters', 'clusters_strings']
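A quick way to confirm the file has that shape is to load it and check the keys of each record. This is a minimal sketch using only the standard library; `read_jsonlines` and `missing_keys` are hypothetical helper names, not part of fastcoref, and the only assumption about the file is that it holds one JSON object per line with the three keys listed above.

```python
import json

# Keys reported in this thread for each document in the output file.
EXPECTED_KEYS = {"text", "clusters", "clusters_strings"}


def read_jsonlines(path):
    """Return a list with one dict per non-empty line of a jsonlines file."""
    docs = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                docs.append(json.loads(line))
    return docs


def missing_keys(doc):
    """Return the expected keys that are absent from one document record."""
    return EXPECTED_KEYS - set(doc)
```

Calling `read_jsonlines('LingMess_annotations.jsonlines')` and then `missing_keys(doc)` on each record should return an empty set for every document if the file matches the description above.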

Second, I gave the trainer its own predictions (the output file generated in the first step) for evaluation:

from fastcoref import TrainingArgs, CorefTrainer

args = TrainingArgs(
    output_dir='test-trainer',
    overwrite_output_dir=True,
    model_name_or_path='distilroberta-base',
    device='cuda:2',
    epochs=129,
    logging_steps=100,
    eval_steps=100
) 

trainer = CorefTrainer(
    args=args,
    train_file='LingMess_annotations.jsonlines', 
    #dev_file='path-to-dev-file',    # optional
    test_file='LingMess_annotations.jsonlines'
)

trainer.evaluate(test=True)

I have not done any fine-tuning so far (I did not run trainer.train()). I only gave the model its own predictions, so I expected all the metrics (precision, recall and F1 score) to be 100%, because I am comparing two identical sets of predictions. However, I obtained a score of 0 for every metric. I do not understand why. How is it possible to get a score of zero? Am I missing something?

Thanks a lot for your help!

Best, Atilla

shon-otmazgin commented 1 year ago

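You annotated the data using LingMessCoref. CorefTrainer then creates a different model, an FCorefModel (this is the whole idea of distillation), which is not (yet) optimized for the coreference task. Evaluating before training therefore compares an untrained model's output against the LingMess annotations, not the annotations against themselves.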
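To see why the scores can be exactly 0 rather than 100, it helps to look at how a cluster-overlap metric behaves. The sketch below is an illustration only and is not fastcoref's own evaluation code: it implements a plain B-cubed score over clusters of (start, end) spans. The function name `b_cubed` and the toy spans are made up for the example, and the two "untrained" outputs (no clusters at all, or clusters over unrelated spans) are assumptions about what an untrained model might produce.

```python
def b_cubed(gold, pred):
    """B-cubed precision, recall and F1 for two lists of clusters.

    Each cluster is a list of (start, end) spans.
    """
    def mention_to_cluster(clusters):
        # Map every mention to the set of mentions in its cluster.
        return {
            tuple(m): frozenset(tuple(x) for x in cluster)
            for cluster in clusters
            for m in cluster
        }

    def average_overlap(source, target):
        # For each mention in source, the fraction of its cluster that is
        # also in the cluster target assigns to the same mention.
        if not source:
            return 0.0
        total = sum(
            len(cluster & target.get(mention, frozenset())) / len(cluster)
            for mention, cluster in source.items()
        )
        return total / len(source)

    gold_map = mention_to_cluster(gold)
    pred_map = mention_to_cluster(pred)
    precision = average_overlap(pred_map, gold_map)
    recall = average_overlap(gold_map, pred_map)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy "teacher" annotations: two clusters of character spans.
teacher = [[(0, 3), (9, 12)], [(20, 25), (30, 33)]]

# Annotations compared with themselves: a perfect score.
print(b_cubed(teacher, teacher))

# Assumed untrained outputs: nothing predicted, or unrelated spans.
print(b_cubed(teacher, []))
print(b_cubed(teacher, [[(1, 2), (4, 5)]]))
```

Only the first comparison corresponds to what the question expected; the evaluation that actually ran is closer to the last two, which is why training the student model first changes the result.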

AtillaKaanAlkan commented 1 year ago

Thanks a lot for your answer; it helped me understand the point I was missing! I get the expected results now, so I can close the issue :)