yuh-zha / AlignScore

ACL2023 - AlignScore, a metric for factual consistency evaluation.
https://arxiv.org/pdf/2305.16739.pdf
MIT License

NLI evaluation mode doesn't provide 3 classes #8

Open jacopobandonib opened 10 months ago

jacopobandonib commented 10 months ago
from alignscore import AlignScore
scorer = AlignScore(
    model='roberta-large',
    batch_size=1,
    ckpt_path="models/AlignScore/checkpoint/AlignScore-large.ckpt",
    device="cpu",
    evaluation_mode='nli',
)

print(scorer.score(contexts=['hello world.'], claims=['hello world.']))
Evaluating: 100%|██████████| 1/1 [00:01<00:00,  1.96s/it]
[0.9887175559997559]

While loading the model checkpoint, I also got a warning about uninitialized model weights, as described in the previous issue (https://github.com/yuh-zha/AlignScore/issues/7).

cs329yangzhong commented 8 months ago

Judging from their codebase, the inference function for NLI mode returns only the score for the "Aligned" class (the first of the three classes). To get all three class scores, you would need to modify the inference code slightly.
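To illustrate the kind of change meant here, below is a minimal standalone sketch, not the actual AlignScore code. It assumes the model produces one row of three logits per context/claim pair. The helper names (`aligned_only`, `all_classes`), the example logits, and the names and order of the second and third labels are all made up for illustration; only "Aligned" being the first class comes from the comment above.

```python
import math

# Assumption: only "aligned" being index 0 is taken from the thread.
# The other two label names and their order are hypothetical.
LABELS = ("aligned", "contradict", "neutral")


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aligned_only(batch_logits):
    """Mimics the reported behaviour: keep only the first class probability."""
    return [softmax(row)[0] for row in batch_logits]


def all_classes(batch_logits):
    """Returns the full 3-way distribution for each row instead of slicing it."""
    return [dict(zip(LABELS, softmax(row))) for row in batch_logits]


# Made-up logits for a single context/claim pair.
example_logits = [[4.0, -1.0, -0.5]]

print(aligned_only(example_logits))
print(all_classes(example_logits))
```

In the real codebase the equivalent change would presumably be to return the whole probability tensor from the NLI inference path rather than only its first column, but the exact function and tensor names should be checked in the repository.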