Closed Wenshansilvia closed 7 months ago
During evaluation of all metrics, detailed log information should be written to a local file using a logger. Two values should be returned by the evaluate function:
instance-level score of each metric
>>> import rageval as rl
>>> ds = load_dataset("testset_name")
>>> result, instance_level_result = evaluate(
...     ds.select(range(3)),
...     metrics=[ContextRecall(), AnswerGroundedness()],
...     models=[cr_model, ag_model],
... )
>>> result
Dataset({
    features: ['context_recall', 'answer_groundedness'],
    num_rows: 2
})
>>> instance_level_result
Dataset({
    features: ['questions', 'gt_answers', 'answers', 'contexts', 'context_recall', 'answer_groundedness'],
    num_rows: 9
})
Maybe a dict is enough for result, haha...
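To make the proposal concrete, here is a minimal sketch of the return shape discussed above: a plain dict for the aggregate result, a list of per-instance rows for the instance-level scores, and per-score details written to a local log file. It does not use the rageval API. The names `make_file_logger`, the `evaluate` signature, and the toy metrics `exact_match` and `has_context` are all hypothetical and only stand in for real metrics such as ContextRecall.

```python
import logging
import os
import tempfile
from statistics import mean


def make_file_logger(path, name="evaluate_sketch"):
    """Return a logger that writes to a local file (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep detail logs out of the console
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def evaluate(rows, metrics, logger):
    """Score every row with every metric.

    rows:    list of dicts, one per test instance
    metrics: dict mapping a metric name to a callable(row) -> float
    Returns (result, instance_level_result): a dict of mean scores per
    metric, and a list of rows extended with one score column per metric.
    """
    instance_level_result = []
    for i, row in enumerate(rows):
        scored = dict(row)  # copy so the input rows stay untouched
        for name, metric in metrics.items():
            scored[name] = metric(row)
            logger.info("row=%d metric=%s score=%.4f", i, name, scored[name])
        instance_level_result.append(scored)
    if not instance_level_result:
        return {}, []
    result = {
        name: mean(r[name] for r in instance_level_result) for name in metrics
    }
    return result, instance_level_result


# Toy data and toy metrics, assumptions for illustration only.
rows = [
    {"questions": "q1", "gt_answers": "a", "answers": "a", "contexts": ["c"]},
    {"questions": "q2", "gt_answers": "b", "answers": "x", "contexts": []},
]
metrics = {
    "exact_match": lambda r: float(r["answers"] == r["gt_answers"]),
    "has_context": lambda r: float(bool(r["contexts"])),
}
log_path = os.path.join(tempfile.mkdtemp(), "evaluate.log")
logger = make_file_logger(log_path)
result, instance_level_result = evaluate(rows, metrics, logger)
```

With this shape, `result` stays small and easy to print, while `instance_level_result` keeps the original columns next to the scores and could be converted to a Dataset later if that turns out to be more convenient.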
dict