huggingface / lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
MIT License

Class implementations of faithfulness and extractiveness metrics #323

Closed chuandudx closed 2 months ago

chuandudx commented 2 months ago

This PR refactors the faithfulness and extractiveness functions into separate classes, addressing the existing TODOs (extractiveness todo and faithfulness todo).

This change was motivated by the need to easily create new instances of these two metrics with a custom column name as the input to be evaluated against. Currently, the input is hardcoded as the "text" column, which may not exist, or may not hold the input, in other datasets.
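To illustrate the motivation, here is a minimal sketch of a metric class whose input column is set in the constructor. It is not the lighteval implementation: `CoverageMetric`, `input_column`, the `row` dictionary, and the token-overlap score are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class CoverageMetric:
    """Hypothetical stand-in for an extractiveness-style metric (not lighteval's class)."""

    # Name of the dataset field holding the source document (assumed parameter name).
    input_column: str = "text"

    def compute(self, prediction: str, row: dict) -> float:
        # Look up the source under the configured column instead of a hardcoded "text".
        source_tokens = set(row[self.input_column].lower().split())
        pred_tokens = prediction.lower().split()
        if not pred_tokens:
            return 0.0
        # Fraction of predicted tokens that also appear in the source.
        covered = sum(token in source_tokens for token in pred_tokens)
        return covered / len(pred_tokens)


# One instance per dataset layout: the default and a custom column.
default_metric = CoverageMetric()
article_metric = CoverageMetric(input_column="article")

row = {"article": "the cat sat on the mat"}
score = article_metric.compute("the cat slept", row)
```

With the column name hardcoded, the same row would fail with a `KeyError`, because it has no "text" field. That is the situation described above.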

I also have a question: would it be preferable to pass the models (e.g. summac, bert) into the constructor? I followed the existing structure of the BertScore class, where the bert instance is not passed in and a comment says "# We only initialize on first compute" (reference). I used the same convention in the faithfulness implementation, but I would like to understand why this is preferred over passing the model directly into the constructor. Thanks in advance for the review and feedback :)
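For context, the "initialize on first compute" convention can be sketched as below. This is an illustrative pattern only, not the BertScore code: `LazyScorer`, `load_model`, and `fake_loader` are hypothetical names. One commonly cited benefit of the pattern is that constructing a metric object stays cheap and the model is loaded only if the metric is actually used; the thread does not say whether that is the reason behind the existing convention.

```python
from typing import Callable, Optional

ScoreFn = Callable[[str, str], float]


class LazyScorer:
    """Hypothetical metric that defers model loading until the first compute call."""

    def __init__(self, load_model: Callable[[], ScoreFn]):
        self._load_model = load_model
        self._model: Optional[ScoreFn] = None  # nothing is loaded at construction time

    def compute(self, source: str, prediction: str) -> float:
        if self._model is None:
            # We only initialize on first compute.
            self._model = self._load_model()
        return self._model(source, prediction)


load_calls = []


def fake_loader() -> ScoreFn:
    # Stand-in for an expensive model load (for example, fetching weights).
    load_calls.append(1)
    return lambda source, prediction: float(prediction in source)


scorer = LazyScorer(fake_loader)
calls_after_init = len(load_calls)  # the loader has not run yet
first = scorer.compute("the cat sat", "cat")
second = scorer.compute("the cat sat", "dog")
```

Passing a ready-made model into the constructor would be the eager alternative: simpler to test with a mock, at the cost of loading the model even when the metric is never computed.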

clefourrier commented 2 months ago

You'll need to fix the style :)

chuandudx commented 2 months ago

Thanks for the review :) I applied the style fixes and changed the Extractiveness stats metric initialization to match Faithfulness and BertScore.