artidoro / frank

FRANK: Factuality Evaluation Benchmark
MIT License

Evaluate.py for benchmark_data.json #1

Closed: dptam closed this issue 3 years ago

dptam commented 3 years ago

Hi,

I saw that new evaluation metrics should be submitted as a benchmark_data.json file with an additional score field. I was also looking at evaluate.py to see how to sanity-check an evaluation metric first, and noticed that it reads the scores from data/articles-*. I was wondering whether evaluate.py could read in the benchmark_data.json file to compute scores?
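For context, here is a minimal sketch of how I'm preparing the submission file: load benchmark_data.json, attach a "score" field computed by the metric, and write it back out. `my_metric` is a placeholder for the actual metric, and the `article`/`summary` key names are just assumptions about the entry schema, not necessarily the repo's exact field names.

```python
import json

def my_metric(article, summary):
    # Placeholder: return a factuality score for (article, summary).
    return 0.0

# Load the benchmark data distributed with the repo.
with open("benchmark_data.json") as f:
    data = json.load(f)

# Attach the new "score" field to every entry.
# Assumption: each entry carries the source article and the model summary;
# the exact key names in benchmark_data.json may differ.
for entry in data:
    entry["score"] = my_metric(entry.get("article", ""), entry.get("summary", ""))

# Write the annotated file back out for evaluation.
with open("benchmark_data.json", "w") as f:
    json.dump(data, f, indent=2)
```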

Thanks, Derek

artidoro commented 3 years ago

I am planning to add that functionality today. I'll update you when that's done!

dptam commented 3 years ago

Sounds good. Thanks!

artidoro commented 3 years ago

Hey @dptam! I just refactored the code so it should be easier to evaluate your metric outputs. Let me know if you have any other suggestions.

dptam commented 3 years ago

Thanks. It works now!