target-benchmark / target

TARGET is a benchmark for evaluating Table Retrieval for Generative Tasks such as Fact Verification and Text-to-SQL
https://target-benchmark.github.io
Apache License 2.0

Fact verification implementation #8

Closed by jixy2012 6 months ago

jixy2012 commented 6 months ago

Fact verification task completed! The available metrics are F1, accuracy, recall, and precision.

In the code right now, I defined the possible answers as true, false, or not enough information. However, the original dataset (at least TabFact) doesn't seem to have a not enough information option: each answer is either true or false.

I think we can still keep not enough information given our retrieval scenario. I'll just have to make some slight modifications to the code, because HF eval doesn't support non-binary answers for these metrics.
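One possible way to score three answer classes is to macro-average per-label precision, recall, and F1. The sketch below is only an illustration, not the code in this repository: it uses plain Python so that no library API is assumed, and the label strings, the `macro_scores` helper, and the sample data are all hypothetical.

```python
from collections import Counter

# Hypothetical label strings; the thread only names the three options informally.
LABELS = ("true", "false", "not enough information")


def macro_scores(predictions, references, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")

    # Per-label counts of true positives, false positives, false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for pred, ref in zip(predictions, references):
        if pred == ref:
            tp[ref] += 1
        else:
            fp[pred] += 1
            fn[ref] += 1

    def safe_div(num, den):
        # Treat an undefined ratio (zero denominator) as 0.0.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = safe_div(tp[label], tp[label] + fp[label])
        r = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(references),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Made-up example: gold answers are binary, one prediction abstains.
references = ["true", "false", "true", "false"]
predictions = ["true", "not enough information", "false", "false"]
scores = macro_scores(predictions, references)
print(scores)
```

Note that averaging over all three labels penalizes every not enough information prediction when the gold answers are only true or false, since that label can never be a true positive. Whether that is the desired behavior for the retrieval setting is a design choice left open here.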