Liyan06 / AggreFact

Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors (ACL 2023)

Question regarding the difference between Table 3 and 4 in the paper #14

Miaoranmmm closed this issue 6 months ago

Miaoranmmm commented 6 months ago

Hi, I am confused about the difference between the results in Tables 3 and 4 of the paper. Why is there such a large performance gap for ChatGPT-ZS on AGGREFACT-CNN-FTSOTA (66.2 vs. 56.3)?

Liyan06 commented 6 months ago

Thanks for catching this. 66.2 should be replaced with 56.3 in Table 3.

Separately, we found that the latest GPT-3.5-Turbo version performs better here than the previous version did. If it is relevant to your work, you might consider a quick evaluation with the latest GPT-3.5-Turbo version. Thanks!