Miaoranmmm closed 6 months ago
Thanks for catching this. 66.2 should be replaced with 56.3 in Table 3.
Besides this, we found that the latest GPT-3.5-Turbo version achieves better performance here than the previous version. If it is relevant to your work, you might consider a quick evaluation on the latest version of GPT-3.5-Turbo. Thanks!
Hi, I am confused about the difference between the results in Tables 3 and 4 of the paper. Why is there such a large performance gap for ChatGPT-ZS on AGGREFACT-CNN-FTSOTA (66.2 vs. 56.3)?