Will the full code for evaluation available?

bangawayoo commented 4 months ago

Hi! Thank you for the impressive work and for releasing the code!

I noticed that I had to complete evaluation/triviaqa_nonambigqa_evaluation.py in order to reproduce the numbers for the paper. Will you make the full code (script) available in the future?

Thanks.

bangawayoo commented 2 months ago

@ayyyq @neubig @EthanC111 Hi, do you have any updates on this?

Thanks.

ayyyq commented 2 months ago

Hi, I sincerely apologize for the delayed response. Currently, the code requires you to complete a segment to generate responses using the OpenAI API. You can choose your preferred method, such as LangChain, or refer to this script. We have already provided all the necessary prompts and data processing code.

For the evaluation of free-form QA tasks, to reduce API costs, we have combined a rule-based approach (step 2) and ChatGPT evaluation (step 3) as described in this section.

Thank you for your feedback. We will consider refactoring the evaluation code within a month to make it more user-friendly. In the meantime, if you have any questions, please feel free to contact us. Thank you!

GAIR-NLP / alignment-for-honesty

Will the full code for evaluation available? #1