Question Regarding Evaluation Methodology in Books Dataset

Hello again, I hope this message finds you well.

In your paper, you describe the evaluation methodology for the Books dataset as checking for a match between the elements (author name, release year) in the answer. However, in the code, the answer_args for the Books dataset also include the publisher. Could you please explain why the publisher information is included in the answer_args despite the paper not mentioning its use in the evaluation process?

Additionally, the books_answer_heuristic function in the code uses a threshold of 3 for matching elements in the answer. The paper, however, does not specify why this threshold was chosen over a threshold of 2, which would seem more consistent with the two elements (author name and release year) mentioned. Could you provide some insight into the rationale behind setting the threshold to 3?

Understanding these choices would be very helpful for my own research, as I am looking to replicate and build upon the methods used in your study.

Thank you very much for your time and assistance. I am looking forward to your response.

yakir-yehuda / InterrogateLLM

Question Regarding Evaluation Methodology in Books Dataset #2