canghongjian / beam_retriever

[NAACL 2024] End-to-End Beam Retrieval for Multi-Hop Question Answering
https://arxiv.org/abs/2308.08973
Apache License 2.0

confusion about predefined hops #9

Closed by Barianc 6 months ago

Barianc commented 6 months ago

Hello, thank you very much for your work. I have a question about the stopping criterion. In your paper, you state:

Beam Retrieval searches the relevant passage at each step until the highest predicted score falls below a predefined threshold.

So, as I understand it, the beam retriever should stop automatically at the appropriate step (hop). Yet you set a fixed number of hops for each dataset, such as hop = 2 for HotpotQA. Why is that? The framework should be adaptable to a variable number of hops. Or am I misunderstanding your paper or code?

I hope you can clarify my confusion, thanks again!

canghongjian commented 6 months ago

Hi, thanks for your interest! Your understanding is correct. In our implementation, we use a fixed number of hops specified per dataset so that we can submit the highest scores to the leaderboard. The threshold-based stopping described in the paper causes a performance decrease of approximately 5%. You can still use a predefined threshold in your scenario, provided the threshold is chosen appropriately.
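For readers comparing the two stopping rules discussed above, here is a minimal sketch in Python. It is not the repository's code: `beam_retrieve`, `toy_score`, and all parameter names are hypothetical, and the scorer is a toy stand-in for the trained model. It only illustrates how a fixed hop count and a score threshold differ as stopping criteria in a beam search over passage chains.

```python
from typing import Callable, List, Optional, Tuple

Chain = Tuple[int, ...]


def beam_retrieve(
    num_passages: int,
    score_fn: Callable[[Chain], float],  # hypothetical scorer: chain -> score
    beam_size: int = 2,
    max_hops: int = 4,
    fixed_hops: Optional[int] = None,
    threshold: Optional[float] = None,
) -> List[int]:
    """Return the best passage chain found by a simple beam search.

    If fixed_hops is set, exactly that many hops are run (the setting the
    thread says is used in the code). Otherwise the search stops as soon as
    the highest candidate score falls below threshold (the rule quoted from
    the paper), or after max_hops.
    """
    limit = fixed_hops if fixed_hops is not None else max_hops
    beams: List[Chain] = [()]
    best: Chain = ()
    for _ in range(limit):
        # Extend every chain in the beam by one unused passage.
        candidates = [
            (score_fn(chain + (p,)), chain + (p,))
            for chain in beams
            for p in range(num_passages)
            if p not in chain
        ]
        if not candidates:
            break
        # Stable sort: ties keep their generation order.
        candidates.sort(key=lambda item: item[0], reverse=True)
        top_score = candidates[0][0]
        # Threshold stopping applies only when no fixed hop count is given.
        if fixed_hops is None and threshold is not None and top_score < threshold:
            break
        beams = [chain for _, chain in candidates[:beam_size]]
        best = beams[0]
    return list(best)


def toy_score(chain: Chain) -> float:
    """Toy scorer (assumption): passages 0 and 3 are the relevant ones."""
    return 0.9 if chain[-1] in {0, 3} else 0.2


print(beam_retrieve(5, toy_score, fixed_hops=2))   # fixed hop count
print(beam_retrieve(5, toy_score, threshold=0.5))  # threshold stopping
```

In this toy setup both rules stop after two hops because only two passages score above the threshold. With a real scorer the threshold has to be tuned, which is consistent with the caveat above that it only works "if the threshold is appropriate".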

yc-song commented 1 month ago

@canghongjian I also have some follow-up questions about this issue. Thank you for every response to issues in this repo.

  1. Is the retrieval performance in Table 3 of the paper based on the code with a predefined hop count?
  2. (If the answer to 1 is 'No') Do you have any experimental results with a predefined threshold?
  3. (If the answer to 2 is 'No') Do you have any rule of thumb for the range of the predefined threshold, based on your experience?

canghongjian commented 4 weeks ago

@yc-song All results reported in the paper are based on a predefined hop count, which matches the code.