wzy6642 / ProCo

Official implementation for "Large Language Models Can Self-Correct with Key Condition Verification" (EMNLP 2024)
https://wzy6642.github.io/proco.github.io/
Apache License 2.0

Citation #1

Open li-aolong opened 2 days ago

li-aolong commented 2 days ago

Previous work: *Large Language Models are Better Reasoners with Self-Verification* (Weng et al., 2023), arXiv preprint, published at EMNLP 2023.

The core method of ProCo, substitute verification, is identical to the method in Large Language Models are Better Reasoners with Self-Verification by Weng et al. (2023).

The main difference lies in how the initial answer is produced. Weng et al. (2023) generate multiple candidate answers, mask a condition in the original question for verification, and eventually select one final answer. In contrast, ProCo generates one answer at a time and iterates until it arrives at the final answer. As for the question-masking method, both ProCo and Weng et al. (2023) use two variants: Weng et al. (2023) call them True-False Item Verification and Condition Mask Verification, and ProCo's classification is essentially the same.
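For concreteness, the shared masking idea can be sketched as follows. This is a toy illustration, not code from either paper: the LLM call is replaced by a deterministic arithmetic stub (`solve`), and all function names here are hypothetical.

```python
def solve(candidate_answer):
    # Stub standing in for an LLM re-solving the masked question, e.g.
    # "Tom has X apples and buys 3 more; now he has `candidate_answer`
    # apples. What is X?"  A real system would prompt a model here.
    return candidate_answer - 3

def verify_by_substitution(masked_condition_value, candidate_answer):
    """Condition-mask (substitute) verification: mask a condition in the
    question, substitute the candidate answer back in, re-solve for the
    masked value, and check it matches the original condition."""
    recovered = solve(candidate_answer)
    return recovered == masked_condition_value

# Candidate answer 8 for "5 apples + 3 apples": substitution recovers 5.
print(verify_by_substitution(5, 8))  # True
# A wrong candidate answer fails the check.
print(verify_by_substitution(5, 9))  # False
```

Both papers build on this check; they differ in which condition is masked and how the check drives answer selection.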

Weng et al. (2023): [figure: method illustration from the paper]

ProCo: [figures: method illustrations from the paper]

Questions:

  1. Have you seen this paper?
  2. If you have, why was it not cited?
  3. How would you evaluate the high similarity between the core methods of ProCo and Weng et al. (2023)?
wzy6642 commented 2 days ago

The primary distinction between our work and Self-Verification lies in the following aspects:

  1. Self-Verification employs Best-of-N decoding: the LLM generates multiple solutions, each is scored by a scoring function, and the highest-scoring solution is selected as the final answer. However, Best-of-N fails whenever the correct answer is not among the sampled candidates. In contrast, ProCo uses an iterative verify-then-correct framework that progressively identifies and corrects (probably) erroneous responses until it converges on a correct one. This avoids repeating previous mistakes and incrementally improves response quality.

  2. Self-Verification randomly selects conditions to mask during the process. By contrast, ProCo introduces a key condition identification method, which improves the accuracy of the substitute verification process.
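The contrast in point 1 can be sketched schematically. This is a minimal illustration, assuming `generate`, `score`, `verify`, and `correct` stand in for LLM calls; none of these names come from either paper's code.

```python
def best_of_n(generate, score, n=5):
    # Best-of-N: sample n candidates independently, keep the highest-scoring
    # one. If no sampled candidate is correct, the selection cannot recover.
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=score)

def verify_then_correct(generate, verify, correct, max_iters=5):
    # Iterative verify-then-correct: keep a single answer, and revise it
    # whenever verification fails, so each round can improve on the last.
    answer = generate()
    for _ in range(max_iters):
        if verify(answer):
            break
        answer = correct(answer)
    return answer

# Toy usage with a numeric "question" whose correct answer is 8.
samples = iter([3, 8, 6])
print(best_of_n(lambda: next(samples), score=lambda a: -abs(a - 8), n=3))  # 8

print(verify_then_correct(
    generate=lambda: 5,
    verify=lambda a: a == 8,
    correct=lambda a: a + 1,  # stub revision nudging toward the answer
))  # 8
```

The structural point is that the second loop conditions each new attempt on the failure of the previous one, whereas Best-of-N only selects among independent samples.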

Notably, the differences between Self-Verification and ProCo are substantial, as reflected in their methodologies and effectiveness. Additionally, I have not read this paper.