Closed xiaolizh1 closed 6 months ago
Sorry to bother you. May I ask how to set B1 in step-level beam search? Is it by changing step_beam_width in sbs_sft.yaml? Setting step_beam_width doesn't seem to have much effect.
Yes, B1 is step_beam_width. We observed that increasing B1 from 1 to 3 has a significant effect, but increasing it further (e.g., to 4 or 5) yields only incremental improvement.
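To illustrate what the beam width controls, here is a minimal toy sketch of step-level beam search. It is not the repository's implementation: `expand`, `value`, `toy_expand`, and `toy_value` are hypothetical stand-ins for the policy's step proposals and the value model's scores, and only the name `step_beam_width` comes from the config discussed above.

```python
from typing import Callable, List, Tuple

Path = Tuple[int, ...]  # a partial solution, represented here as a tuple of toy "steps"

def step_beam_search(
    expand: Callable[[Path], List[int]],   # hypothetical: proposes next steps for a partial path
    value: Callable[[Path], float],        # hypothetical: scores a partial path
    step_beam_width: int,                  # B1: number of partial paths kept after each step
    max_steps: int,
) -> List[Path]:
    beams: List[Path] = [()]
    for _ in range(max_steps):
        # Expand every kept path by every proposed next step.
        candidates = [beam + (step,) for beam in beams for step in expand(beam)]
        if not candidates:
            break
        # Keep only the B1 highest-valued partial paths.
        candidates.sort(key=value, reverse=True)
        beams = candidates[:step_beam_width]
    return beams

# Toy stand-ins (assumptions for illustration only).
def toy_expand(path: Path) -> List[int]:
    return [0, 1]

def toy_value(path: Path) -> float:
    return float(sum(path))

result = step_beam_search(toy_expand, toy_value, step_beam_width=2, max_steps=3)
```

With a larger `step_beam_width`, more partial paths survive each step, which costs more compute per step; that is consistent with the diminishing returns reported above once B1 exceeds 3.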
Another question: where can I download the data used to train the value model in the paper?
Our work didn't use annotated training data. The data was generated automatically; please refer to the training data generation section in the README. You can run the corresponding script to generate MCTS data for each round.
In particular, for each question we generate 4 to 10 trees and sample at most 4 correct solution paths and 4 incorrect solution paths.
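The sampling rule above can be sketched as follows. This is only an illustration under assumptions: the function name `sample_paths`, the `(solution_text, is_correct)` pair format, and the fixed seed are hypothetical and are not taken from the repository's script.

```python
import random
from typing import List, Tuple

def sample_paths(
    paths: List[Tuple[str, bool]],  # hypothetical format: (solution_text, is_correct) from all trees of one question
    max_correct: int = 4,
    max_incorrect: int = 4,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    rng = random.Random(seed)
    correct = [text for text, ok in paths if ok]
    incorrect = [text for text, ok in paths if not ok]
    # Sample without replacement, capped at the number of paths actually available.
    picked_correct = rng.sample(correct, min(max_correct, len(correct)))
    picked_incorrect = rng.sample(incorrect, min(max_incorrect, len(incorrect)))
    return picked_correct, picked_incorrect

# Example pool: 6 correct and 2 incorrect paths for one question.
pool = [(f"correct-{i}", True) for i in range(6)] + [(f"wrong-{i}", False) for i in range(2)]
kept_correct, kept_incorrect = sample_paths(pool)
```

In this example the 6 correct paths are capped at 4, while both incorrect paths are kept because fewer than 4 are available.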