Hello, I'm not very familiar with MCTS. Can I think of it as analogous to beam search? That is, during decoding, the top-5 candidates are kept each time the next token is decoded, and once decoding is complete, the path with the highest score is selected according to the formula in your paper.
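To make the analogy concrete, here is a minimal sketch of the beam-search procedure I have in mind. Everything in it is an assumption for illustration: `beam_search`, `toy_model`, and the `<s>`/`<eos>` tokens are hypothetical names, and the score is a plain cumulative log-probability standing in for the formula in the paper, which I have not reproduced here.

```python
import math


def beam_search(next_token_logprobs, start, eos, beam_width=5, max_len=10):
    """Keep the `beam_width` best partial sequences at each decoding step.

    `next_token_logprobs(tokens)` is a hypothetical callback returning a
    dict {token: log_probability} for the next token given the prefix.
    The score is the cumulative log-probability (a placeholder, not the
    paper's formula).
    """
    beams = [([start], 0.0)]  # (tokens, cumulative score)
    finished = []
    for _ in range(max_len):
        # Expand every live beam by every possible next token.
        candidates = []
        for tokens, score in beams:
            for tok, lp in next_token_logprobs(tokens).items():
                candidates.append((tokens + [tok], score + lp))
        if not candidates:
            break
        # Keep only the top `beam_width` candidates (top 5 by default).
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # sequences cut off at max_len still compete
    # After decoding, pick the single path with the largest score.
    return max(finished, key=lambda c: c[1])


def toy_model(tokens):
    # Hypothetical fixed distribution that ignores the prefix.
    return {"a": math.log(0.6), "b": math.log(0.3), "<eos>": math.log(0.1)}


best_tokens, best_score = beam_search(toy_model, "<s>", "<eos>", max_len=3)
```

In this sketch the pruning happens at every step and the final selection is a single `max` over the surviving paths, which is the behaviour I am asking about.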