Closed HenrySprueill closed 9 months ago
For a given tree, the path from the root node to the node with the highest reward is the reasoning pathway of the model for. given query. The generation of the traces is out of order and skips reasoning steps.
For a given tree, the path from the root node to the node with the highest reward is the reasoning pathway of the model for. given query. The generation of the traces is out of order and skips reasoning steps.