namin / llm-verified-with-monte-carlo-tree-search

LLM verified with Monte Carlo Tree Search
https://arxiv.org/abs/2402.08147
MIT License
210 stars, 25 forks

Rollout and depth penalty #64

Closed by davidbrandfonbrener 2 months ago

davidbrandfonbrener commented 2 months ago

This makes two changes:

  1. Add run_rollout.py and run_rollout_no_widen.py, which run variants of MCTS that use rollouts instead of calling the verifier on partial programs. The no-widen version creates a fixed number of children for each node (as in traditional MCTS); the other version uses progressive widening.
  2. Change the depth penalty to -1, which makes more sense theoretically and empirically does not change performance.
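
To illustrate the difference between the two expansion rules in item 1, here is a minimal sketch. It is not the code in run_rollout.py or run_rollout_no_widen.py: the function names, `max_children`, `k`, and `alpha` are all hypothetical, and the widening rule shown is the common textbook form in which a node's child budget grows with its visit count.

```python
import math


def can_add_child_fixed(num_children: int, max_children: int = 3) -> bool:
    """Fixed-width rule (hypothetical): a node never has more than max_children."""
    return num_children < max_children


def can_add_child_widening(
    num_children: int, visits: int, k: float = 1.0, alpha: float = 0.5
) -> bool:
    """Progressive-widening rule (hypothetical): the child budget is
    ceil(k * visits**alpha), so it grows slowly as the node is visited more."""
    budget = math.ceil(k * max(visits, 1) ** alpha)
    return num_children < budget


def children_after(visits: int, rule) -> int:
    """Count children created if one expansion is attempted on each visit."""
    children = 0
    for n in range(1, visits + 1):
        if rule(children, n):
            children += 1
    return children


if __name__ == "__main__":
    for n in (1, 5, 20, 50):
        fixed = children_after(n, lambda c, _v: can_add_child_fixed(c))
        widened = children_after(n, can_add_child_widening)
        print(f"visits={n:3d}  fixed={fixed}  widening={widened}")
```

Under the fixed rule the branching factor saturates after a few visits, while under widening it keeps growing for nodes that the search visits often.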

I also slightly refactored the way that the token limit is handled to make it compatible with the rollout methods.