Add run_rollout.py and run_rollout_no_widen.py, which run variants of MCTS that use rollouts instead of running the verifier on partial programs. The no-widen version creates a fixed number of children for each node (as in traditional MCTS); the other uses progressive widening.
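For reviewers less familiar with the distinction, here is a minimal sketch of how the two expansion rules differ. It is not the code in these scripts: the function names, the fixed child count `k`, and the widening constants `c` and `alpha` are all hypothetical, and the widening rule shown (child limit grows as `c * visits ** alpha`) is just one common formulation.

```python
import math


def max_children_fixed(visits: int, k: int = 3) -> int:
    """Fixed branching: the child limit never depends on visit count.

    `k` is a hypothetical default, not a value taken from this PR.
    """
    return k


def max_children_widening(visits: int, c: float = 1.0, alpha: float = 0.5) -> int:
    """Progressive widening: the child limit grows with the node's visits.

    Uses the common rule ceil(c * visits ** alpha), with a floor of 1 so
    that an unvisited node can still receive its first child.
    `c` and `alpha` are hypothetical constants.
    """
    return max(1, math.ceil(c * visits ** alpha))


def can_expand(num_children: int, visits: int, widen: bool) -> bool:
    """Return True if the node may add another child under the chosen rule."""
    if widen:
        limit = max_children_widening(visits)
    else:
        limit = max_children_fixed(visits)
    return num_children < limit
```

Under these assumptions, a node with 2 children and 9 visits can still expand under either rule, while a node with 3 children and 9 visits cannot; only the widening rule would raise the limit again as visits accumulate.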
I changed the depth penalty to -1, which makes more sense theoretically and, empirically, does not change performance.
I also slightly refactored the way that the token limit is handled to make it compatible with the rollout methods.