codelion / optillm

Optimizing inference proxy for LLMs
Apache License 2.0

[Question]: Which paper is mcts.py based on? #51

Closed RomanKoshkin closed 1 month ago

RomanKoshkin commented 1 month ago

Could you tell me which paper(s) this code is based on: https://github.com/codelion/optillm/blob/main/optillm/mcts.py ? Thanks!

codelion commented 1 month ago

It is not a direct implementation of any paper, but I took inspiration from "Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning". That paper uses the technique for training, not inference; in this library I apply it at inference time instead.

The same is true of "Prover-Verifier Games improve legibility of LLM outputs": it is also a training technique, but in this library it is implemented at inference time.
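
For readers unfamiliar with the idea, here is a minimal sketch of what running Monte Carlo Tree Search at inference time can look like. It is not the code in optillm's mcts.py and not the method from the paper; it is a generic UCT loop on a toy problem. The names `mcts_search`, `propose`, `score`, and `is_terminal` are hypothetical. In an LLM setting one might assume `propose` samples candidate continuations from the model and `score` asks a judge or reward model to rate a finished answer.

```python
import math
import random


class Node:
    """One partial answer in the search tree."""

    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0


def uct(child, parent_visits, c=1.4):
    # Unvisited children are tried first; otherwise trade off
    # mean reward against an exploration bonus.
    if child.visits == 0:
        return math.inf
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def mcts_search(root_state, propose, score, is_terminal, simulations=100, seed=0):
    """Hypothetical helper: returns the best complete state seen and its reward."""
    rng = random.Random(seed)
    root = Node(root_state)
    best_state, best_reward = None, -math.inf
    for _ in range(simulations):
        # 1. Selection: follow the highest-UCT child down to a leaf.
        node = root
        while node.children:
            parent = node
            node = max(parent.children, key=lambda ch: uct(ch, parent.visits))
        # 2. Expansion: add one child per proposed continuation, pick one.
        #    Assumes propose() returns a non-empty list for non-terminal states.
        if not is_terminal(node.state):
            node.children = [Node(s, node) for s in propose(node.state)]
            node = rng.choice(node.children)
        # 3. Simulation: random rollout to a complete answer, then score it.
        state = node.state
        while not is_terminal(state):
            state = rng.choice(propose(state))
        reward = score(state)
        if reward > best_reward:
            best_state, best_reward = state, reward
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best_state, best_reward


# Toy stand-ins (assumptions): a state is a tuple of digits, complete at length 3.
def propose(state):
    return [state + (d,) for d in (0, 1, 2)]


def is_terminal(state):
    return len(state) == 3


def score(state):
    return sum(state) / 6.0


best_state, best_reward = mcts_search((), propose, score, is_terminal)
print(best_state, best_reward)
```

No model weights are updated anywhere in this loop; the search only decides which candidate answer to return, which is the sense in which a training-time idea can be reused at inference time.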