RL Implementation - Githubissues

jieyilong / tree-of-thought-puzzle-solver

The Tree of Thoughts (ToT) framework for solving complex reasoning tasks using LLMs

https://arxiv.org/pdf/2305.08291.pdf

MIT License

287 stars, 35 forks

RL Implementation #3

Open: andreasbinder opened this issue 1 year ago

andreasbinder commented 1 year ago

Hi, thank you for the paper and the interesting concept!

We want to build on your idea, using RL methods. However, in your code I did not find the policy implementations. I assume they are supposed to be here.

It would be great if you could share your code so that I can also experiment on my own :) Keep up the good work!

Sheerkay commented 4 weeks ago

In your paper, you mentioned using an improved version of the REINFORCE algorithm [32] to directly train the ToT (Tree of Thought) controller and the prompt agent. However, in the code of your GitHub project, I did not find the corresponding reinforcement learning method. It seems that your strategy still relies on natural language prompts.
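For readers unfamiliar with the algorithm being asked about, here is a minimal sketch of textbook REINFORCE with a running-average baseline. It is not the improved variant described in the paper and not code from this repository. The toy setting is an assumption made for illustration: a softmax policy picks one of a few discrete actions, and the hypothetical `success_probs` list gives each action's chance of earning a reward of 1. The function name `reinforce_bandit` and all hyperparameters are likewise invented.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(success_probs, steps=2000, lr=0.1, seed=0):
    """Train a softmax policy with plain REINFORCE on a toy bandit.

    success_probs[a] is the (hypothetical) probability that action a
    yields reward 1; otherwise the reward is 0.
    Returns the final action probabilities of the policy.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(success_probs)  # policy parameters
    baseline = 0.0                       # running average of rewards

    for _ in range(steps):
        probs = softmax(logits)
        # Sample an action from the current policy.
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = 1.0 if rng.random() < success_probs[action] else 0.0

        # REINFORCE update: (reward - baseline) * grad log pi(action).
        # For a softmax policy, d log pi(action) / d logit_a
        # equals one_hot(action)[a] - probs[a].
        advantage = reward - baseline
        for a in range(len(logits)):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            logits[a] += lr * advantage * grad_log_pi

        # Update the baseline after the policy step.
        baseline += 0.05 * (reward - baseline)

    return softmax(logits)


if __name__ == "__main__":
    print(reinforce_bandit([0.1, 0.9]))
```

In this sketch the policy should shift most of its probability toward the action with the higher success probability. A ToT controller trained this way would replace the bandit with episodes of puzzle solving, which this example does not attempt to model.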