THUDM / ReST-MCTS

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
Environment is missing #1

Open XuweiyiChen opened 4 months ago

XuweiyiChen commented 4 months ago

Would you mind sharing your environment details so we can reproduce the result easily?

zhangdan0602 commented 2 months ago

Thank you for your question. Which environment details would you like to know? For example, the environment for running the MCTS algorithm, training the PRM, or training the policy model?

BinBrent commented 2 months ago

How about starting by releasing the MCTS inference environment?

shaoyuyoung commented 2 weeks ago

I think he was asking for the requirements.txt file.

I would like to have it as well :)

zhangdan0602 commented 1 week ago

We have uploaded the requirements.txt file.