opendilab / LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
https://huggingface.co/spaces/OpenDILabCommunity/ZeroPal
Apache License 2.0

Custom environment #219

Closed Depresivna-ryza closed 4 months ago

Depresivna-ryza commented 5 months ago

The tutorial for creating a custom environment describes how to add the environment, but it doesn't mention how to run the RL algorithms on it. Is there a guideline I can follow to test my environment on various algorithms? Also, how can I use AlphaZero, given that it assumes a perfect simulator inside its MCTS?

puyuan1996 commented 5 months ago

Hello, thank you for your patience.

Regarding how to test one environment on various algorithms: we provide brief instructions on setting up the config file here, which we hope will be helpful to you.
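For orientation, the example config files under the repo's `zoo` directory are self-contained launch scripts: they build a `main_config` and a `create_config` and pass both to an entry function. The sketch below follows that pattern but is hypothetical; names like `my_custom_env` and the specific field values are placeholders you must replace with your own environment's registered name and spaces:

```python
# Hypothetical config sketch, modeled on the zoo/<env>_config.py files in
# LightZero. 'my_custom_env' and all field values are placeholders.
from easydict import EasyDict

main_config = EasyDict(dict(
    exp_name='my_custom_env_muzero_seed0',
    env=dict(
        env_id='my_custom_env',        # the name your custom env registers under
        collector_env_num=8,
        evaluator_env_num=3,
        n_evaluator_episode=3,
        manager=dict(shared_memory=False),
    ),
    policy=dict(
        model=dict(
            observation_shape=4,       # match your env's observation space
            action_space_size=2,       # match your env's action space
        ),
        cuda=True,
    ),
))

create_config = EasyDict(dict(
    env=dict(
        type='my_custom_env',          # must match your registered env type
        import_names=['zoo.my_custom_env.envs.my_custom_env'],
    ),
    env_manager=dict(type='subprocess'),
    policy=dict(type='muzero', import_names=['lzero.policy.muzero']),
))

if __name__ == "__main__":
    from lzero.entry import train_muzero
    train_muzero([main_config, create_config], seed=0, max_env_step=int(1e6))
```

Testing the same environment on a different algorithm is then mostly a matter of swapping the policy section and the entry function in a copy of the config file.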

Regarding the use of AlphaZero: as you mentioned, it indeed requires the environment to provide a perfect simulator. Specifically, like the board game environments, your environment needs the capability to `reset(state)`, i.e. to be reset to an arbitrary given state; you can refer to the implementation of the Gomoku environment. Therefore, if you want to apply AlphaZero to a custom environment, you need to ensure that your environment implements this function. If the environment does not meet this condition, you will not be able to use AlphaZero. As an alternative, you could consider the MuZero algorithm, which conducts MCTS within a learned model; see the MuZero research paper for details.
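To illustrate the requirement in isolation (this is a minimal toy sketch, not LightZero's actual env API): AlphaZero's MCTS repeatedly snapshots the true game state, simulates hypothetical moves, and then restores the snapshot, which only works if `reset` can accept an arbitrary state.

```python
# Toy illustration of the "perfect simulator" capability AlphaZero needs.
# The class and method names here are hypothetical, chosen for clarity.
import copy

class ToyBoardEnv:
    """Toy 'board': the state is simply the list of moves played so far."""

    def __init__(self):
        self.board = []

    def reset(self, start_state=None):
        # The key requirement: reset(state) restores an ARBITRARY state,
        # not just the initial one.
        self.board = copy.deepcopy(start_state) if start_state is not None else []
        return self.board

    def step(self, action):
        self.board.append(action)
        done = len(self.board) >= 3
        return self.board, done

    def get_state(self):
        return copy.deepcopy(self.board)

# How MCTS uses this: snapshot the real state, simulate, then roll back.
env = ToyBoardEnv()
env.reset()
env.step(0)
snapshot = env.get_state()       # save the true game state
env.step(1)                      # explore a hypothetical move in simulation
env.reset(start_state=snapshot)  # perfectly restore the saved state
assert env.get_state() == [0]
```

An environment that cannot be restored this way (e.g. one backed by an external process whose internal state is hidden) cannot serve as AlphaZero's simulator, which is exactly the case MuZero's learned model is designed to handle.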

If you have other questions, please feel free to contact us. Thank you again for your attention and support!