DHDev0 / Stochastic-muzero

PyTorch implementation of Stochastic MuZero for Gym environments. The algorithm supports a wide range of action and observation spaces, both discrete and continuous.
GNU General Public License v3.0

What about merging with SpeedyZero code base? #6

Closed by GrigoryEvko 1 year ago

GrigoryEvko commented 1 year ago

Hi! Thanks for implementing that DeepMind paper. What do you think about merging with SpeedyZero, a highly optimized distributed implementation of a MuZero family member? It's essentially EfficientZero on steroids (from the same authors), with a C++ MCTS module that runs on CPU and partially on GPU, an efficient buffer, and much more. I think it's closer to a MuZero Unplugged implementation, but the stochastic version isn't much different, is it? I'm going to port some changes from SpeedyZero into your implementation on my own, but if you're interested, you could create a new branch and we can work on it there together. The SpeedyZero code is in the supplementary materials.

Thanks for the code again!

GrigoryEvko commented 1 year ago

P.S. https://github.com/DHDev0/Stochastic-time-series-forecast-simulator Ah, I see what you're up to. Me too. IMHO, an external predictor is required; RL on its own wouldn't be enough. I can share something via DM or email.

DHDev0 commented 1 year ago

I would be glad to do it. I was waiting for PyTorch 2.1 to accelerate training and inference with compile and quantization. Perhaps it's better to create a new repo for it. Here is a Discord channel so you can DM me: https://discord.gg/KR4NyrPX I'll also invite you to another repo.

DHDev0 commented 1 year ago

We will continue in another repo. The Discord is open if anyone else wants to participate.

DHDev0 commented 1 year ago

Cancelled