-
I am using the MCTS algorithm and need to replicate an env object at its current step, then run the copies of the env object separately.
I have tried copy.deepcopy() and pickle.dumps(), but it got…
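For reference, this is the pattern `copy.deepcopy()` is meant to support: snapshot the env at the current node, then roll out each copy independently. A minimal sketch with a toy stand-in env (the class here is hypothetical, not any real environment):

```python
import copy

class CounterEnv:
    """Toy stand-in for a real environment (hypothetical)."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        self.state += action
        return self.state

env = CounterEnv()
env.step(1)  # state is now 1

# Snapshot the env at the current step, then roll out copies independently.
branch_a = copy.deepcopy(env)
branch_b = copy.deepcopy(env)
branch_a.step(10)
branch_b.step(-10)

print(env.state, branch_a.state, branch_b.state)  # 1 11 -9
```

Note that deepcopy/pickle fails for envs that wrap unpicklable resources (open handles, C-level simulator state); such envs typically need an explicit state getter/setter instead.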
-
I compared the sampled muzero code with muzero-general, but I couldn't find the code for the number of samples and the policy improvement. Can you tell me what changes you have made?
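For context, a rough sketch of the two pieces being asked about, as described in the Sampled MuZero paper (a simplified reading, not this repository's actual code): sample K actions from the prior at each node, then build the improved policy from visit counts over that subset only.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_actions(prior, k):
    """Sample k distinct actions from the prior policy; the tree search
    is then restricted to this subset (simplified Sampled MuZero idea)."""
    return rng.choice(len(prior), size=k, replace=False, p=prior)

def policy_target(visit_counts, sampled, num_actions):
    """Improved policy from visit counts over the sampled actions only;
    unsampled actions get probability zero in this sketch."""
    target = np.zeros(num_actions)
    target[sampled] = visit_counts / visit_counts.sum()
    return target

prior = np.array([0.5, 0.2, 0.2, 0.1])
sampled = sample_actions(prior, k=2)           # K = number of samples
target = policy_target(np.array([30.0, 10.0]), sampled, num_actions=4)
print(target.sum())  # 1.0
```

The paper additionally corrects the search with a prior/sample-distribution ratio; that correction is omitted here for brevity.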
-
### Summary of issue
The training process gets killed by the kernel; `dmesg` shows an entry stating that the reason is "out of memory".
**Model**: MuZero with self-supervision
**Environment**:…
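That `dmesg` entry means the kernel's OOM killer terminated the process. In MuZero-style training, an unbounded replay buffer is a frequent culprit; a minimal sketch of a bounded buffer (illustrative only, not this repository's code):

```python
from collections import deque

# Bounded replay buffer: once maxlen is reached, the oldest game is
# evicted automatically instead of growing until the OOM killer fires.
replay_buffer = deque(maxlen=1000)

for game_id in range(1500):
    replay_buffer.append({"game": game_id})

print(len(replay_buffer))        # 1000
print(replay_buffer[0]["game"])  # 500 (oldest retained game)
```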
-
The codebase includes training and evaluation scripts, which is great. However, it lacks an inference script in which I can run the existing weights on the environment and see how it performs visu…
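Such an inference script is usually a short render loop. The sketch below is illustrative only: the dummy env and random action selection stand in for the real environment and a trained network, and every name here is an assumption rather than this repository's API.

```python
import random

class DummyEnv:
    """Stand-in for the real environment (hypothetical API)."""
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 1.0
        done = self.t >= 5
        return self.t, reward, done

    def render(self):
        pass  # a real env would draw a frame here

def select_action(obs):
    # Placeholder for loading trained weights and running the search;
    # a random policy keeps this sketch self-contained.
    return random.choice([0, 1])

env = DummyEnv()
obs = env.reset()
total_reward, done = 0.0, False
while not done:
    env.render()
    obs, reward, done = env.step(select_action(obs))
    total_reward += reward
print(total_reward)  # 5.0
```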
-
## Problem Description
[Muesli](https://arxiv.org/abs/2104.06159) is a next-generation policy gradient algorithm from DeepMind that performs exceptionally well. Notably, it can match MuZero’s SOTA …
-
Selected Weibo content
-
First of all, I want to thank the developers for this awesome project! It's simple, clean, yet powerful. I really enjoyed playing with it.
I'm currently studying at the University of Alberta under t…
-
Hi Daniel,
I'm trying to run a custom environment (it works with muzero) with your Stochastic-muzero version.
After creating a config file (just changing the env name in experiment_450_config.json), I'm…
-
I found OpenSpiel super useful in my research. I am wondering whether the API is friendly to any of the popular multi-agent RL frameworks (e.g. RLlib, Stable-Baselines3, Tianshou) so that we can use different…
-
Hi, I recently upgraded to a Ryzen 7950 (16 cores / 32 threads, PCIe 5) and converted to SF2. The Nvidia 1050 peaks at ~75% (per nvidia-smi) while the CPUs are not really working hard. It used to be t…