-
Hi Patrick,
For my next YT video I want to showcase Eqx for RL, so I'm using it to train an agent (with a policy gradient algorithm) on some gym environments.
The agent solves the environment eas…
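For reference, the training step is essentially the minimal REINFORCE-style sketch below (the gym rollout is omitted, and the 4-in/2-out CartPole-like sizes and hyperparameters are placeholders, not my exact setup):

```python
import jax
import jax.numpy as jnp
import equinox as eqx
import optax

key = jax.random.PRNGKey(0)
# CartPole-like placeholder sizes: 4-dim observations, 2 discrete actions.
policy = eqx.nn.MLP(in_size=4, out_size=2, width_size=64, depth=2, key=key)
optim = optax.adam(1e-3)
opt_state = optim.init(eqx.filter(policy, eqx.is_array))

def pg_loss(policy, obs, actions, returns):
    # REINFORCE: maximize E[log pi(a|s) * G] by minimizing its negative.
    logits = jax.vmap(policy)(obs)
    logp = jax.nn.log_softmax(logits)
    logp_a = jnp.take_along_axis(logp, actions[:, None], axis=1).squeeze(-1)
    return -jnp.mean(logp_a * returns)

@eqx.filter_jit
def train_step(policy, opt_state, obs, actions, returns):
    grads = eqx.filter_grad(pg_loss)(policy, obs, actions, returns)
    updates, opt_state = optim.update(grads, opt_state)
    policy = eqx.apply_updates(policy, updates)
    return policy, opt_state
```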
-
1e2e4bcb4761ba6107fb565982f6cc2b951cbeb5 introduced the version qualifier `numpy < 1.23` due to failing tests with jax, but I believe this has been resolved in recent jax versions and we no longer need …
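If it has indeed been fixed, the change would just be dropping the upper bound, along the lines of this hypothetical setup.py fragment (the surrounding names are illustrative, not the repo's actual file):

```python
# Hypothetical setup.py fragment: relax the pin once the jax-side
# test failures that motivated it are confirmed fixed.
install_requires = [
    "jax",
    "numpy",  # previously "numpy < 1.23"
]
```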
-
Hi,
I noticed that the PopArt layer is applied to the value head of the models in Meltingpot v2.0. I was able to implement it for IMPALA successfully, but when applying it to OPRE, there seems to be…
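For concreteness, the update I have in mind follows the standard output-preserving PopArt recipe (van Hasselt et al., 2016); this is a minimal standalone sketch rather than the Meltingpot code, and `beta` and the clipping bounds are placeholder choices:

```python
import jax.numpy as jnp
from typing import NamedTuple

class PopArtState(NamedTuple):
    mu: jnp.ndarray     # running first moment of the value targets
    nu: jnp.ndarray     # running second moment of the value targets
    sigma: jnp.ndarray  # scale, sqrt(nu - mu**2)

def popart_update(state, targets, w, b, beta=3e-4):
    # Update the running statistics of the value targets...
    mu = (1 - beta) * state.mu + beta * jnp.mean(targets)
    nu = (1 - beta) * state.nu + beta * jnp.mean(targets ** 2)
    sigma = jnp.sqrt(jnp.clip(nu - mu ** 2, 1e-4, 1e6))
    # ...and rescale the value head so its unnormalized output is preserved:
    # sigma_new * (w'x + b') + mu_new == sigma_old * (wx + b) + mu_old.
    w_new = w * state.sigma / sigma
    b_new = (state.sigma * b + state.mu - mu) / sigma
    return PopArtState(mu, nu, sigma), w_new, b_new
```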
-
[loss of a2c rl_losses.py](https://github.com/deepmind/open_spiel/blob/c3f8b538afd6223d450c0f74269937e76850cf33/open_spiel/python/algorithms/losses/rl_losses.py#L196)
I think the total loss should be…
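For comparison, the textbook A2C composition is the policy loss plus a weighted critic loss minus a weighted entropy bonus, something like this sketch (coefficient values are illustrative, not open_spiel's defaults):

```python
import jax
import jax.numpy as jnp

def a2c_total_loss(logits, actions, advantages, values, returns,
                   baseline_cost=0.5, entropy_cost=0.01):
    logp = jax.nn.log_softmax(logits)
    logp_a = jnp.take_along_axis(logp, actions[:, None], axis=1).squeeze(-1)
    # Actor: advantages are treated as constants w.r.t. the policy params.
    policy_loss = -jnp.mean(jax.lax.stop_gradient(advantages) * logp_a)
    # Critic: squared error against the (bootstrapped) returns.
    critic_loss = 0.5 * jnp.mean((returns - values) ** 2)
    # Entropy bonus encourages exploration, so it is subtracted.
    entropy = -jnp.mean(jnp.sum(jnp.exp(logp) * logp, axis=1))
    return policy_loss + baseline_cost * critic_loss - entropy_cost * entropy
```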
-
I followed the installation instructions; however, some things went wrong. This is an AMD computer running Windows, with Python 3.9.13.
`pip install dm-acme[tensorflow]` works well. But pip in…
-
I get this error when trying to import `rlax`, I think because `tree_multimap` has been deprecated?
```
ImportError: cannot import name 'tree_multimap' from 'jax.tree_util' (/home/rohanmehta/anaconda3/…
```
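`tree_multimap` was deprecated and then removed from jax; it was folded into `tree_map`, which now accepts multiple trees, so either pinning jax to an older version or upgrading rlax should resolve it. The replacement pattern:

```python
import jax

t1 = {"w": 1.0, "b": 2.0}
t2 = {"w": 10.0, "b": 20.0}

# Old (removed): jax.tree_util.tree_multimap(lambda a, b: a + b, t1, t2)
# New: tree_map takes the extra trees as positional arguments directly.
summed = jax.tree_util.tree_map(lambda a, b: a + b, t1, t2)
print(summed)  # {'b': 22.0, 'w': 11.0}
```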
-
In reinforcement learning, a target network is a common technique to assist off-policy value learning. In PyTorch-based implementations, `target_q_network = deepcopy(q_network)` could create a target ne…
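In JAX the same effect is simpler, since parameters are immutable pytrees; a minimal sketch (function names are mine):

```python
import jax
import jax.numpy as jnp

def make_target(online_params):
    # Arrays are immutable, so rebinding already acts as a snapshot;
    # tree_map(jnp.copy, ...) just makes the independence explicit.
    return jax.tree_util.tree_map(jnp.copy, online_params)

def polyak_update(target_params, online_params, tau=0.005):
    # Soft update: target <- (1 - tau) * target + tau * online.
    return jax.tree_util.tree_map(
        lambda t, o: (1.0 - tau) * t + tau * o, target_params, online_params)
```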
-
Curated Weibo content
-
Could we create a new subpackage (or find somewhere to put this) for native loss functions written in JAX? The issue is that when I try to use loss functions from, say, `sk_metrics`, the function will n…
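As a sketch of what I mean by "native", here is a binary log loss written purely in `jnp`, so it is differentiable and jit-compatible (unlike a NumPy-based metric such as sklearn's):

```python
import jax
import jax.numpy as jnp

def log_loss(y_true, logits):
    # Numerically stable binary cross-entropy with logits:
    # log(1 + exp(z)) - y * z == logaddexp(0, z) - y * z.
    return jnp.mean(jnp.logaddexp(0.0, logits) - y_true * logits)

grad_fn = jax.jit(jax.grad(log_loss, argnums=1))  # works, unlike a NumPy metric
```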
-
Hello,
I can't PR this for you because the docs are not in this repo, so I'll just open this issue.
At https://rlax.readthedocs.io/en/latest/api.html#id1, the section MPO Compute Weights and Temper…