-
**Describe the bug**
I'm getting weird obs back. Sometimes I get only 1 TerminalSteps and 0 DecisionSteps, sometimes 2 TS and 0 DS, and, even worse, I sometimes get 12 (when having…
-
Hello WarpDrive Team,
This is a good MARL library indeed. I tried it on an old machine and it works fine.
However, when I moved to a new machine, I ran into the following error.
```
(warp_…
```
-
I want to explore policy gradient and actor-critic agents on `GridWorld` environments. To that end, I want to parameterize the policy as a Categorical distribution at each state. How do I do this?
…
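One way to picture a per-state Categorical policy is a table of logits with a softmax over each row. The sketch below is a minimal, library-independent illustration in NumPy: the class name `TabularCategoricalPolicy`, its methods, and the integer state indexing are all assumptions made for this example, not part of any `GridWorld` API.

```python
import numpy as np


class TabularCategoricalPolicy:
    """Hypothetical softmax (categorical) policy with one logit vector per state."""

    def __init__(self, n_states, n_actions, seed=0):
        # Zero logits give a uniform distribution over actions in every state.
        self.logits = np.zeros((n_states, n_actions))
        self.rng = np.random.default_rng(seed)

    def probs(self, state):
        # Numerically stable softmax over the logits of this state.
        z = self.logits[state] - self.logits[state].max()
        e = np.exp(z)
        return e / e.sum()

    def sample(self, state):
        # Draw an action index from the categorical distribution.
        return int(self.rng.choice(self.logits.shape[1], p=self.probs(state)))

    def grad_log_prob(self, state, action):
        # Gradient of log pi(action | state) w.r.t. this state's logits:
        # one_hot(action) - probs(state).
        grad = -self.probs(state)
        grad[action] += 1.0
        return grad

    def reinforce_update(self, state, action, ret, lr=0.1):
        # One REINFORCE-style step: logits += lr * return * grad log pi.
        self.logits[state] += lr * ret * self.grad_log_prob(state, action)
```

For example, `policy = TabularCategoricalPolicy(n_states=4, n_actions=3)` followed by `policy.sample(0)` returns an action index, and `policy.reinforce_update(0, 1, ret=1.0)` raises the probability of action 1 in state 0. An actor-critic variant would presumably replace `ret` with an advantage estimate.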
-
Update the `OvercookedState` and `OvercookedGridworld` classes to include recipe lists that can change with time. This would involve adding timestep dependencies to `all_orders` and `bonus_orders`
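To make the request concrete, here is a rough sketch of what a timestep-dependent order list could look like. `OrderSchedule` and `orders_at` are hypothetical names invented for this illustration, and the order names are placeholders; nothing here reflects how `OvercookedState` or `OvercookedGridworld` actually store `all_orders` or `bonus_orders`.

```python
from bisect import bisect_right


class OrderSchedule:
    """Hypothetical helper: piecewise-constant order lists keyed by start timestep."""

    def __init__(self, schedule):
        # schedule maps a start timestep to the list of orders active from then on.
        if 0 not in schedule:
            raise ValueError("schedule must define the orders at timestep 0")
        self.starts = sorted(schedule)
        self.orders = [list(schedule[t]) for t in self.starts]

    def orders_at(self, timestep):
        # Find the latest start timestep that is <= the queried timestep.
        if timestep < 0:
            raise ValueError("timestep must be non-negative")
        idx = bisect_right(self.starts, timestep) - 1
        return list(self.orders[idx])


# Placeholder order names, for illustration only.
all_orders = OrderSchedule({0: ["onion_soup"], 50: ["onion_soup", "tomato_soup"]})
bonus_orders = OrderSchedule({0: [], 50: ["tomato_soup"]})
```

Under this assumption, the state would query `all_orders.orders_at(t)` and `bonus_orders.orders_at(t)` with its current timestep instead of reading a fixed list.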
-
I'm trying to implement a simple REINFORCE agent on `Gridworld`. However, I keep hitting the following error:
```
File "/home/rylan/Documents/GanguliGang-Metacognitive-Actor-Critic/mac_venv/lib/…
```
-
I'm using the code for my own env, which has no time limit but does have a max episode step limit.
My best action is probably about -0.5, but I found that my actions often go beyond the [-1, 1] limits. I…
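For reference, two common ways to keep a continuous action inside `[-1, 1]` are hard clipping and tanh squashing. The sketch below shows both in NumPy. The function names are made up for this example, and which of the two (if either) the code in question applies is not known from the report.

```python
import numpy as np


def clip_action(raw_action, low=-1.0, high=1.0):
    # Hard clipping: values outside [low, high] are set to the nearest bound.
    return np.clip(raw_action, low, high)


def squash_action(raw_action, low=-1.0, high=1.0):
    # tanh maps any real number into [-1, 1]; then rescale to [low, high].
    return low + 0.5 * (np.tanh(raw_action) + 1.0) * (high - low)
```

Clipping leaves in-range actions untouched but maps every out-of-range value to a bound, while squashing changes every value smoothly. If a policy samples with tanh squashing, the log-probability normally needs the matching change-of-variables correction.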
-
I think the idea of environment scheduling is very novel. Multiple environments and agents are scheduled on the GPU, which improves GPU utilization.
I have some questions about the `tag-continuous…
-
Hi, I found something weird in the controls in GridWorld. It seems like up and down are inverted:
I ran the first cells of the Google Colab tutorial:
```Python
from IPython import …
```
-
Is it possible to create an agent which uses different policies depending on an observation? For example, in a hypothetical Windy Gridworld environment where the wind can change direction spontaneousl…
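One generic pattern for this is an agent that holds several sub-policies and dispatches on a feature of the observation. The sketch below is purely illustrative: `SwitchingAgent`, the dictionary-shaped observation, and the `"wind"` key are assumptions for this example and do not correspond to any particular library's agent interface.

```python
class SwitchingAgent:
    """Hypothetical agent that picks a sub-policy based on the observation."""

    def __init__(self, policies, selector):
        # policies: dict mapping a key to a callable observation -> action.
        # selector: callable observation -> key, deciding which policy to use.
        self.policies = policies
        self.selector = selector

    def act(self, observation):
        key = self.selector(observation)
        return self.policies[key](observation)


# Toy usage: one fixed policy per wind direction (placeholder actions).
agent = SwitchingAgent(
    policies={
        "north": lambda obs: "down",  # push against a northward wind
        "south": lambda obs: "up",    # push against a southward wind
    },
    selector=lambda obs: obs["wind"],
)
```

Calling `agent.act({"position": (2, 3), "wind": "north"})` routes the observation to the `"north"` policy. In practice each entry could be a separately trained policy rather than a constant.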