-
Add the A2C algorithm, which is the synchronous version of the algorithm described in this paper: https://arxiv.org/pdf/1602.01783.pdf
and described here: https://medium.com/emergent-future/simple-rei…
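A minimal sketch of the synchronous advantage actor-critic update that paper describes; the network sizes, discrete-action setup, and coefficients below are illustrative assumptions, not the paper's exact configuration:
```python
# Minimal synchronous advantage actor-critic (A2C) update sketch.
# Network sizes and loss coefficients are illustrative assumptions.
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh())
        self.pi = nn.Linear(64, n_actions)   # policy logits
        self.v = nn.Linear(64, 1)            # state-value head

    def forward(self, obs):
        h = self.body(obs)
        return self.pi(h), self.v(h).squeeze(-1)

def a2c_loss(model, obs, actions, returns, value_coef=0.5, entropy_coef=0.01):
    """One synchronous update over a batch gathered from parallel envs."""
    logits, values = model(obs)
    dist = torch.distributions.Categorical(logits=logits)
    advantages = returns - values.detach()            # advantage estimate
    policy_loss = -(dist.log_prob(actions) * advantages).mean()
    value_loss = (returns - values).pow(2).mean()
    entropy = dist.entropy().mean()                   # exploration bonus
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```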
-
Implement and explore the effectiveness of an actor-critic agent.
-
I am implementing a Soft Actor-Critic (SAC) agent and need to evaluate the Q-value network inside my custom environment (for the implementation of a special algorithm, called the Wolpertinger algorithm, to ha…
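A minimal sketch of querying a SAC-style critic over a batch of candidate actions, as the Wolpertinger approach does when re-ranking the nearest neighbours of a proto-action; the class and function names here are illustrative assumptions, not a specific library's API:
```python
# Sketch: scoring candidate actions with a SAC-style Q-network
# (Wolpertinger-style re-ranking). Names and shapes are assumptions.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def best_of_candidates(q_net, obs, candidates):
    """obs: (obs_dim,); candidates: (k, act_dim) nearest-neighbour actions.
    Returns the candidate with the highest Q-value under the current critic."""
    with torch.no_grad():                    # evaluation only, no gradients
        obs_batch = obs.expand(candidates.shape[0], -1)
        q_values = q_net(obs_batch, candidates)
    return candidates[q_values.argmax()]
```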
-
I am unable to obtain the result reported in the paper ‘Soft Actor-Critic Algorithms and Applications’ on the OpenAI Gym environment Humanoid-v2. My result is 6000 while the original paper reports 8000, …
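One common source of such a gap is the temperature setting: that paper tunes the entropy coefficient automatically against a target entropy of −dim(A), which is −17 for Humanoid-v2. A minimal sketch of that tuning step, assuming a typical PyTorch SAC implementation (the variable names are illustrative):
```python
# Sketch of SAC automatic temperature (alpha) tuning from
# "Soft Actor-Critic Algorithms and Applications".
# Variable names are illustrative assumptions about a typical implementation.
import torch

action_dim = 17                              # Humanoid-v2 action dimension
target_entropy = -float(action_dim)          # the paper's heuristic: -|A|
log_alpha = torch.zeros(1, requires_grad=True)
alpha_opt = torch.optim.Adam([log_alpha], lr=3e-4)

def update_alpha(log_prob):
    """log_prob: log pi(a|s) of freshly sampled actions, shape (batch,)."""
    alpha_loss = -(log_alpha * (log_prob + target_entropy).detach()).mean()
    alpha_opt.zero_grad()
    alpha_loss.backward()
    alpha_opt.step()
    return log_alpha.exp().item()            # current temperature
```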
-
When I tried to run ./train_mpe_spread.sh, I ran into the following issue:
```
obs_space: [Box(18,), Box(18,), Box(18,)]
share_obs_space: [Box(54,), Box(54,), Box(54,)]
act_space: [Discrete(5), Disc…
```
-
Nice to have would be a continuous value-based RL algorithm (the best fit would probably be Neural Fitted Q Iteration) as well as a lifelong policy-gradient algorithm (e.g. Natural Actor-Critic). Maybe some dynamic pro…
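A minimal sketch of the Neural Fitted Q Iteration loop (batch fitted Q iteration with a neural regressor, per Riedmiller 2005); the network size, iteration counts, and data layout below are assumptions for illustration:
```python
# Sketch of Neural Fitted Q Iteration (NFQ): repeatedly regress a Q-network
# onto bootstrapped targets computed over a fixed batch of transitions.
# Network size, iteration counts, and data layout are illustrative assumptions.
import torch
import torch.nn as nn

def nfq(transitions, obs_dim, n_actions, iterations=20, gamma=0.99):
    """transitions: list of (obs, action, reward, next_obs, done) tensors,
    with actions stored as int64 and done flags as floats."""
    q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                          nn.Linear(64, n_actions))
    opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
    obs, act, rew, next_obs, done = map(torch.stack, zip(*transitions))
    for _ in range(iterations):
        # Freeze targets for this iteration (the "fitted" step).
        with torch.no_grad():
            target = rew + gamma * (1 - done) * q_net(next_obs).max(dim=-1).values
        for _ in range(100):   # supervised regression epochs on the fixed batch
            q = q_net(obs).gather(1, act.unsqueeze(1)).squeeze(1)
            loss = (q - target).pow(2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
    return q_net
```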
-
Is RNN support available for the TD3 and SAC algorithms? On the Tianshou website there is a table that says RNNs are not supported for either TD3 or SAC; however, there are functions RecurrentCriti…
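Independent of what Tianshou itself supports, the idea behind a recurrent critic is to run the observation history through an RNN and condition the Q-value on the final hidden state plus the action. A minimal PyTorch sketch of that idea (not Tianshou's actual RecurrentCritic class, whose signature is not verified here):
```python
# Generic recurrent critic sketch: an LSTM summarises the observation
# history, and the Q-head conditions on that summary plus the action.
# This is an illustration, not Tianshou's actual RecurrentCritic class.
import torch
import torch.nn as nn

class RecurrentQNet(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs_seq, action, state=None):
        """obs_seq: (batch, time, obs_dim); action: (batch, act_dim)."""
        out, state = self.rnn(obs_seq, state)
        summary = out[:, -1]               # hidden state after the last step
        q = self.head(torch.cat([summary, action], dim=-1)).squeeze(-1)
        return q, state                    # carry state across env steps
```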
-
Support a continuous action space for selecting real-valued hyperparameters within the bounds specified by the algorithm space config:
- https://medium.com/@asteinbach/actor-critic-using-deep-rl-continuous-mounta…
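A minimal sketch of what such a space could look like: a Gym Box bounded by the config's hyperparameter ranges, with sampled actions mapped back to named real values. The config keys and bounds below are assumptions for illustration:
```python
# Sketch: a continuous hyperparameter-selection space built from bounds in
# an algorithm space config. Keys and bounds are illustrative assumptions.
import numpy as np
from gym import spaces

hp_bounds = {                 # hypothetical config: name -> (low, high)
    "lr": (1e-5, 1e-2),
    "gamma": (0.9, 0.999),
    "entropy_coef": (0.0, 0.1),
}

low = np.array([lo for lo, _ in hp_bounds.values()], dtype=np.float32)
high = np.array([hi for _, hi in hp_bounds.values()], dtype=np.float32)
hp_space = spaces.Box(low=low, high=high, dtype=np.float32)

def decode(action):
    """Map a point sampled from the Box back to named hyperparameters."""
    clipped = np.clip(action, low, high)
    return dict(zip(hp_bounds.keys(), clipped.tolist()))

print(decode(hp_space.sample()))
```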
-
I let [sim2real.py](https://github.com/Zhehui-Huang/quad-swarm-rl/blob/master/swarm_rl/sim2real/sim2real.py) create the C code for network evaluation; however, I am a bit confused about the calculation…
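For reference, the evaluation such generated code performs usually reduces to repeated matrix-vector products with a nonlinearity between layers. A minimal sketch of that arithmetic, assuming a plain feed-forward network (the tanh activation and layer layout are assumptions, not necessarily what sim2real.py emits):
```python
# Sketch of a plain MLP forward pass, i.e. the arithmetic that generated
# network-evaluation code typically unrolls. The tanh activation and layer
# layout are illustrative assumptions about the exported network.
import numpy as np

def mlp_forward(x, weights, biases):
    """weights: list of (out, in) matrices; biases: list of (out,) vectors."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(W @ x + b)           # hidden layers: affine map + nonlinearity
    return weights[-1] @ x + biases[-1]  # linear output layer
```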
-
Thanks for the reply; I have been busy with another project for the last few days and only recently got some spare time.
I have noticed that in comm_net, the variables of the communication part (maybe along with the encoder part) a…