-
How to deal with this in SAC training?
-
Hello,
First of all, thank you for providing the DDPG+HER code; it has been a great help. However, I have some basic questions as I am just starting to learn about reinforcement learning. After ada…
-
Hi, I used your code and trained a decent agent, but it doesn't brake, so I am now trying to implement stochastic braking.
I was wondering: do I need to uncomment both lines 94-99 and lines 105-112 in ddpg.py…
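Without seeing the commented-out blocks in ddpg.py it's hard to say which lines to enable, but one common way to get "stochastic braking" during exploration is to override the brake component of the action with some probability. A minimal sketch, where the `[steer, throttle, brake]` action layout, the helper name, and the probabilities are all assumptions rather than anything from the repo:

```python
import random

def apply_stochastic_brake(action, brake_prob=0.1, brake_strength=0.8):
    """Hypothetical helper (not from the repo): with probability
    `brake_prob`, force the brake component of a [steer, throttle, brake]
    action during exploration."""
    steer, throttle, brake = action
    if random.random() < brake_prob:
        brake = brake_strength   # apply a strong brake
        throttle = 0.0           # and cut the throttle at the same time
    return [steer, throttle, brake]

# Example: over many exploration steps, braking fires roughly 10% of the time.
steps = [apply_stochastic_brake([0.0, 0.5, 0.0]) for _ in range(10_000)]
brake_steps = sum(1 for a in steps if a[2] > 0)
```

The same idea can also be expressed as extra exploration noise on the brake dimension only, which keeps the learned policy itself deterministic.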
-
I cannot find this function.
-
Question to the author -- were you able to successfully learn policies to control the agents? I've been messing around with OpenAI Baselines hooked up to your environment. Using DDPG, so far I haven…
-
When executing the following cell:
```
df_summary = ensemble_agent.run_ensemble_strategy(A2C_model_kwargs,
                                                  PPO_model_kwargs,
                                                  …
```
-
I have tried to load the trained agent with these lines
```python
from stable_baselines3 import SAC

agent = SAC.load("BipedalWalker-v3.zip")
```
Where of course the file "BipedalWalker-v3.zip" comes from…
-
In the simple_tag environment there are 3 adversary agents and one good agent.
Your good agent seems to move randomly.
I think the DDPG algorithm should also be given to the good agent, which amounts to
3 predators and one prey learning in the same environment: the predators learn an encirclement strategy, and the prey learns an escape strategy.
That is exactly the experimental setup the original paper uses on simple_tag, although doing it this way, both the environment and the learning become…
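The structure this comment asks for, in which every agent (prey included) has its own learner, can be sketched as below. `TinyDDPGLearner`, the agent IDs, and the fake environment step are all hypothetical placeholders standing in for the repo's actual classes and for the MPE environment:

```python
import random

class TinyDDPGLearner:
    """Stand-in for a real DDPG learner, for structure only."""
    def __init__(self, agent_id):
        self.agent_id = agent_id
        self.buffer = []                       # per-agent replay buffer
    def act(self, obs):
        return [random.uniform(-1, 1), random.uniform(-1, 1)]
    def store(self, transition):
        self.buffer.append(transition)

# One independent learner per agent: 3 predators and 1 prey all learn.
agent_ids = ["adversary_0", "adversary_1", "adversary_2", "agent_0"]
learners = {aid: TinyDDPGLearner(aid) for aid in agent_ids}

obs = {aid: [0.0, 0.0] for aid in agent_ids}
for step in range(5):
    actions = {aid: learners[aid].act(obs[aid]) for aid in agent_ids}
    # A real env.step(actions) would go here; next_obs/rewards are faked.
    next_obs = {aid: [random.random(), random.random()] for aid in agent_ids}
    rewards = {aid: 0.0 for aid in agent_ids}
    for aid in agent_ids:
        learners[aid].store((obs[aid], actions[aid],
                             rewards[aid], next_obs[aid]))
    obs = next_obs
```

Each learner only sees its own observation and reward, which matches the independent-learner setup the comment describes for predator and prey.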
-
When building an ActorDistributionNetwork with bounded array_specs, the network occasionally produces actions that violate the bounds. This seems to be a result of the line `scale_distribution=False` …
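A library-agnostic illustration of why this happens (pure Python, not tf-agents code; the bounds and distribution parameters are made up): a raw Gaussian sample is unbounded and can land outside the action limits, whereas tanh-squashing the sample and rescaling it to the bounds keeps every action inside them, which is the kind of behavior `scale_distribution=True` is meant to provide.

```python
import math
import random

low, high = -1.0, 2.0   # hypothetical action bounds

def raw_sample(mean=0.5, std=2.0):
    # Unsquashed Gaussian sample: nothing constrains it to [low, high].
    return random.gauss(mean, std)

def squashed_sample(mean=0.5, std=2.0):
    # tanh maps the sample into (-1, 1); rescaling maps that into
    # (low, high), so the bounds hold by construction.
    u = random.gauss(mean, std)
    t = math.tanh(u)
    return low + (t + 1.0) * 0.5 * (high - low)

raw = [raw_sample() for _ in range(10_000)]
squashed = [squashed_sample() for _ in range(10_000)]
violations_raw = sum(1 for x in raw if not (low <= x <= high))
violations_squashed = sum(1 for x in squashed if not (low <= x <= high))
```

With squashing, bound violations are impossible rather than merely rare, so clipping after sampling is no longer needed.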
-
### Student
- Nikola Simić RA 32/2020
### Assistant
- Filip Volarić
### Problem being solved
- The agent's goal is to park in a parking spot in the shortest possible time. On the way…