-
Thanks for sharing this wonderful code, but I have a question.
1. Why, in the combining part of the equation, does the advantage A need to subtract its average? I've already referred to the paper but sti…
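For context: the mean is subtracted to resolve an identifiability issue. V and A are only determined up to a constant, because adding a constant to V and subtracting the same constant from A leaves Q unchanged; forcing the advantages to have zero mean per state pins the decomposition down (Wang et al., 2016). A minimal PyTorch sketch of the combining step, with illustrative tensor names that are not taken from this repository:
```
import torch

def combine(value: torch.Tensor, advantage: torch.Tensor) -> torch.Tensor:
    # value:     (batch, 1)         state value V(s)
    # advantage: (batch, n_actions) advantages A(s, a)
    # Subtracting the per-state mean advantage makes the decomposition
    # identifiable: without it, V and A could shift by any constant c.
    return value + advantage - advantage.mean(dim=1, keepdim=True)
```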
-
I changed the parameters in examples/dqn.py to this and I get an error:
```
def main():
    env_name = 'CartPole-v1'
    # env_name = 'PongNoFrameskip-v4'
    use_prioritization = True
    use_…
-
Hello, I am new to this field, and my question is: where can I find the structure of the neural network for the Dueling DQN algorithm in your code?
Thanks very much!
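For reference, a Dueling DQN usually splits into a value stream and an advantage stream after a shared feature extractor, then recombines them with the mean-subtracted advantage. A minimal PyTorch sketch; the class name, layer sizes, and hidden width are illustrative and not taken from this repository:
```
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        # Shared feature extractor.
        self.features = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # Two heads: state value V(s) and per-action advantages A(s, a).
        self.value = nn.Linear(hidden, 1)
        self.advantage = nn.Linear(hidden, n_actions)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = self.features(obs)
        v = self.value(h)
        a = self.advantage(h)
        # Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)   (Wang et al., 2016)
        return v + a - a.mean(dim=1, keepdim=True)
```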
-
Hi, if I run the code for Breakout, I am getting the following error.
Traceback (most recent call last):
  File "main.py", line 120, in <module>
    main()
  File "main.py", line 117, in main
    atari_…
-
**Important Note: We do not do technical support or consulting** and don't answer personal questions by email.
Please post your question on the [RL Discord](https://discord.com/invite/xhfNqQv), [R…
-
When using demo_DQN_Dueling_Double_DQN, the .pt file saved at the end of training cannot be used as the weights file at test time. Does saving the .pt file need to be changed from
torch.save(actor, actor_path)
to
torch.save(actor.state_dict(), actor_path)?
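For what it's worth, saving the state_dict is the approach the PyTorch documentation recommends: torch.save(actor, actor_path) pickles the whole module object, so loading requires the exact class definition and module path to be importable, while a state_dict only stores the parameters. A minimal sketch of the round trip; Actor here is a hypothetical stand-in for the repository's actual network:
```
import torch
import torch.nn as nn

class Actor(nn.Module):  # hypothetical stand-in for the real network
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(4, 2)

    def forward(self, x):
        return self.net(x)

actor = Actor()
actor_path = "actor.pt"

# Save only the parameters, not the pickled module object.
torch.save(actor.state_dict(), actor_path)

# At test time: rebuild the same architecture, then load the weights.
test_actor = Actor()
test_actor.load_state_dict(torch.load(actor_path))
test_actor.eval()
```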
-
From my understanding, the target network updates are implemented wrong in the notebook Double-Dueling-DQN.ipynb, as the target network is updated on the same step as the main network (every 4th). In this simple environmen…
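For context, the common pattern is to take a gradient step on the online network every few environment steps, but to copy (or Polyak-average) into the target network on a much longer period. A minimal sketch of the two schedules; the interval values and the stand-in networks are illustrative, not taken from the notebook:
```
import torch.nn as nn

TRAIN_EVERY = 4            # gradient step every 4 environment steps
TARGET_SYNC_EVERY = 1000   # hard copy into the target network far less often

online_net = nn.Linear(4, 2)   # stand-in for the Q-network
target_net = nn.Linear(4, 2)
target_net.load_state_dict(online_net.state_dict())

for step in range(10_000):
    # ... act in the environment and store the transition ...
    if step % TRAIN_EVERY == 0:
        pass  # one SGD step on the online network using a sampled minibatch
    if step % TARGET_SYNC_EVERY == 0:
        target_net.load_state_dict(online_net.state_dict())
```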
-
In the section introducing Dueling DQN, it says: "In the same state, the advantage values of all actions sum to 0, because the expectation of the action values of all actions is the state value of that state." My understanding is that the expectation of the advantage values of all actions under the policy π is 0, rather than their sum being 0. I'm not sure whether my understanding is correct.
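For reference, the reading as an expectation matches the standard definitions. Since V^π(s) = E_{a∼π}[Q^π(s,a)] and A^π(s,a) = Q^π(s,a) − V^π(s), the advantage has zero expectation under π; this coincides with a zero sum only when π is uniform over actions:
```latex
\mathbb{E}_{a \sim \pi(\cdot \mid s)}\big[A^{\pi}(s,a)\big]
  = \mathbb{E}_{a \sim \pi(\cdot \mid s)}\big[Q^{\pi}(s,a)\big] - V^{\pi}(s)
  = V^{\pi}(s) - V^{\pi}(s) = 0
```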
-
{
    "base_config": "configs/HighwayEnv/agents/DQNAgent/ddqn.json",
    "model": {
        "type": "EgoAttentionNetwork",
        "embedding_layer": {
            "type": "MultiLayerPerceptron",…