-
Hello.
I am running your code on the Atari game Breakout-v0.
The settings are the simple DQN (NIPS), DQN (Nature), DDQN, Dueling DQN, and Dueling DDQN.
Now each process has run for almost 6M (6,000,000)…
-
line 56
` advantage = tl.layers.ElementwiseLambda(lambda x,y: x-y)([avalue,mean]) #a - avg(a)`
the variable `advantage` is not used; is anything wrong?
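For reference, in a dueling head the centered advantage is normally recombined with a state-value stream to form Q. A minimal sketch mirroring the `ElementwiseLambda` pattern above, where `svalue` is a hypothetical name for the script's state-value head:

```python
# Hypothetical wiring in the same TensorLayer style as line 56.
# svalue:    V(s), shape (batch, 1), broadcast over the action dimension
# advantage: A(s, a) - avg_a A(s, a), the centered advantage from line 56
qvalue = tl.layers.ElementwiseLambda(lambda x, y: x + y)([svalue, advantage])  # Q = V + (A - avg(A))
```

If `advantage` never feeds into a sum like this, the dueling aggregation is indeed missing.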
-
Hey @MikeInnes, if you are back, could you please review the code? The new models I have added are Dueling DQN, Advantage Actor-Critic, and DDPG. Also, all the previous work done on DQN is added to d…
-
Hello, I would like to ask where your code uses D3QN. I was confused when I saw TD3 in the train.py code, and would like to ask you what happened. Additionally, I am also confused about the use of duel…
-
Hello teacher, Section 6.3 on the Dueling Network does not seem to explain why Dueling DQN splits the Q-value function, so I was still a bit confused after first reading the section; I hope some explanation of this can be added. (Of course, if I simply missed that part, my sincere apologies 😂)
My current rough understanding of Dueling DQN is that it splits the Q-value function in order to consider the state and the action separately, so that one can tell whether a high Q value is because the state is good or…
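That reading matches the standard presentation: the dueling architecture splits Q into a state-value stream and an advantage stream, and subtracts the advantage's mean so the two streams stay identifiable. In the usual formulation (general notation, not necessarily the book's):

```latex
Q(s, a) = V(s) + \Big( A(s, a) - \frac{1}{|\mathcal{A}|} \sum_{a'} A(s, a') \Big)
```

Without the mean subtraction, adding a constant to V and subtracting it from A leaves Q unchanged, so V and A could not be learned uniquely.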
-
The current design is the most basic architecture for deep RL. The following are some improvements that can be made to Q-learning (a minimal sketch of the first two follows this list).
- [x] Experience Replay
- [x] Usage of 'Target Network' (See deepmind…
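As referenced above, a minimal sketch of the first two items; `online_net` and `target_net` are hypothetical Keras-style models, not objects from this repo:

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Fixed-size store of (s, a, r, s', done) transitions, sampled uniformly."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
        return states, actions, rewards, next_states, dones

def sync_target(online_net, target_net):
    """Copy online weights into the target network, which is held fixed
    between syncs to stabilize the bootstrap target r + gamma * max_a Q(s', a)."""
    target_net.set_weights(online_net.get_weights())
```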
-
I would like to port the methods from this repository.
https://github.com/MotoShin/dqn-tutorial
-
Hello, I have recently been reading papers on reinforcement learning, came across your code by chance, and am now studying it.
Here are some questions I have run into:
1. So-called DQN means using a CNN to predict Q values, so what is the difference between DQN and CNN? Is it the loss function? (See the sketch after this list.)
2. How do I change the dataset? For example, from ft06 to la 09.
3. The code defaults dueling to False; does that mean the DDQN model is used?
4. What is the reason this appears in the first few iterations?
![imag…
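On question 1: architecturally, a DQN's CNN is an ordinary CNN; the difference is the training target rather than the network. Instead of supervised labels, DQN regresses Q(s, a) toward a bootstrapped TD target. A minimal sketch of both the DQN and Double-DQN targets, with hypothetical array inputs (not this repo's code):

```python
import numpy as np

def dqn_target(q_target_next, rewards, dones, gamma=0.99):
    # Vanilla DQN: r + gamma * max_a Q_target(s', a), zeroed at terminal states.
    return rewards + gamma * (1.0 - dones) * q_target_next.max(axis=1)

def ddqn_target(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    # Double DQN: the online net selects the action, the target net evaluates it.
    best_actions = q_online_next.argmax(axis=1)
    chosen = q_target_next[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - dones) * chosen
```

The loss is then, e.g., MSE or Huber between Q_online(s, a) and these targets; that TD loss, not the convolutional architecture, is what makes a CNN a DQN.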
-
Hello Hardworking Contributors,
Is there a way for me to see the robot in action in the real virtual experimental environment after training it on Dueling DQN using StableBaselines3? I understand…
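Not specific to this repo, but if the model was saved through Stable-Baselines3's own API, a typical evaluation/render loop looks like the sketch below; the env id and save path are placeholders, and note that SB3's built-in DQN is not dueling, so this assumes the dueling policy used in training is importable at load time:

```python
import gymnasium as gym
from stable_baselines3 import DQN

env = gym.make("YourRobotEnv-v0", render_mode="human")  # placeholder env id
model = DQN.load("dueling_dqn_robot")  # placeholder save path

obs, info = env.reset()
for _ in range(1_000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```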
-
def dueling_dqn(input_shape, action_size, learning_rate):
    ...
    # In Keras 2+ the keyword is `axis`, not `dim`; (batch,) -> (batch, 1),
    # later broadcast over the action dimension when summed with the advantage.
    state_value = Lambda(lambda s: K.expand_dims(s[:, 0], axis=-1), output_shape=(action_size,))(state_value)
    ...
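For context, a complete dueling head in this Keras style might look like the following sketch; the layer wiring is illustrative, and the scalar state value broadcasts over the action dimension when added to the centered advantage:

```python
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Lambda

def dueling_head(features, action_size):
    # V(s): one scalar per sample.
    state_value = Dense(1)(features)
    # A(s, a): one value per action, centered so V and A stay identifiable.
    advantage = Dense(action_size)(features)
    advantage = Lambda(lambda a: a - K.mean(a, axis=1, keepdims=True))(advantage)
    # Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a)); (batch, 1) broadcasts to (batch, n).
    return Lambda(lambda va: va[0] + va[1])([state_value, advantage])
```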