Closed 844015539 closed 4 years ago
hey @844015539 ,
thank you for your feedback! First, Munchausen is a new concept that was published only recently. I wrote a Medium article about it: Article, or you can check out the original Paper. When you run a normal experiment, just leave munchausen set to 0.
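For anyone else landing on this thread: the core Munchausen trick can be sketched in a few lines. This is a hedged illustration of the idea from the paper (Vieillard et al., 2020), not the repo's actual code; the environment reward is augmented with the scaled, clipped log-policy of the taken action, and the hyperparameter values below are illustrative.

```python
# Hedged sketch of the Munchausen reward augmentation, not the repo's code:
#   r_m = r + alpha * clip(tau * log pi(a|s), l0, 0)
# The log-policy term is non-positive, and clipping it from below at l0
# keeps the bonus numerically stable.
def munchausen_reward(reward, log_pi_a, alpha=0.9, tau=0.03, l0=-1.0):
    clipped = min(max(tau * log_pi_a, l0), 0.0)  # clip to [l0, 0]
    return reward + alpha * clipped
```

With the munchausen scaling set to 0 the augmentation term vanishes and the agent trains as plain (noisy/PER) IQN, which is why leaving it at 0 gives a normal experiment.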
IQN-DQN.ipynb is just a simple notebook for the base algorithm. You probably don't need it, but if you want to open it you need to start Jupyter Notebook.
Hello Dittert, I'm sorry to disturb you. I want to ask some questions about your code for the agent, mainly about "def learn_per(self, experiences):".
1. In
Q_targets_next = Q_targets_next.gather(2, action_indx.unsqueeze(-1).expand(self.BATCH_SIZE, self.N, 1)).transpose(1,2)
Q_expected = Q_expected.gather(2, actions.unsqueeze(-1).expand(self.BATCH_SIZE, self.N, 1))
is "action_indx" wrong? I didn't see any definition of it in the earlier code, and in the "Q_expected" line it becomes "actions".
2. In the assertion assert td_error.shape == (BATCH_SIZE, self.N, self.N), "wrong td error shape", why does "self.N" appear twice?
I am looking forward to your reply.
Oh, I'm sorry, it was my fault. I was too careless and missed the definition of action_indx.
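For other readers with the same question, the indexing pattern can be sketched in isolation. This is an illustrative example with small made-up shapes, not the repo's actual values: IQN-style Q-values have shape (batch, N, n_actions), where N is the number of quantiles, and gather picks the chosen action's quantiles per sample.

```python
import torch

# Illustrative shapes, not the repo's defaults
batch, N, n_actions = 2, 3, 4
q = torch.arange(batch * N * n_actions, dtype=torch.float32).reshape(batch, N, n_actions)

# greedy action per sample from the quantile-mean Q-values, shape (batch, 1);
# in the Q_expected line the replayed "actions" play the same role
action_indx = q.mean(dim=1).argmax(dim=1, keepdim=True)

# expand the index over the quantile dimension so gather can select the same
# action in every quantile row: (batch, 1) -> (batch, 1, 1) -> (batch, N, 1)
idx = action_indx.unsqueeze(-1).expand(batch, N, 1)

picked = q.gather(2, idx)  # (batch, N, 1): all N quantiles of the chosen action
```

The later transpose(1, 2) only reorders this to (batch, 1, N) so the target and expected quantiles broadcast into the (batch, N, N) pairwise TD-error matrix that the assert checks, which is why self.N appears twice there.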
Hello Dittert, I want to know what "self.layer_size = layer_size" means. Does it refer to the size of the convolution layers' output? I saw some code in "Deep Reinforcement Learning Hands-On"; does 'layer_size' mean the same thing as 'conv_out_size'? I think the following definition would be clearer. I hope you can answer my question. Thank you!
self.conv = nn.Sequential(
    nn.Conv2d(input_shape[0], 32, kernel_size=8, stride=4),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2),
    nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, stride=1),
    nn.ReLU()
)
conv_out_size = self._get_conv_out(input_shape)
self.fc_adv = nn.Sequential(
    nn.Linear(conv_out_size, 256),
    nn.ReLU(),
    nn.Linear(256, n_actions)
)
self.fc_val = nn.Sequential(
    nn.Linear(conv_out_size, 256),
    nn.ReLU(),
    nn.Linear(256, 1)
)

def _get_conv_out(self, shape):
    # pass a dummy observation through the conv stack to get the flattened size
    o = self.conv(torch.zeros(1, *shape))
    return int(np.prod(o.size()))
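For context, the fc_val and fc_adv heads quoted above are typically combined in the network's forward pass with the standard dueling aggregation. A minimal, self-contained sketch (assuming value has shape (batch, 1) and advantage has shape (batch, n_actions); this is the textbook formula, not a quote from either codebase):

```python
import torch

def dueling_q(value, advantage):
    # Q(s, a) = V(s) + A(s, a) - mean_a A(s, a); subtracting the mean
    # advantage keeps the value/advantage decomposition identifiable
    return value + advantage - advantage.mean(dim=1, keepdim=True)

q = dueling_q(torch.zeros(1, 1), torch.tensor([[1.0, 2.0, 3.0]]))
```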
hey @844015539,
layer_size is the number of neurons in a linear layer. You can read about it in the README as well:
Size of the hidden layer, default=512
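In other words, layer_size sets the width of the fully connected hidden layers and is unrelated to conv_out_size. A minimal sketch of how such a parameter is typically used (state_size and action_size below are illustrative values, not the repo's):

```python
import torch
import torch.nn as nn

layer_size = 512                  # width of each hidden linear layer (README default)
state_size, action_size = 8, 4    # illustrative sizes, not the repo's values

head = nn.Sequential(
    nn.Linear(state_size, layer_size),
    nn.ReLU(),
    nn.Linear(layer_size, action_size),
)
out = head(torch.zeros(1, state_size))
```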
Hello Dittert, I'm sorry to disturb you. I want to ask some questions about your code.
1. What does "Munchausen RL" mean? I searched the net but couldn't find anything about it. If I just run a normal experiment with noisy_iqn_per, should I take the "if not self.munchausen:" branch?
2. I can't open your notebook "IQN-DQN.ipynb". Is something wrong with it?
I like Germany very much, but my German is not good. Thank you very much!
Best regards,
Lin Yuan