Open happy1234happy opened 2 weeks ago
Hi, yes, the target tau is 1. In other words, the target network is a direct copy of the source network, because the RL agent is trained at the end of the episode. I ran extensive experiments, reported in chapter 2 of my thesis, and found that training at every step is prone to overfitting, while training at the end of the episode generalizes.
Tau = 0.01 works well when training at every step, but I found that tau = 1 works better when training at the end of the episode. I selected this value based on the experiments in chapter 2 of my thesis. I did not run this CCM_MADRL with tau = 0.01.
On Tue, 12 Nov 2024, 09:36 happy1234happy wrote:
Hello, author. I noticed that in line 15 of CCM_MADRL.py, you set target_tau=1. I would like to know the reason behind this setting. In other works, I usually see the soft update rate target_tau set to a value smaller than 1, such as 0.001. In your code, at line 124: `t.data.copy_((1. - self.target_tau) * t.data + self.target_tau * s.data)`, if target_tau=1, this becomes `t.data.copy_(0 * t.data + 1 * s.data)`, which means the parameters of the target network are copied directly from the source network, and the two are fully synchronized every time. In this case, would this still be considered a soft update?
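The update rule quoted above can be sketched in plain Python to show how `tau` controls the blend. This is only an illustrative sketch, not the code from CCM_MADRL.py: the `soft_update` helper, the list-of-floats parameters, and the example values are all assumptions made for the illustration.

```python
def soft_update(target, source, tau):
    """Blend source into target in place: t <- (1 - tau) * t + tau * s.

    Hypothetical helper operating on plain lists of floats instead of
    network parameter tensors.
    """
    for i, (t, s) in enumerate(zip(target, source)):
        target[i] = (1.0 - tau) * t + tau * s


# Made-up parameter values, for illustration only.
source = [0.5, -1.0, 2.0]

target_soft = [0.0, 0.0, 0.0]
soft_update(target_soft, source, tau=0.01)  # moves 1% of the way toward source

target_hard = [0.0, 0.0, 0.0]
soft_update(target_hard, source, tau=1.0)   # becomes an exact copy of source

print(target_soft)
print(target_hard)
```

With `tau=1.0` the old target values are multiplied by zero, so the call reduces to a plain copy (often called a hard update); any `tau` strictly between 0 and 1 leaves the target trailing the source.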
Thank you very much for your response. If target_tau = 1, the parameters of the target network are directly and fully copied from the source network at each step, meaning the target network's parameters will always be identical to those of the source network. In this case, does the target network lose its purpose? Since its structure and parameters are always the same as those of the source network, it essentially becomes equivalent to using the source network. I’m not sure if my understanding is correct, and I would appreciate your answer. Thank you.
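One way to explore this question is a toy simulation that measures how far the target is from the source at the moment a TD target would be computed. It is a sketch under loud assumptions, not a description of what CCM_MADRL.py does: the single scalar "parameter", the fixed `+1.0` stand-in for a gradient step, and the names `gaps_before_each_update` and `sync_every` are all hypothetical.

```python
def soft_update(target, source, tau):
    """Return the blended value (1 - tau) * target + tau * source."""
    return (1.0 - tau) * target + tau * source


def gaps_before_each_update(tau, sync_every, n_updates=6):
    """Record |source - target| just before each source update.

    Toy model: the source "parameter" grows by 1.0 per update (a stand-in
    for a gradient step), and the target is synced every `sync_every`
    updates using the given tau.
    """
    source, target = 0.0, 0.0
    gaps = []
    for step in range(1, n_updates + 1):
        gaps.append(abs(source - target))  # gap when a TD target would be computed
        source += 1.0                      # stand-in for one gradient step
        if step % sync_every == 0:
            target = soft_update(target, source, tau)
    return gaps


every_update = gaps_before_each_update(tau=1.0, sync_every=1)
every_third = gaps_before_each_update(tau=1.0, sync_every=3)
soft = gaps_before_each_update(tau=0.01, sync_every=1)

print(every_update)
print(every_third)
print(soft)
```

In this toy setting, `tau=1` with a sync after every single update leaves no gap, which matches the concern raised above. A gap appears only when several source updates happen between syncs, or when `tau` is below 1. Which case applies to the actual training loop depends on how many updates it performs between target syncs, which this sketch does not model.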