Closed: TomLin closed this issue 5 years ago
Hi!
Yes, you're right, that indexing is now needed for the code to work properly, but I'm not sure why it worked fine before :). Probably some subtle change in NumPy indexing. I'll implement and commit the change you've proposed.
Thanks!
Sorry for the long wait. Fixed by PR #13: https://github.com/PacktPublishing/Deep-Reinforcement-Learning-Hands-On/pull/13
https://github.com/PacktPublishing/Deep-Reinforcement-Learning-Hands-On/blob/a307a952bb914a1b5a43b7c92b7237ca28d88d28/Chapter14/06_train_d4pg.py#L79-L86
Hello Maxim,
Your book has been really helpful for getting D4PG implemented quickly, but I am wondering whether the indexing for l and u should be modified as follows:
```python
proj_distr[eq_dones, l[eq_mask]] = 1.0
proj_distr[ne_dones, l[ne_mask]] = (u - b_j)[ne_mask]
proj_distr[ne_dones, u[ne_mask]] = (b_j - l)[ne_mask]
```

Thanks for any clarification.
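For context, below is a minimal, self-contained sketch of the terminal-state branch of the distributional projection that the lines above come from. The function name project_terminal_states and its argument list are introduced here for illustration only; the variable names (proj_distr, b_j, l, u, eq_dones, ne_dones and the masks) follow the snippet above, but this is a simplified reconstruction under those assumptions, not the repository's exact implementation. It shows why l and u, which hold one entry per done transition, need to be sub-indexed with eq_mask / ne_mask before being paired with eq_dones / ne_dones:

```python
import numpy as np

def project_terminal_states(rewards, dones, n_atoms, Vmin, Vmax):
    """Sketch of the terminal-state branch of a C51-style projection:
    for finished episodes the target distribution collapses onto the
    atom(s) closest to the clipped reward."""
    delta_z = (Vmax - Vmin) / (n_atoms - 1)
    batch_size = len(rewards)
    proj_distr = np.zeros((batch_size, n_atoms), dtype=np.float32)

    if dones.any():
        # Clip terminal rewards into the support [Vmin, Vmax] and find
        # their (fractional) position on the atom grid.
        tz_j = np.minimum(Vmax, np.maximum(Vmin, rewards[dones]))
        b_j = (tz_j - Vmin) / delta_z
        l = np.floor(b_j).astype(np.int64)
        u = np.ceil(b_j).astype(np.int64)

        # Case 1: b_j lands exactly on an atom -- all mass goes there.
        eq_mask = u == l
        eq_dones = dones.copy()
        eq_dones[dones] = eq_mask  # batch rows that are done AND exact
        if eq_dones.any():
            # l has one entry per done transition, so it must be
            # sub-indexed with eq_mask to match the rows eq_dones selects.
            proj_distr[eq_dones, l[eq_mask]] = 1.0

        # Case 2: b_j falls between two atoms -- split the mass.
        ne_mask = u != l
        ne_dones = dones.copy()
        ne_dones[dones] = ne_mask
        if ne_dones.any():
            proj_distr[ne_dones, l[ne_mask]] = (u - b_j)[ne_mask]
            proj_distr[ne_dones, u[ne_mask]] = (b_j - l)[ne_mask]

    return proj_distr
```

The key point is that eq_dones selects only a subset of the done rows, while l and u have one entry for every done transition; without the eq_mask / ne_mask sub-indexing the row and column index arrays would have different lengths and the fancy-indexing assignment would not line up.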