Chapter 7: What is the use of eq_mask?

Hi!

1) Am I correct in assuming we are trying to get the indices of values which fall directly on an atom (l == u) and have their dones set to True?

Yes, you're right. During projection, we calculate the positions of the shifted atoms on the old support. If a projection falls somewhere between two atoms of the old support, we need to distribute its weight between them, but if it lands exactly on an atom, there is nothing to distribute: we just copy the weight to that position.
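To make the two cases concrete, here is a minimal sketch for a single shifted atom, in plain Python. The function name `project_atom` and its parameters are made up for this illustration and are not taken from the book's code; it only shows the l == u versus l != u logic described above.

```python
import math


def project_atom(value, prob, v_min, v_max, n_atoms):
    """Spread the weight `prob` of one shifted atom over a fixed support.

    Returns a dict {atom_index: weight}. Hypothetical helper, for illustration.
    """
    delta_z = (v_max - v_min) / (n_atoms - 1)
    # Clip the shifted atom into the range covered by the support.
    value = min(v_max, max(v_min, value))
    # Fractional position of the value on the support.
    b = (value - v_min) / delta_z
    l, u = math.floor(b), math.ceil(b)
    if l == u:
        # Landed exactly on an atom: copy the whole weight there.
        return {l: prob}
    # Landed between two atoms: the closer atom gets the larger share.
    return {l: prob * (u - b), u: prob * (b - l)}


# Support of 5 atoms at -2, -1, 0, 1, 2.
on_atom = project_atom(1.0, 0.5, -2.0, 2.0, 5)    # exactly on atom 3
between = project_atom(0.25, 1.0, -2.0, 2.0, 5)   # between atoms 2 and 3
```

Here `on_atom` carries the full weight on one index, while `between` splits the weight over two neighbouring indices that still sum to the original weight.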

The piece of code you've quoted corresponds to the branch where we handle the batch samples with done=True. For those samples, the probability distribution is just 1 at the atom which corresponds to the obtained reward.
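A rough NumPy sketch of what such a done=True branch could look like is below. It is an assumption-laden illustration, not the code from the repository: the name `terminal_projection` is hypothetical, rows with done=False are simply left at zero here (in a full projection they would keep the distribution computed from the next state), and integer row indices are used in place of boolean masks. Rewards that fall between two atoms are split between the neighbours in this sketch.

```python
import numpy as np


def terminal_projection(rewards, dones, v_min, v_max, n_atoms):
    """Target distributions for samples with done=True (hypothetical helper)."""
    delta_z = (v_max - v_min) / (n_atoms - 1)
    proj = np.zeros((len(rewards), n_atoms), dtype=np.float32)
    # Row indices of the terminal samples in the batch.
    rows = np.flatnonzero(dones)
    # No bootstrapping for terminal samples: the target is the reward itself.
    tz = np.clip(rewards[rows], v_min, v_max)
    b = (tz - v_min) / delta_z
    l = np.floor(b).astype(np.int64)
    u = np.ceil(b).astype(np.int64)
    eq = l == u
    # Reward lands exactly on an atom: all probability mass goes there.
    proj[rows[eq], l[eq]] = 1.0
    # Reward lands between two atoms: split the mass by proximity.
    ne = ~eq
    proj[rows[ne], l[ne]] = (u - b)[ne]
    proj[rows[ne], u[ne]] = (b - l)[ne]
    return proj


rewards = np.array([1.0, 0.0, 0.5, 5.0])
dones = np.array([True, False, True, True])
proj = terminal_projection(rewards, dones, v_min=-2.0, v_max=2.0, n_atoms=5)
```

With this support (atoms at -2, -1, 0, 1, 2), sample 0 puts all its mass on atom 3, sample 2 is split evenly between atoms 2 and 3, and sample 3 is clipped to Vmax and lands on atom 4.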

2) Can you also please explain line 176 (the Boolean tensor taking a mask of itself is slightly confusing)?

This expression is a slightly silly way to get True only where eq_mask == True && dones == True. It could probably be written better, but it already took me some time to obtain the result I expected :).
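For anyone else puzzled by this, here is a toy example of the trick, assuming the line in question follows the pattern of copying `dones` and then assigning `eq_mask` into the copy at the positions where `dones` is True. The arrays are made up; the point is that `eq_mask` is computed only for the done samples, so it is shorter than the batch and cannot be combined with `dones` by a plain `&`.

```python
import numpy as np

# Batch of 4 samples, 3 of them terminal.
dones = np.array([True, False, True, True])
# One entry per *done* sample (so 3 entries, not 4):
# True where that sample's reward landed exactly on an atom.
eq_mask = np.array([True, False, True])

# `dones & eq_mask` would fail: shapes (4,) and (3,) do not broadcast.
# Instead, start from a copy of dones and overwrite its True positions
# with eq_mask, in order.
eq_dones = dones.copy()
eq_dones[dones] = eq_mask

# Same selection expressed as integer row indices.
rows = np.flatnonzero(dones)[eq_mask]
```

The result `eq_dones` is a batch-sized mask that is True only for samples which are both done and exactly on an atom, so it can index the rows of the batch-sized projection array, while `eq_mask` indexes the shorter per-done arrays such as `l`.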

To visualize the effect of this distribution projection function, you can check my ad hoc tool: Chapter07/adhoc/distr_test.py. I suggest stepping through it in a debugger to check the states of all the masks and the resulting distributions.

PacktPublishing / Deep-Reinforcement-Learning-Hands-On

Chapter 7: What is the use of eq_mask? #21