Hi,
I have a question regarding the projection_distribution method. It seems that when you project back onto the support/bins, the distribution next_dist is scaled by the support in the line
next_dist = target_model(next_state).data.cpu() * support
It seems like this should not be the case, since it results in the final projected distribution not summing to one. It seems one should do something like
This results in a distribution that contains the same amount of mass as the original one.
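A minimal sketch of one possible fix, assuming target_model returns per-action probabilities over the atoms: use the support only to compute expected values for action selection, and keep the selected distribution unscaled. Plain Python lists stand in for tensors so it runs without PyTorch. The numbers and the helper names (expected_values, select_next_dist, select_next_dist_scaled) are made up for illustration, and this is not necessarily the snippet that was originally posted.

```python
# Hypothetical support: 5 atoms evenly spaced on [v_min, v_max].
v_min, v_max, num_atoms = -1.0, 1.0, 5
delta_z = (v_max - v_min) / (num_atoms - 1)
support = [v_min + i * delta_z for i in range(num_atoms)]

# Made-up stand-in for target_model(next_state) on a single state:
# one probability row per action, each row summing to one.
probs = [
    [0.1, 0.2, 0.4, 0.2, 0.1],  # action 0
    [0.0, 0.1, 0.2, 0.3, 0.4],  # action 1
]


def expected_values(dist, support):
    """Expected return of each action: sum over atoms of p_i * z_i."""
    return [sum(p * z for p, z in zip(row, support)) for row in dist]


def select_next_dist(dist, support):
    """Pick the greedy action by expected value, return its UNSCALED row."""
    q = expected_values(dist, support)
    best = max(range(len(q)), key=q.__getitem__)
    return best, dist[best]


def select_next_dist_scaled(dist, support):
    """Behaviour described in the post: the returned row is p_i * z_i."""
    scaled = [[p * z for p, z in zip(row, support)] for row in dist]
    q = [sum(row) for row in scaled]
    best = max(range(len(q)), key=q.__getitem__)
    return best, scaled[best]


best, next_dist = select_next_dist(probs, support)
_, scaled_dist = select_next_dist_scaled(probs, support)

print("greedy action:", best)
print("mass of unscaled next_dist:", sum(next_dist))
print("mass of support-scaled next_dist:", sum(scaled_dist))
```

Both variants pick the same greedy action, but only the unscaled row keeps a total mass of one. The scaled row sums to the action's expected value instead, which is what would then get projected onto the bins.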
Thank you, Lucas