Simplify dense reward calculation if reward_only_positive is true

In this part of the dense reward calculation method, the damage and deaths of ally units are accumulated into variables delta_ally and delta_deathsand used to compose the reward later. Notice how dealth_deaths is only changed if self.reward_only_positive is false:

https://github.com/oxwhirl/smac/blob/a185b7082dc5a12debdec8a344cf5177a7f67fff/smac/env/starcraft2/starcraft2.py#L684-L701

When the reward is calculated using the previous accumulated values, delta_ally is only used if self.reward_only_positive is false. The version of delta_deaths that is altered in the ally loop above is also only used if self.reward_only_positive is false.

https://github.com/oxwhirl/smac/blob/a185b7082dc5a12debdec8a344cf5177a7f67fff/smac/env/starcraft2/starcraft2.py#L716-L719

This makes me conclude that we only need to process ally units in this method if self.reward_only_positive is false, otherwise we can ignore the first loop. I don't know how much this would affect performance (this is a method that runs on every game step, after all) but I could come up with this simplified version. I'd just like others to validate if what I said is true.

oxwhirl / smac

Simplify dense reward calculation if reward_only_positive is true #54