NVlabs / GA3C

Hybrid CPU/GPU implementation of the A3C algorithm for deep reinforcement learning.
BSD 3-Clause "New" or "Revised" License
652 stars, 195 forks

Unnecessary relu applied to action probability logits #17

Closed msokoloff1 closed 7 years ago

msokoloff1 commented 7 years ago

In NetworkVP.py, on line 93, the activation function of the layer that produces the action probability logits should be explicitly set to None. As it currently stands, the logits are passed through a ReLU non-linearity before softmax is applied.
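To illustrate why this matters, here is a minimal standalone sketch (plain Python, not the repository's TensorFlow code) that compares softmax applied to raw logits against softmax applied after a ReLU. The logit values and the helper names `softmax` and `relu` are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relu(values):
    """Element-wise ReLU: negative entries are clamped to zero."""
    return [max(v, 0.0) for v in values]


# Hypothetical pre-activation outputs for three actions.
logits = [2.0, -1.0, -3.0]

probs_linear = softmax(logits)       # no activation before softmax (activation=None)
probs_relu = softmax(relu(logits))   # ReLU applied before softmax (current behaviour)

print("linear logits:", probs_linear)
print("relu logits:  ", probs_relu)
```

With the ReLU in place, every negative logit collapses to zero, so the corresponding actions all receive the same probability and the policy can no longer rank them. The ReLU also has zero gradient for negative pre-activations, so those outputs receive no learning signal through this layer.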