In worker.py, we need to sample experience by using local model and then backpropagate the gradients of total loss. The total loss is composed of loss_of_actor, loss_of_critic and entropy.
In line 132, you have already multiply 0.5 to value_loss_vb, but in line 152 you to that again. Does it mean the factor of critic loss in total loss is 0.25? Is it what you want to do?
In worker.py, we need to sample experience by using local model and then backpropagate the gradients of total loss. The total loss is composed of loss_of_actor, loss_of_critic and entropy.
In line 132, you have already multiply 0.5 to value_loss_vb, but in line 152 you to that again. Does it mean the factor of critic loss in total loss is 0.25? Is it what you want to do?
Thank you. :)