zroug opened this issue 4 years ago
@summer-yue Can you PTAL at this?
Could you please elaborate? I see that `l2_regularization_loss` includes the regularization losses and is added to `total_loss`. The coefficients default to 0, but if you want to use regularization losses, you should be able to set `value_function_l2_reg` and `policy_l2_reg`.
I mean regularization that was added to the network with the `add_loss` function. For example, activity regularization and other more advanced regularization can be done with this function. The losses can be obtained via the `losses` member of the `Network` class. `DqnAgent`, for example, uses them instead of implementing its own regularization.
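The pattern being described can be sketched with a small pure-Python toy. The names `add_loss` and `losses` mirror the Keras-style API, but this is an illustrative mock, not the actual tf_agents implementation:

```python
# Toy sketch of the Keras-style add_loss / losses pattern
# (illustrative only, not the real Network or DqnAgent code).

class Network:
    def __init__(self):
        self._losses = []

    def add_loss(self, loss_fn):
        # Register an extra loss term, e.g. an activity regularizer,
        # at network construction time.
        self._losses.append(loss_fn)

    @property
    def losses(self):
        # Evaluate every registered regularization term.
        return [fn() for fn in self._losses]


def total_loss(task_loss, network):
    # The pattern DqnAgent follows: add the network's own registered
    # losses to the task loss instead of re-implementing regularization.
    return task_loss + sum(network.losses)


net = Network()
weights = [0.5, -1.0, 2.0]
# An l2-style penalty registered on the network itself.
net.add_loss(lambda: 1e-2 * sum(w * w for w in weights))

loss = total_loss(1.0, net)  # task loss plus the registered penalty
```

The point of the issue is that PPO's loss computation never consults `network.losses`, so a penalty registered this way is silently dropped.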
Oh I see. Thanks for explaining. So you are pointing out the inconsistency between PPO and other agents in terms of where losses are calculated, not saying that PPO doesn't do regularization at all.
We are checking with the team. There might be some historical reasons for this.
Yes, exactly.
I formulated it that way because I had a network whose regularization was defined at network creation time via that function, and it wasn't applied, which is a bit surprising. Regularization that is defined in the `PPOAgent` constructor is applied just fine.
A drawback of the current implementation is that it can only do l1 and l2 regularization, and that it can't be restricted to a specific set of weights, for example.
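To illustrate that limitation: a single l2 coefficient applies one penalty uniformly to every weight, whereas an `add_loss`-style hook can attach an arbitrary penalty to a chosen subset. A pure-Python sketch (all names here are made up for illustration, not tf_agents API):

```python
# Illustrative contrast between a single global l2 coefficient and a
# custom per-subset penalty. Names are invented for this toy example.

def global_l2(weights, coeff):
    # What a value_function_l2_reg / policy_l2_reg style coefficient
    # can express: one l2 penalty over *all* weights.
    return coeff * sum(w * w for w in weights)

def custom_penalty(weights, subset, coeff):
    # What an add_loss hook can express: any function, restricted to
    # chosen weights (here an l1 penalty on just two entries).
    return coeff * sum(abs(weights[i]) for i in subset)

weights = [0.5, -1.0, 2.0]
full = global_l2(weights, 1e-2)              # penalizes every weight
partial = custom_penalty(weights, [0, 1], 1e-1)  # only weights 0 and 1
```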
Thanks for checking with the team!
Sorry for the delayed response. We are not sure why this inconsistency exists; there are probably historical reasons. We could look into this once we get some bandwidth. Also feel free to submit a PR and contribute yourself :)
This is simply an oversight. I think we need to add an agent validator to ensure agents properly use layer losses, or raise an exception if they don't support separate losses.
@oars fyi, can you include this in the validator bug?
`PPOAgent` does not use regularization losses that are defined in the `value_net` and `actor_net`. It should add them to the other losses.