Open zafarmah92 opened 5 years ago
Using the normalized reward (#6 ) with the other agent's, taking the example of A2C where the discounted rewards are used on the extrinsic reward.
Using the normalized reward (#6 ) with the other agent's, taking the example of A2C where the discounted rewards are used on the extrinsic reward.