Hi,
I want to implement the training process for multiagent-competition. Is `move_reward_weight` the annealing factor in Equation 1 of the paper? If so, how can I make it anneal to 0 at the specified timesteps during training? Thank you!
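
To make the question concrete, here is a minimal sketch of the kind of schedule I have in mind. It assumes the annealing is linear and that the factor weights a dense reward against a sparse one. The names `linear_anneal`, `anneal_steps`, and `blended_reward` are hypothetical and are not taken from the repository or the paper.

```python
def linear_anneal(step, anneal_steps, start=1.0, end=0.0):
    """Linearly interpolate from `start` to `end` over `anneal_steps` steps.

    Assumption: a linear schedule. After `anneal_steps` the value stays at `end`.
    """
    if anneal_steps <= 0:
        return end
    # Clamp training progress to [0, 1] so the value never overshoots `end`.
    frac = min(max(step / anneal_steps, 0.0), 1.0)
    return start + frac * (end - start)


def blended_reward(dense_reward, sparse_reward, alpha):
    """Weight a dense reward against a sparse reward with factor `alpha`.

    Assumption: this is my reading of how an annealing factor would be used,
    not a confirmed copy of the paper's equation.
    """
    return alpha * dense_reward + (1.0 - alpha) * sparse_reward


if __name__ == "__main__":
    # Hypothetical schedule: anneal to 0 over 100 training iterations.
    for step in (0, 50, 100, 150):
        print(step, linear_anneal(step, anneal_steps=100))
```

In a training loop, the value returned by `linear_anneal` would be recomputed every iteration and passed to the environment. Whether `move_reward_weight` is the right place to set it is exactly what I am unsure about.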