TonghanWang / ROMA

Code accompanying the paper "ROMA: Multi-Agent Reinforcement Learning with Emergent Roles" (ICML 2020, https://arxiv.org/abs/2003.08039)
Apache License 2.0

Where does the number 13.9 in Dis Loss come from? #7

Closed by AdagioZbc 4 years ago

AdagioZbc commented 4 years ago

We are curious about the number 13.9 in the computation of the Dis Loss. Could you please tell us what it means?


In latent_ce_dis_rnn_agent.py:

mi = th.clamp(gaussian_embed.log_prob(latent_move.view(self.bs * self.n_agents, -1)) + 13.9, min=-13.9).sum(dim=1, keepdim=True) / self.latent_dim

TonghanWang commented 4 years ago

Hi,

-13.9 ≈ log(1e-6) (natural log; more precisely, log(1e-6) ≈ -13.82)

Since the output of the log_prob function can be arbitrarily small (that is, arbitrarily large in the negative direction), we add this number to guarantee numerical stability.
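
For readers who want to check the arithmetic, here is a minimal sketch of the shift-and-clamp step in plain Python rather than PyTorch. The helper names `normal_log_prob` and `shift_and_clamp` are hypothetical and do not come from the repository. The sketch assumes a univariate Gaussian per latent dimension, and `shift_and_clamp` is meant as a scalar analogue of `th.clamp(x + 13.9, min=-13.9)` from the line quoted above.

```python
import math

OFFSET = 13.9  # constant taken from the snippet quoted in this issue


def normal_log_prob(x, mean, std):
    """Log-density of a univariate Gaussian (hypothetical helper)."""
    var = std * std
    return -((x - mean) ** 2) / (2 * var) - math.log(std) - 0.5 * math.log(2 * math.pi)


def shift_and_clamp(log_prob, offset=OFFSET):
    """Scalar analogue of th.clamp(log_prob + offset, min=-offset)."""
    return max(log_prob + offset, -offset)


# Natural log of 1e-6, which the constant 13.9 approximates in magnitude.
print(math.log(1e-6))

# A sample near the mean has a moderate log-probability; the shift makes it positive.
near = normal_log_prob(0.1, 0.0, 1.0)
# A sample far from the mean has a hugely negative log-probability; the clamp bounds it.
far = normal_log_prob(50.0, 0.0, 1.0)

print(near, shift_and_clamp(near))
print(far, shift_and_clamp(far))
```

Because the clamp is applied after the shift, the shifted value is bounded below by -13.9, so a single very unlikely sample cannot dominate the sum over latent dimensions.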

AdagioZbc commented 4 years ago

Thanks a lot! Your reply has solved our issue.

TonghanWang commented 4 years ago

Happy to see it solved. If you have any other questions, please feel free to contact us.