Where does the number 13.9 in Dis Loss come from?

TonghanWang / ROMA

Code accompanying the paper "ROMA: Multi-Agent Reinforcement Learning with Emergent Roles" (ICML 2020, https://arxiv.org/abs/2003.08039)

Apache License 2.0

Status: Closed (AdagioZbc closed this issue 4 years ago)

AdagioZbc commented 4 years ago

We are curious about the number 13.9 in the computation of the Dis Loss. Could you please tell us what it means?

In latent_ce_dis_rnn_agent.py:

mi = th.clamp(gaussian_embed.log_prob(latent_move.view(self.bs * self.n_agents, -1))+13.9, min=-13.9).sum(dim=1,keepdim=True) / self.latent_dim

TonghanWang commented 4 years ago

Hi,

-13.9 ≈ log(1e-6)

Since the output of the log_prob function can be arbitrarily small (that is, arbitrarily negative), we add this number to guarantee numerical stability.
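
For readers who want to check the arithmetic, here is a minimal pure-Python sketch of the shift-and-clamp pattern in the quoted line. It is not the repository's code: it assumes that log_prob returns one Gaussian log-density per latent dimension, and the helpers `normal_log_prob`, `shifted_clamped`, and `mi_term` are hypothetical names introduced only for illustration.

```python
import math

SHIFT = 13.9  # the constant from the quoted line

def normal_log_prob(x, mean, std):
    """Log-density of a univariate Gaussian at x (assumed per-dimension quantity)."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)

def shifted_clamped(log_p, shift=SHIFT):
    """Scalar analogue of th.clamp(log_p + shift, min=-shift)."""
    return max(log_p + shift, -shift)

def mi_term(log_probs, shift=SHIFT):
    """Sum the shifted, clamped values and divide by the number of dimensions."""
    return sum(shifted_clamped(lp, shift) for lp in log_probs) / len(log_probs)

# The natural log of 1e-6 is roughly -13.82, which the reply rounds to -13.9.
log_eps = math.log(1e-6)

# A sample close to the mean gives a moderate log-density; the shift moves it up.
near = normal_log_prob(0.1, 0.0, 1.0)

# A sample far from the mean gives a hugely negative log-density; the clamp bounds it.
far = normal_log_prob(50.0, 0.0, 1.0)

print(log_eps)
print(shifted_clamped(near), shifted_clamped(far))
print(mi_term([near, far]))
```

Read literally, the expression adds 13.9 and then floors the result at -13.9, so any raw log-density below -27.8 contributes the same bounded value and a single extreme sample cannot dominate the loss.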

AdagioZbc commented 4 years ago

Thanks a lot! Your reply has solved our issue.

TonghanWang commented 4 years ago

Happy to see it solved. If you have any other questions, please feel free to contact us.