Status: Closed (closed by Sophiakkk 1 week ago)
Hi, @Sophiakkk, thanks for your interest in this work. Agent's actions have the following correspondence:
Regarding the issue: the logs were indeed written in the wrong order, and the action names for the mediator should be switched. Now that the changes to tabular-games/env/log.py have been applied, it should output the correct policy info.
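For anyone reading along, here is a rough sketch of what consistent labelling could look like. It is not the actual contents of tabular-games/env/log.py; the names `AGENT_ACTION_NAMES`, `MEDIATOR_ACTION_NAMES`, and `format_policy` are made up for illustration. It assumes the mediator's indices 0 and 1 should carry the same labels as the agents' indices 0 and 1 (defect, cooperate), which is what switching the mediator's names amounts to.

```python
# Hypothetical label tables; the real names in tabular-games/env/log.py may differ.
AGENT_ACTION_NAMES = {0: "defect", 1: "cooperate", 2: "commit"}

# Assumption: the mediator acts on behalf of committing agents, so its
# indices 0 and 1 reuse the agents' labels for the same indices.
MEDIATOR_ACTION_NAMES = {i: AGENT_ACTION_NAMES[i] for i in (0, 1)}


def format_policy(probs, names):
    """Render action probabilities as 'name: p' pairs in index order."""
    return ", ".join(f"{names[i]}: {p:.2f}" for i, p in enumerate(probs))


# Example: log a mediator policy and an agent policy with matching labels.
print("mediator:", format_policy([0.25, 0.75], MEDIATOR_ACTION_NAMES))
print("agent:   ", format_policy([0.1, 0.2, 0.7], AGENT_ACTION_NAMES))
```

Deriving the mediator's table from the agents' table keeps the two from drifting apart again if a label is ever renamed.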
Hi Dmitry, I hope this message finds you well!
I've been reading your paper, Mediated Multi-Agent Reinforcement Learning. The idea is fascinating, and I'm currently attempting to reproduce the results using the code provided on GitHub. However, I've encountered a discrepancy between the logs for the mediator and the agents. Specifically, the action mapping seems inconsistent: for the environmental agents, action 0 is "defect," 1 is "cooperate," and 2 is "commit." Meanwhile, for the mediator, action 0 is "cooperate," and 1 is "defect," which appears to be opposite to the agents.
In the controller.py file, I noticed that the mediator's moves are set to correspond to the environmental agents' moves when they choose to commit, as indicated by actions_to_env[i] = actions_mediator[i].
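To make my reading of that line concrete, here is a minimal sketch of how I understand the override. This is not the code from controller.py: the function `resolve_env_actions` and the constant `COMMIT` are names I made up, and I am assuming that agents who do not commit simply keep their own action.

```python
COMMIT = 2  # agent action index for "commit", per the mapping described above


def resolve_env_actions(actions_agents, actions_mediator):
    """Return the actions that are passed to the environment.

    Agents that chose COMMIT take the mediator's action for their slot;
    all other agents keep their own action (an assumption on my part).
    """
    actions_to_env = list(actions_agents)
    for i, action in enumerate(actions_agents):
        if action == COMMIT:
            actions_to_env[i] = actions_mediator[i]
    return actions_to_env


# Agent 0 commits, agent 1 plays action 0 directly; the mediator picks 1 for both.
print(resolve_env_actions([2, 0], [1, 1]))
```

If this is right, the mediator's index 1 ends up in the environment as the agents' index 1, so both should carry the same label in the logs.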
Could there be an issue with the logging, or is there a misunderstanding on my part? I would appreciate your clarification on this matter.
Thank you very much!