Closed by chasemcd 4 years ago
The basic social influence experiment (number 1) has not been implemented in this repository. Only Experiment III: Modeling Other Agents is present, next to the baseline A3C (and PPO) model.
Right, thanks for the clarification.
As I understand it, in the initial experiments in the paper only a limited number of agents are trained with the MOA/causal influence reward. In the implementation (train_moa.py), it looks like all agents are equipped with the MOA model and receive a causal influence reward. It isn't immediately clear to me how to alter this to allow for variation in agent policies/models, since the Trainers postprocess and incorporate the causal rewards. Does anyone have any insight or suggestions on how this might be done?
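One possible direction, as a rough sketch and not something taken from the repository: RLlib's multi-agent setup lets each agent id be mapped to a policy id through a policy mapping function, so a subset of agents could in principle be routed to an MOA policy while the rest use a baseline policy. The agent ids, the policy ids, and the `add_influence_reward` helper below are all hypothetical, and the exact mapping-function signature depends on the RLlib version. Whether the Trainers' postprocessing in this repository can be gated per policy in this way is an assumption that would need checking.

```python
from typing import List, Set

# Hypothetical: which agents should receive the causal influence reward.
INFLUENCER_IDS: Set[str] = {"agent-0"}


def policy_mapping_fn(agent_id: str, *args, **kwargs) -> str:
    """Map an agent id to a policy id (policy ids here are illustrative)."""
    return "moa_policy" if agent_id in INFLUENCER_IDS else "baseline_policy"


def add_influence_reward(
    agent_id: str,
    env_rewards: List[float],
    influence_rewards: List[float],
    influence_weight: float = 1.0,
) -> List[float]:
    """Add the influence reward only for agents mapped to the MOA policy.

    Hypothetical helper: stands in for whatever per-trajectory
    postprocessing step combines the two reward streams.
    """
    if policy_mapping_fn(agent_id) != "moa_policy":
        # Baseline agents keep their environment reward unchanged.
        return list(env_rewards)
    return [
        r + influence_weight * c
        for r, c in zip(env_rewards, influence_rewards)
    ]
```

With this split, only agents listed in `INFLUENCER_IDS` would have their rewards modified; every other agent would train on the environment reward alone.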