Open macterra opened 6 years ago
The paper characterizes the distinction between friends and adversaries in terms of a "reactive environment", which
can be thought of as possessing privileged information about the agent’s private strategy, acquired e.g. through past experience or through “spying”.
Or more formally,
Alternatively, another way of detecting whether the environment is reactive is by estimating the mutual information I(π; z|x) between the agent’s strategy parameter π and the environment’s action z given the player’s action. This is because, for a non-reactive environment, the agent’s action x forms a Markov blanket for the environment’s response z and hence I(π; z|x) = 0; whereas if I(π; z|x) > 0, then it must be that the environment can “spy” on the agent’s private policy.
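As a rough illustration of the test described above, here is a minimal sketch of a plug-in estimator for I(π; z|x) over discrete samples. It is not taken from the paper: the function name `conditional_mutual_information`, the representation of π as a discrete label, and the toy sample sets are all assumptions made for this example. The plug-in estimate is also biased upward on finite noisy data, so in practice a small positive value would need to be compared against a threshold or a permutation baseline rather than against exactly zero.

```python
from collections import Counter
from math import log

def conditional_mutual_information(samples):
    """Plug-in estimate of I(pi; z | x) in nats.

    `samples` is a list of (pi, x, z) triples with hashable, discrete values
    (hypothetical encoding: pi is a label for the agent's strategy).
    """
    n = len(samples)
    c_pxz = Counter(samples)
    c_px = Counter((p, x) for p, x, _ in samples)
    c_xz = Counter((x, z) for _, x, z in samples)
    c_x = Counter(x for _, x, _ in samples)

    cmi = 0.0
    for (p, x, z), n_pxz in c_pxz.items():
        # Term: p(pi,x,z) * log[ p(x) p(pi,x,z) / (p(pi,x) p(x,z)) ],
        # written with raw counts (the factors of n cancel inside the log).
        cmi += (n_pxz / n) * log(n_pxz * c_x[x] / (c_px[(p, x)] * c_xz[(x, z)]))
    return cmi

# Toy data (assumed): pi and x each take values in {0, 1}.
# Non-reactive environment: the response z depends only on the action x.
non_reactive = [(p, x, x) for p in (0, 1) for x in (0, 1)]
# Reactive environment: the response z copies the private strategy pi.
reactive = [(p, x, p) for p in (0, 1) for x in (0, 1)]

cmi_non_reactive = conditional_mutual_information(non_reactive)
cmi_reactive = conditional_mutual_information(reactive)
```

On these toy sets the non-reactive estimate is zero, because x screens off π from z, while the reactive estimate equals log 2 nats, the full entropy of π leaking into z.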
The characterization then becomes something along the lines of:
We could then maybe extend this concept to reactive agents.
Based on Modeling Friends and Foes