Axelrod-Python / Axelrod

A research tool for the Iterated Prisoner's Dilemma
http://axelrod.readthedocs.org/

Namespace reorganisation #1175

Open drvinceknight opened 6 years ago

drvinceknight commented 6 years ago

Let's use this issue to identify the best strategy for the namespace reorganization mentioned on #1174

marcharper commented 6 years ago

My first inclination is to have axl.GAME.* for each game supported by the library, containing strategy and scoring functions. That cleanly splits IPD and Ultimatum. However there are games like Coordination and Hawk-Dove that are in some sense just variations on IPD. So a variation could be to have axl.ipd.strategies, axl.ultimatum.strategies, etc. and retain the history methods and scoring at the axl.* level, with the understanding that each submodule has to define a canonical scoring function that can be passed to e.g. the Tournament class. We may be able to autodetect a generic scoring function based on the player types into a match or tournament.

That will let us keep Match and Tournament generic but they'll take an optional scoring function as an argument rather than a Game instance. We'd have to make each strategy class an instance of a generic abstract Player class (like we do now for the IPD), so there's some refactoring to be done but I think it's worth it.
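A minimal sketch of what "Match takes a scoring function" could look like. Every name here (`play_match`, `ipd_score`, the history-based strategy signature) is hypothetical and is not the library's current API; actions are plain strings to keep the example self-contained.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

# Hypothetical type aliases, for illustration only.
Action = Hashable
ScoreFn = Callable[[Action, Action], Tuple[float, float]]
Strategy = Callable[[Sequence[Action], Sequence[Action]], Action]


def ipd_score(a: Action, b: Action) -> Tuple[float, float]:
    """IPD payoffs with (R, P, S, T) = (3, 1, 0, 5)."""
    payoffs = {
        ("C", "C"): (3, 3),
        ("C", "D"): (0, 5),
        ("D", "C"): (5, 0),
        ("D", "D"): (1, 1),
    }
    return payoffs[(a, b)]


def play_match(
    p1: Strategy, p2: Strategy, turns: int, score: ScoreFn = ipd_score
) -> Tuple[float, float]:
    """Play `turns` rounds and return each player's total score."""
    h1: List[Action] = []
    h2: List[Action] = []
    total1 = total2 = 0.0
    for _ in range(turns):
        # Each strategy sees (own history, opponent history).
        a1, a2 = p1(h1, h2), p2(h2, h1)
        s1, s2 = score(a1, a2)
        total1 += s1
        total2 += s2
        h1.append(a1)
        h2.append(a2)
    return total1, total2


def tit_for_tat(own, opp):
    return opp[-1] if opp else "C"


def always_defect(own, opp):
    return "D"


totals = play_match(tit_for_tat, always_defect, turns=5)
```

The match loop never inspects the actions itself, so swapping the game only means passing a different `score` callable.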

This could also allow some generalization / introduction of a History class that can manage multi-player situations, or retention of history throughout a tournament, which opens up the strategy space a bit.

drvinceknight commented 6 years ago

My first inclination is to have axl.GAME.* for each game supported by the library, containing strategy and scoring functions. That cleanly splits IPD and Ultimatum. However there are games like Coordination and Hawk-Dove that are in some sense just variations on IPD. So a variation could be to have axl.ipd.strategies, axl.ultimatum.strategies, etc. and retain the history methods and scoring at the axl.* level, with the understanding that each submodule has to define a canonical scoring function that can be passed to e.g. the Tournament class. We may be able to autodetect a generic scoring function based on the player types into a match or tournament.

This sounds good to me @marcharper.

Minor point: Hawk Dove is just a variation of the scoring function right (so technically already supported)? In essence the ipd submodule is a "framework" for 2 by 2 games.
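To illustrate the "framework for 2 by 2 games" point, here is a self-contained sketch that deliberately does not use the library's Game class. The helper `two_by_two_game` and the Hawk-Dove parameter values are assumptions chosen for illustration; Dove is mapped to "C" and Hawk to "D".

```python
from typing import Dict, Tuple

Payoffs = Dict[Tuple[str, str], Tuple[float, float]]


def two_by_two_game(r: float, p: float, s: float, t: float) -> Payoffs:
    """Payoff table for a symmetric 2x2 game with actions "C" and "D"."""
    return {
        ("C", "C"): (r, r),
        ("C", "D"): (s, t),
        ("D", "C"): (t, s),
        ("D", "D"): (p, p),
    }


# The usual IPD values.
ipd = two_by_two_game(r=3, p=1, s=0, t=5)

# Hawk-Dove with resource value V and fight cost COST (illustrative values).
# Two doves share the resource, a hawk takes it all from a dove,
# and two hawks split the resource after paying the cost of fighting.
V, COST = 2, 4
hawk_dove = two_by_two_game(r=V / 2, p=(V - COST) / 2, s=0, t=V)
```

Only the four payoff values change between the two games; the action set and the lookup are identical.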

That will let us keep Match and Tournament generic but they'll take an optional scoring function as an argument rather than a Game instance. We'd have to make each strategy class an instance of a generic abstract Player class (like we do now for the IPD), so there's some refactoring to be done but I think it's worth it.

I agree in principle, in terms of details: I think it would be nice if the scoring function was a method on the Game class itself? (This could probably be a conversation in itself.)

This could also allow some generalization / introduction of a History class that can manage multi-player situations, or retention of history throughout a tournament, which opens up the strategy space a bit.

Yeah: this would be good. (I think my suggestion on #1174 of changing the match play in tournaments to use dask fully (requiring separate files for each player pair) might assist with this? The history class could essentially be a dask data frame...) (Thinking aloud ahead here, this isn't important right now.)

marcharper commented 6 years ago

Yeah, the library already supports HD and Coordination via changing the game matrix, though it's technically not the IPD any more. Also, a scoring/utility function doesn't have to be of the form f(x) = Ax or based on a game matrix, it's just the most common setup. So IMO an arbitrary function acting on history (and maybe other parameters like the population mix) is the most flexible form for scoring, rather than using a Game class that may not make contextual sense in some cases.

Using dask more seems fine to me. We'll need to decide if a player can be allowed to use all its history (including with multiple opponents / past opponents) and add a new classifier dimension in that case. We could also consider only passing a copy of a player's history to the strategy/play methods rather than a full instance of the opponent.

drvinceknight commented 6 years ago

Also, a scoring/utility function doesn't have to be of the form f(x) = Ax or based on a game matrix, it's just the most common setup. So IMO an arbitrary function acting on history (and maybe other parameters like the population mix) is the most flexible form for scoring, rather than using a Game class that may not make contextual sense in some cases.

Yup, I completely realize the scoring function is not of the form Ax: it's not technically like that in the library at the moment, right? The common mathematical definition of a game is a mapping from strategy space (which is not restricted to a continuous space) to the reals, so having the definition follow through in the library would be nice.

For the IPD the game is not just the RPST values but the RPST values and the mapping from actions to reals (which is what the Game class currently is).

I believe we're essentially both talking about the same thing, which is currently the score method in the game class (https://github.com/Axelrod-Python/Axelrod/blob/master/axelrod/game.py#L29). For the ultimatum game, for example, I'd imagine an axelrod.ultimatum.game which would contain the score.

Using dask more seems fine to me. We'll need to decide if a player can be allowed to use all its history (including with multiple opponents / past opponents) and add a new classifier dimension in that case.

:+1:

We could also consider only passing a copy of a player's history to the strategy/play methods rather than a full instance of the opponent.

You also mentioned only passing the current action and letting the player keep count. Happy to think about all these options: refactoring all the strategies is going to be a big job though...

marcharper commented 6 years ago

sounds good to me

gaffney2010 commented 5 years ago

I think we should try to make some of our abstract player classes work for multiple games. Players like HMMPlayer, FSMPlayer, SequencePlayer, LookerUp, and Gambler. But these will need to be reworked for different types of actions.

gaffney2010 commented 5 years ago

each submodule has to define a canonical scoring function that can be passed

The Action class is going to be different for each submodule too. I think each player should know what game they're playing, both in how it's scored and what its possible actions are. I guess right now, the scoring function (wrapped by game) is saved in the match, and passed into the player's match_attributes at runtime.

I understand that we'd want to try the same players with different scoring functions, but it seems funny to me that a player doesn't have a game. You could have Tit-For-Tat play Rock Paper Scissors and hope that the scoring function throws an error. Or what if there was a different game where D was one of the actions; would TFT work there? I wonder if it's a good practice to derive AxlPlayer, AxlMatch, and AxlTournament from Player, Match, and Tournament, and strongly type all of our functions, so that an AxlMatch can only be played between AxlPlayers. Do this for each type of game. Then we could throw the IPD-specific Action class into the AxlPlayer, and have each AxlMatch take a specific AxlScoringFunction.

and maybe other parameters like the population mix

We can't use that right now under 4.0, right?

gaffney2010 commented 5 years ago

Unrelated, I'm a little bit worried about how we name some of the classes. Game will end up being just a scoring function, and it may cause confusion with the word we use to distinguish IPD from the Ultimatum game.

Similarly, Action is a little bit misleading. For the Ultimatum game, for example, what we're calling an Action for the receiver of the ultimatum would actually be a range of values that she would accept. In that case it almost feels like a strategy is being returned.

marcharper commented 5 years ago

I started on something like this a while back, see this branch for the ultimatum game. An action in that case is a real number between 0 and 1 inclusive (the proposed split).

Ideally we'd have a generic set of abstract base classes that could run most of the operations without needing knowledge of the details. For example, to run a tournament or a match I'd hope the game itself doesn't matter, e.g. a bracket-style tournament class just needs to know which player won. I think a lot of functionality of the Player class would be similar (e.g. history tracking could be generic). But some of the current functionality, like tracking the number of cooperations, would be IPD specific.
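A rough sketch of that split, with game-agnostic history tracking in a base class and cooperation counting in an IPD-specific subclass. The class names (`BasePlayer`, `IpdPlayer`) and the history-only `strategy` signature are hypothetical and are not what the library defines today; actions are plain strings for brevity.

```python
from abc import ABC, abstractmethod
from typing import Generic, List, TypeVar

A = TypeVar("A")  # the action type of a particular game


class BasePlayer(ABC, Generic[A]):
    """Game-agnostic part of a player: only history tracking lives here."""

    def __init__(self) -> None:
        self.history: List[A] = []

    @abstractmethod
    def strategy(self, opponent_history: List[A]) -> A:
        """Return the next action given the opponent's past actions."""

    def play(self, opponent_history: List[A]) -> A:
        # Record every action so subclasses can inspect their own history.
        action = self.strategy(opponent_history)
        self.history.append(action)
        return action


class IpdPlayer(BasePlayer[str]):
    """IPD-specific bookkeeping, such as counting cooperations."""

    @property
    def cooperations(self) -> int:
        return self.history.count("C")


class Cooperator(IpdPlayer):
    def strategy(self, opponent_history: List[str]) -> str:
        return "C"


player = Cooperator()
for _ in range(3):
    player.play([])
```

Note that the `Generic[A]` parameter only helps static type checkers; nothing here stops a strategy from returning an action of the wrong type at runtime.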

For Rock Paper Scissors, the actions would be R, P, S (from an enum) instead of C, D (from a different enum), so we should get errors from the RPS score function if we try to play it with TFT, as you say.

gaffney2010 commented 5 years ago

For Rock Paper Scissors, the actions would be R, P, S (from an enum) instead of C, D (from a different enum), so we should get errors from the RPS score function if we try to play it with TFT, as you say.

It wouldn't though, right? Look at the code for TFT:

    def strategy(self, opponent: Player) -> Action:
        """This is the actual strategy"""
        # First move
        if not self.history:
            return C
        # React to the opponent's last move
        if opponent.history[-1] == D:
            return D
        return C

If the opponent were an instance of Player designed for RPS, called AlwaysPlayRock, then opponent.history[-1] would be R. The condition in the if statement would be false, and TFT would return C. Where does the error get thrown?

I think if instead of "def strategy(self, opponent: Player) -> Action" we had "def strategy(self, opponent: IpdPlayer) -> IpdAction", it'd be cleaner.
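The silent behaviour described above can be reproduced in a self-contained sketch. The two enums and the list-based `tit_for_tat` function are stand-ins written for this example (the real strategy takes an opponent Player, as in the quoted code), and the rock-only opponent is simulated by appending R to a list.

```python
from enum import Enum


class IpdAction(Enum):
    C = "C"
    D = "D"


class RpsAction(Enum):
    R = "R"
    P = "P"
    S = "S"


def tit_for_tat(own_history, opponent_history):
    """Same logic as the quoted TFT strategy, on plain history lists."""
    # First move
    if not own_history:
        return IpdAction.C
    # React to the opponent's last move
    if opponent_history[-1] == IpdAction.D:
        return IpdAction.D
    return IpdAction.C


# Simulate three rounds against an opponent that always plays rock.
tft_history, rock_history = [], []
for _ in range(3):
    move = tit_for_tat(tft_history, rock_history)
    tft_history.append(move)
    rock_history.append(RpsAction.R)
```

No exception is raised at any point: the comparison against `IpdAction.D` is simply false, so TFT cooperates on every round.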

marcharper commented 5 years ago

I would hope that one can't compare values of different enums, but if one can, then we'll need to enforce somewhere that the actions returned by a strategy are valid actions for that strategy. I agree that the type signatures should follow the same rules, but they aren't enforced at runtime.
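For what it's worth, members of different standard-library Enum classes do compare without error: the result is just False. One possible way to enforce valid actions at runtime is an explicit isinstance check wherever a strategy's return value is consumed. The helper name `validated` and both enums below are hypothetical, written only for this sketch.

```python
from enum import Enum


class IpdAction(Enum):
    C = "C"
    D = "D"


class RpsAction(Enum):
    R = "R"
    P = "P"
    S = "S"


# Comparing members of different enums is allowed and evaluates to False.
cross_enum_equal = IpdAction.D == RpsAction.R


def validated(action, action_type):
    """Return `action` unchanged, or raise if it is not a member of `action_type`."""
    if not isinstance(action, action_type):
        raise TypeError(f"{action!r} is not a valid {action_type.__name__}")
    return action


# A match loop could wrap each strategy call, for example:
ok = validated(IpdAction.C, IpdAction)
```

With a check like this, a TFT player dropped into an RPS match would fail on its first move, because `IpdAction.C` is not an `RpsAction`.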