This repository contains reinforcement learning training code for the following strategy types:
Model training is by evolutionary algorithms or particle swarm algorithms. There is another repository in the Axelrod project that trains Neural Networks with gradient descent (using tensorflow) that will likely be incorporated here.
You can use this as a library.
Install a development version of the library:
$ python setup.py develop
Run the tests:
$ python -m unittest discover tests
In this repository there are scripts for each strategy type with a similar interface:
looker_evolve.py
pso_evolve.py
ann_evolve.py
fsm_evolve.py
hmm_evolve.py
See below for usage instructions.
In the original iteration the strategies were run against all the default
strategies in the Axelrod library. This is slow and probably not necessary.
By default the training strategies are the short_run_time_strategies
from the
Axelrod library. You may specify any other set of strategies for training.
Basic testing is done by running the trained model against the full set of strategies in various tournaments. Depending on the optimization function testing methods will vary.
LookerUp is based on lookup tables with three parameters:
These are generalizations of deterministic memory-N strategies.
PSOGambler is a stochastic version of LookerUp, trained with a particle swarm algorithm. The resulting strategies are generalizations of memory-N strategies.
EvolvedANN is based on a feed forward neural network with a single hidden layer. Various features are derived from the history of play. The number of nodes in the hidden layer can also be changed.
EvolvedFSM searches over finite state machines with a given number of states.
EvolvedHMM implements a simple hidden markov model based strategy, a stochastic finite state machine.
Note that large values of some parameters will make the strategies prone to overfitting.
There are three objective functions:
objective_score
objective_score_difference
objective_moran_win
Parameters for the objective functions can be specified in the command line arguments for each evolver.
$ python lookup_evolve.py -h
Lookup Table Evolver
Usage:
lookup_evolve.py [-h] [--generations GENERATIONS] [--population POPULATION]
[--mu MUTATION_RATE] [--bottleneck BOTTLENECK] [--processes PROCESSORS]
[--output OUTPUT_FILE] [--objective OBJECTIVE] [--repetitions REPETITIONS]
[--turns TURNS] [--noise NOISE] [--nmoran NMORAN]
[--plays PLAYS] [--op_plays OP_PLAYS] [--op_start_plays OP_START_PLAYS]
Options:
-h --help Show help
--generations GENERATIONS Generations to run the EA [default: 500]
--population POPULATION Starting population size [default: 10]
--mu MUTATION_RATE Mutation rate [default: 0.1]
--bottleneck BOTTLENECK Number of individuals to keep from each generation [default: 5]
--processes PROCESSES Number of processes to use [default: 1]
--output OUTPUT_FILE File to write data to [default: lookup_tables.csv]
--objective OBJECTIVE Objective function [default: score]
--repetitions REPETITIONS Repetitions in objective [default: 100]
--turns TURNS Turns in each match [default: 200]
--noise NOISE Match noise [default: 0.00]
--nmoran NMORAN Moran Population Size, if Moran objective [default: 4]
--plays PLAYS Number of recent plays in the lookup table [default: 2]
--op_plays OP_PLAYS Number of recent plays in the lookup table [default: 2]
--op_start_plays OP_START_PLAYS Number of opponent starting plays in the lookup table [default: 2]
$ python pso_evolve.py -h
Particle Swarm strategy training code.
Original version by Georgios Koutsovoulos @GDKO :
https://gist.github.com/GDKO/60c3d0fd423598f3c4e4
Based on Martin Jones @mojones original LookerUp code
Usage:
pso_evolve.py [-h] [--generations GENERATIONS] [--population POPULATION]
[--processes PROCESSORS] [--output OUTPUT_FILE] [--objective OBJECTIVE]
[--repetitions REPETITIONS] [--turns TURNS] [--noise NOISE]
[--nmoran NMORAN]
[--plays PLAYS] [--op_plays OP_PLAYS] [--op_start_plays OP_START_PLAYS]
Options:
-h --help Show help
--population POPULATION Starting population size [default: 10]
--processes PROCESSES Number of processes to use [default: 1]
--output OUTPUT_FILE File to write data to [default: pso_tables.csv]
--objective OBJECTIVE Objective function [default: score]
--repetitions REPETITIONS Repetitions in objective [default: 100]
--turns TURNS Turns in each match [default: 200]
--noise NOISE Match noise [default: 0.00]
--nmoran NMORAN Moran Population Size, if Moran objective [default: 4]
--plays PLAYS Number of recent plays in the lookup table [default: 2]
--op_plays OP_PLAYS Number of recent plays in the lookup table [default: 2]
--op_start_plays OP_START_PLAYS Number of opponent starting plays in the lookup table [default: 2]
Note that to use the multiprocessor version you'll need to install pyswarm 0.70 directly (pip installs 0.60 which lacks mutiprocessing support).
$ python ann_evolve.py -h
ANN evolver.
Trains ANN strategies with an evolutionary algorithm.
Original version by Martin Jones @mojones:
https://gist.github.com/mojones/b809ba565c93feb8d44becc7b93e37c6
Usage:
ann_evolve.py [-h] [--generations GENERATIONS] [--population POPULATION]
[--mu MUTATION_RATE] [--bottleneck BOTTLENECK] [--processes PROCESSORS]
[--output OUTPUT_FILE] [--objective OBJECTIVE] [--repetitions REPETITIONS]
[--turns TURNS] [--noise NOISE] [--nmoran NMORAN]
[--features FEATURES] [--hidden HIDDEN] [--mu_distance DISTANCE]
Options:
-h --help Show help
--generations GENERATIONS Generations to run the EA [default: 500]
--population POPULATION Starting population size [default: 10]
--mu MUTATION_RATE Mutation rate [default: 0.1]
--bottleneck BOTTLENECK Number of individuals to keep from each generation [default: 5]
--processes PROCESSES Number of processes to use [default: 1]
--output OUTPUT_FILE File to write data to [default: ann_weights.csv]
--objective OBJECTIVE Objective function [default: score]
--repetitions REPETITIONS Repetitions in objective [default: 100]
--turns TURNS Turns in each match [default: 200]
--noise NOISE Match noise [default: 0.00]
--nmoran NMORAN Moran Population Size, if Moran objective [default: 4]
--features FEATURES Number of ANN features [default: 17]
--hidden HIDDEN Number of hidden nodes [default: 10]
--mu_distance DISTANCE Delta max for weights updates [default: 5]
$ python fsm_evolve.py -h
Finite State Machine Evolver
Usage:
fsm_evolve.py [-h] [--generations GENERATIONS] [--population POPULATION]
[--mu MUTATION_RATE] [--bottleneck BOTTLENECK] [--processes PROCESSORS]
[--output OUTPUT_FILE] [--objective OBJECTIVE] [--repetitions REPETITIONS]
[--turns TURNS] [--noise NOISE] [--nmoran NMORAN]
[--states NUM_STATES]
Options:
-h --help Show help
--generations GENERATIONS Generations to run the EA [default: 500]
--population POPULATION Population size [default: 40]
--mu MUTATION_RATE Mutation rate [default: 0.1]
--bottleneck BOTTLENECK Number of individuals to keep from each generation [default: 10]
--processes PROCESSES Number of processes to use [default: 1]
--output OUTPUT_FILE File to write data to [default: fsm_tables.csv]
--objective OBJECTIVE Objective function [default: score]
--repetitions REPETITIONS Repetitions in objective [default: 100]
--turns TURNS Turns in each match [default: 200]
--noise NOISE Match noise [default: 0.00]
--nmoran NMORAN Moran Population Size, if Moran objective [default: 4]
--states NUM_STATES Number of FSM states [default: 8]
$ python hmm_evolve.py -h
Hidden Markov Model Evolver
Usage:
fsm_evolve.py [-h] [--generations GENERATIONS] [--population POPULATION]
[--mu MUTATION_RATE] [--bottleneck BOTTLENECK] [--processes PROCESSORS]
[--output OUTPUT_FILE] [--objective OBJECTIVE] [--repetitions REPETITIONS]
[--turns TURNS] [--noise NOISE] [--nmoran NMORAN]
[--states NUM_STATES]
Options:
-h --help Show help
--generations GENERATIONS Generations to run the EA [default: 500]
--population POPULATION Population size [default: 40]
--mu MUTATION_RATE Mutation rate [default: 0.1]
--bottleneck BOTTLENECK Number of individuals to keep from each generation [default: 10]
--processes PROCESSES Number of processes to use [default: 1]
--output OUTPUT_FILE File to write data to [default: fsm_tables.csv]
--objective OBJECTIVE Objective function [default: score]
--repetitions REPETITIONS Repetitions in objective [default: 100]
--turns TURNS Turns in each match [default: 200]
--noise NOISE Match noise [default: 0.00]
--nmoran NMORAN Moran Population Size, if Moran objective [default: 4]
--states NUM_STATES Number of FSM states [default: 5]