Desired New strategies - Githubissues

marcharper commented 9 years ago

Some of these may be implemented under other names already, please ask if you are unsure! Feel free to add any new ones to the list. Note that we are happy to have original contributions as well!

Binary decision strategies defined in "Varying Decision Inputs in Prisoner’s Dilemma", Barlow and Ashlock 2015
Function stack based strategies from Ashlock, Daniel. "Training function stacks to play the iterated prisoner's dilemma." Computational Intelligence and Games, 2006 IEEE Symposium on. IEEE, 2006.
Pavlovian, Identifier strategies, Grudgian from n-Move Memory Evolutionarily Stable Strategies for the Iterated Prisoner’s Dilemma

The "invincible strategies" in this paper which can all be implemented as special cases of the MemoryOne or LRPlayer classes.

The two "most abundant" memory one and memory two strategies in this paper.

Adaptor from Simple Adaptive Strategy Wins the Prisoner’s Dilemma

Specific strategies evolved in Evolutionary game theory using agent-based methods such as GCA.

Strategy MO and Strategy SO from this paper

Strategies implemented in PRISON (look in classics.str):

soft_spiteful
~~slow_tft~~
~~better_and_better~~
~~worse_and_worse2~~, ~~worse_and_worse3~~

and see this paper

spiteful_cc
~~winner12~~ ~~winner 21~~
~~mem2~~
~~gradual_killer [Already done on another name?]~~
soft_tf2t [TF2T?]
and many others such as the 12 ZD strategies
Done: ~~c_then_per_dc~~, ~~doubler~~, ~~easy_go~~, ~~gradual~~, ~~per_ddc~~, ~~per_cccdcd~~, ~~prober4~~, ~~tft_spiteful~~, ~~worse_and_worse~~

From CoopSim:

~~ContriteTFT~~
TwoTitsForTwoTats -- and the generalization to NTitsForMTats
Others that you find interesting

Many strategies in this paper are not yet in the library:

From "Exploiting Evolutionary Modeling to Prevail in Iterated Prisoner’s Dilemma Tournaments":

Laran
Turan
Tages

From this page (see also the bibliography) for the 20th anniversary tournament:

~~Soft Grudger~~
~~Adaptive Tit For Tat~~
PavlovD: http://www.cs.nott.ac.uk/~pszjl/index_files/chapter4.pdf
StarSN, StarS, StarN, mem1, PoorD, ltft, MooD, and others see here and here
AITFT, GSTFT, Adept, emperor, PRobbary, HCO, etc. [See here, will require some sleuthing: http://www.cs.nott.ac.uk/~pszjl/index_files/IPDbook_chap01.pdf]
~~Adaptive~~
~~APavlov~~ : http://www.graham-kendall.com/papers/lhk2011.pdf
~~NEG: "NEG plays according to simple rules: if opponent plays COOPERATION, in next move NEG will play DEFECTION; if opponent plays DEFECTION NEG will play COOPERATION. First move will be random."~~
~~Omega Tit For Tat~~ (see also here)

From here:

Free Rider
Rover

From this paper and also here:

~~adaptive tft~~
~~contrite tft~~
~~handshake~~ ~~fortress3~~ ~~fortress4~~ ~~firm but fair~~ ~~gradual~~ ~~naive prober~~ ~~remorseful prober~~ ~~reverse pavlov~~ ~~soft grudger~~

Any of the interesting finite state machine strategies in the papers with fortress (and other papers authored by Wendy Ashlock and Daniel Ashlock, and collaborators)

E.g. from the 2015 paper "Multiple Opponent Optimization of Prisoner’s Dilemma Playing Agents" including the unnamed sugar strategies and treasure hunt strategies in figures 2 and 3
~~Solution B1~~ and ~~Solution B5~~ Also from "Fingerprint Analysis of the Noisy Prisoner's Dilemma Using a Finite-State Representation"
vengeful, PSY, PSY-TFT, TFT-PSY, UD, UC

Many from this paper. Note that several are already in the library, including ~~ALLC, ALLD, TFT, WSLS, willing, hopeless, and desperate~~ (and possibly others).

From these two papers:

From this page:

forgiving
nasty TFT (randomly plays DD)

From the mythical tournament preliminary to Axelrod #1:

Analogy
Look Up / Look Ahead (different from LookerUp in the library)

From this publication:

~~Gradual~~
~~Adaptive tit-for-tat~~

From this paper:

Lenient Grim 3
Exp. TFT
False Cooperator
TF3T
Exp Grim 2
Lenient Grim 2
Exp TF3T
T2

From this paper:

~~shortmem~~
~~selfsteem~~
Boxer
~~VeryBad~~
ANN Agents
GADP1
GADP2
BM
MC
~~Stalker~~

From this library (if the license is compatible):

cautious
copycat
craby
forgetful
golden
Hardy
Mean
Mensa
Moron
Observant
Unforgiving
Waffely
killer

Others:

Opponent Modeller see also
~~DBS, DesiredBeliefStrategy ref~~
From the 20th anniversary tournament book | slides with some info Book
MaRS: Mimicry and Relative Similarity

No-tricks Strategies described here

Theory of mind strategies discussed here.

Would be neat to have strategies based on:

cellular automata / ~~finite state machines~~ e.g.
bandit algorithms
the memory-based strategies described here
Markov chain Monte Carlo
Neural networks See this paper for examples
"Particle Swarm Optimization Approaches to Coevolve Strategies for the Iterated Prisoner’s Dilemma"
Tree based strategies from "Crossover and Evolutionary Stability in the Prisoner’s Dilemma"

Translate Fortran strategies available in https://github.com/Axelrod-Python/axelrod-fortan to Python.

souravsingh commented 7 years ago

@drvinceknight I am looking to implement the Eugine_Nier Avenger strategy from http://lesswrong.com/lw/7f2/prisoners_dilemma_tournament_results/

Do we have to create a separate script or put it in TitforTat?

drvinceknight commented 7 years ago

> F (-, Eugine_Nier): Standard Tit-for-Tat with the following modifications: 1) Always defect on the last move. 2) Once the other player defects 5 times, switch to all defect.

I think that sits nicely in the titfortat file :+1:
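The quoted rules can be sketched in plain Python, independent of the library's Player API. The function name and the `match_length` and `defection_limit` parameters are hypothetical, and the sketch assumes the match length is known in advance, which "always defect on the last move" requires.

```python
C, D = "C", "D"


def avenger_move(opponent_history, match_length, defection_limit=5):
    """Return the next move for a Tit-for-Tat variant following the quoted rules.

    opponent_history: list of the opponent's past moves ("C" or "D").
    match_length: total number of turns (assumed known in advance).
    """
    turn = len(opponent_history) + 1  # 1-indexed turn about to be played
    # Rule 1: always defect on the last move.
    if turn == match_length:
        return D
    # Rule 2: once the opponent has defected 5 times, defect forever.
    if opponent_history.count(D) >= defection_limit:
        return D
    # Otherwise standard Tit-for-Tat: cooperate first, then copy.
    if not opponent_history:
        return C
    return opponent_history[-1]
```

Keeping the two modifications as early returns ahead of the Tit-for-Tat fallback makes it easy to see that they take priority over copying the opponent.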

0101010001010111 commented 7 years ago

Are the FORTRAN strategies from Axelrod Tournament 2 still open?

drvinceknight commented 7 years ago

> Are the FORTRAN strategies from Axelrod Tournament 2 still open?

They are indeed. If you find a particular one you want to tackle, it is always worth double checking http://axelrod.readthedocs.io/en/latest/reference/all_strategies.html to make sure it has not been implemented already, or ask here :)

0101010001010111 commented 7 years ago

@drvinceknight Are there naming conventions for the strategies? I see that K61R is named Champion, K46R is named Eatherley, but K76R is named Tester.

Should I default to naming the strategies after their authors?
- How about the case of authors having the same last name? K47R and K48R are by Richard Hufford and George Hufford respectively.
- How about a singular strategy that has multiple authors? K60R is by Jim Graaskamp and Ken Katzen.
Or should I name them simply according to the Fortran source file (e.g., Champion would have been named K61R instead)?

drvinceknight commented 7 years ago

@0101010001010111 (awesome handle btw): I think the more descriptive the name the better but feel free to make a judgement call (and it can always be discussed on the PR).

The main reason we went for author names for the first of Axelrod's tournaments was because there were no other names to go for.

Note that we try to include all relevant names (in the case of strategies being called different things in different sources) in the docstrings: you can see examples of this here: http://axelrod.readthedocs.io/en/latest/reference/all_strategies.html#axelrod.strategies.grudger.Grudger

MariosZoulias commented 7 years ago

Hey, I have 2 questions.

1) I am working on the new strategy and trying to implement it in axelrod_first.py as you said, so I made the class "stein_and_rapoport". But when I try to create a match of stein_and_rapoport vs Alternator, it says that there is no stein_and_rapoport strategy in axelrod. So how do I insert my strategy into the axelrod files? (I followed the steps in the docs.)

2) The description of Stein and Rapoport says that every 15 turns the player does a chi-squared test. Question A) Why does it run a chi-squared test? I mean, how does it change the player's behavior (D or C)? Question B) For example, on the 15th turn we use the whole history, but on the 30th turn, say, do we use the whole history (1-30) or the last 15 moves (15-30)? I suppose that for chi-squared tests, the more data the merrier.

Thanks a lot Marios

drvinceknight commented 7 years ago

> I am working on the new strategy and trying to implement it in axelrod_first.py as you said, so I made the class "stein_and_rapoport". But when I try to create a match of stein_and_rapoport vs Alternator, it says that there is no stein_and_rapoport strategy in axelrod. So how do I insert my strategy into the axelrod files? (I followed the steps in the docs.)

It sounds like you're not quite following all the steps (but it's difficult to guess without seeing your code). From: http://axelrod.readthedocs.io/en/latest/tutorials/contributing/strategy/adding_the_new_strategy.html

> 2) The description of Stein and Rapoport says that every 15 turns the player does a chi-squared test. Question A) Why does it run a chi-squared test? I mean, how does it change the player's behavior (D or C)?

Here's what the description says: "Every 15 moves it makes use of a chi-squared test to check if the opponent is playing randomly."

So you need to do a chi squared test on the counts of cooperations and defections to see whether they are statistically consistent with random play (that is, whether or not they differ significantly from playing 50/50).

> Question B) For example, on the 15th turn we use the whole history, but on the 30th turn, say, do we use the whole history (1-30) or the last 15 moves (15-30)? I suppose that for chi-squared tests, the more data the merrier.

Use the whole history.
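A small sketch of that answer: the check fires on every 15th turn and counts over the whole history, not just the latest 15 moves. The helper name and the list-of-strings history are illustrative assumptions, not the library's API.

```python
def counts_for_check(opponent_history, period=15):
    """Return (cooperations, defections) over the WHOLE history when a
    check is due, i.e. when the number of turns played is a positive
    multiple of `period`; otherwise return None."""
    turns_played = len(opponent_history)
    if turns_played == 0 or turns_played % period != 0:
        return None
    return opponent_history.count("C"), opponent_history.count("D")
```

On turn 30 this returns the counts for all 30 moves, so the test gets more data each time it runs.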

MariosZoulias commented 7 years ago

Thank you for your answer, but I still have a question. Let's assume that the game is Stein_and_Rapoport vs Random, so the Stein_and_Rapoport player will work out that the Random one plays randomly. How does this fact (whether the opponent plays randomly or not) change the move of Stein_and_Rapoport? Do we test for it just theoretically (just to know if the opponent plays randomly) or practically (e.g. if he plays randomly --> we always defect, if not --> we play tit for tat)?

Thank you

marcharper commented 7 years ago

Hi @MariosZoulias -- the strategy isn't well described, but I assume that the chi-squared test is used to determine, at some level of confidence, whether the opponent is playing randomly, and if so, to defect against it.

MariosZoulias commented 7 years ago

Thanks a lot for the answer. I also have two more questions (actually I need your advice if possible). Working on the chi-squared test: 1) In order to understand if the opponent behaves randomly, do I have to take into account his next moves after the C and D moves of my player (stein_and_rapoport) and then check the chi-squared test? Or is it even simpler? 2) Strategy Random is random (ok). But I think strategies like Alternator, Cooperator and Defector are also random because they behave in the same way all the time. Also I believe TitForTat is not a random strategy (the player behaves differently according to the moves of the other player). Am I right in my thinking?

Thank you

drvinceknight commented 7 years ago

> In order to understand if the opponent behaves randomly, do I have to take into account his next moves after the C and D moves of my player (stein_and_rapoport) and then check the chi-squared test? Or is it even simpler?

I believe it's a straightforward chi squared test based on the two numbers: the number of cooperations and the number of defections. A chi squared test checks those counts and infers (from the total number of counts) whether or not this is a random distribution.

> Strategy Random is random (ok). But I think strategies like Alternator, Cooperator and Defector are also random because they behave in the same way all the time. Also I believe TitForTat is not a random strategy (the player behaves differently according to the moves of the other player). Am I right in my thinking?

No, Alternator, Cooperator and Defector are not random. If you were playing against Cooperator the count of cooperations after 40 turns would be 40 cooperations and 0 defections. That would be statistically different to 20 cooperations and 20 defections, as would be indicated by the chi squared test.

If you were playing the Random player, perhaps after 4 turns you would have 4 cooperations and 0 defections and (perhaps) the chi squared test would say that this is statistically different to random behaviour. However, after 40 rounds maybe the count would be 24 and 16, which the chi squared test would say is random.
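For two categories with an expected 50/50 split the statistic has a simple closed form, which makes the examples above easy to check by hand. Writing $C$ for the cooperations, $D$ for the defections and $N = C + D$ for the number of turns:

$$\chi^2 = \frac{(C - N/2)^2}{N/2} + \frac{(D - N/2)^2}{N/2} = \frac{(C - D)^2}{N}$$

So 40 and 0 gives $\chi^2 = 1600/40 = 40$, 4 and 0 gives $16/4 = 4$, and 24 and 16 gives $64/40 = 1.6$. With one degree of freedom the 5% critical value is roughly 3.84, so the first two counts are flagged as different from random and the last one is not.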

Here is the chi-squared test in scipy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chisquare.html

>>> from scipy.stats import chisquare
>>> test = chisquare([4, 0])
>>> test.pvalue
0.04550026389635857
>>> test = chisquare([26, 16])
>>> test.pvalue
0.12282264810139186
>>> test = chisquare([40, 40])
>>> test.pvalue
1.0
>>> test = chisquare([101, 99])
>>> test.pvalue
0.88753708398171505

Here is a plot of test.pvalue for chisquare([100 - n, n]), i.e. looking at all possible counts of cooperations and defections after 100 turns:

>>> import matplotlib.pyplot as plt
>>> ns = range(1, 101)
>>> ps = [chisquare([100 - n, n]).pvalue for n in ns]
>>> plt.plot(ns, ps)

[Figure: plot of the pvalue of chisquare([100 - n, n]) against n]

That's showing that the pvalue is high when the counts are near to a 50/50 split.

What a chi squared test is doing is checking whether the distribution given (the counts of defections and cooperations) differs significantly from the random distribution (a 50/50 split). This is done by comparing the pvalue to some significance level alpha. If pvalue < alpha then you would say that the distribution is significantly different to the random distribution. If pvalue >= alpha then we cannot reject the hypothesis that the opponent is playing randomly, so we treat the opponent as random. Often a value of alpha=0.05 is used in the literature but that's just an arbitrary choice, so we would need to make a choice for the strategy (I assume none can be found in the literature), and that can also be a parameter of the strategy.
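The decision rule can be sketched as below. To keep the example dependency-free it computes the two-category pvalue directly with math.erfc (valid for one degree of freedom, and it should agree with scipy.stats.chisquare on two counts). The function names and the default alpha=0.05 are illustrative assumptions rather than part of the strategy's description.

```python
from math import erfc, sqrt


def chi_squared_pvalue(cooperations, defections):
    """pvalue of a chi-squared test of the counts against a 50/50 split."""
    total = cooperations + defections
    if total == 0:
        return 1.0  # no data: nothing to distinguish from random
    statistic = (cooperations - defections) ** 2 / total
    # Survival function of the chi-squared distribution, 1 degree of freedom.
    return erfc(sqrt(statistic / 2))


def looks_random(cooperations, defections, alpha=0.05):
    """True when the test fails to reject 50/50 play at level alpha."""
    return chi_squared_pvalue(cooperations, defections) >= alpha
```

Making alpha a keyword argument matches the suggestion that the significance level could be a parameter of the strategy.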

drvinceknight commented 7 years ago

Note that axl.Player has cooperations and defections attributes that count these things already. So using the chi squared test with the library will be straightforward:

>>> from scipy.stats import chisquare
>>> import axelrod as axl
>>> axl.seed(0)
>>> players = (axl.Cooperator(), axl.Random())
>>> match = axl.Match(players, turns=200)
>>> _ = match.play()
>>> players[0].cooperations, players[0].defections
(200, 0)
>>> chisquare([players[0].cooperations, players[0].defections]).pvalue
2.0884875837625688e-45
>>> players[1].cooperations, players[1].defections
(93, 107)
>>> chisquare([players[1].cooperations, players[1].defections]).pvalue
0.32219880616257868

MariosZoulias commented 7 years ago

Thank you for the analysis. The only thing that is not clear in my mind is this: if I do

chisquare([players[1].cooperations, players[1].defections]).pvalue with players = (axl.Random(), axl.TitForTat()), the pvalue for TitForTat is going to be a big number (like 0.65), which means that we have to say the opponent (TitForTat) plays randomly. That is not the case, because it doesn't play randomly but according to the TitForTat strategy. Similarly, for Alternator (which is 50/50) scipy is going to give a high number, so again we will treat it as a random one. So in both cases we fail, because TitForTat and Alternator are not random.

What am I getting wrong here?

drvinceknight commented 7 years ago

You're not doing anything wrong, I think you're just pointing out a weakness of the strategy. From how it is described I think the only thing you can do is test the distribution of C and D as I have written. Because of the way the strategy in question plays it would in fact recognise that Tit for Tat is not random.
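A quick, library-free illustration of both points, under the assumption (taken from the example earlier in the thread, not from the strategy's description) that the testing player itself opens cooperatively and plays like Tit for Tat. Alternator's counts split exactly 50/50, so a count-based test cannot tell it from Random, whereas Tit for Tat facing a cooperative, retaliating player never defects.

```python
def statistic(history):
    """Two-category chi-squared statistic against a 50/50 split."""
    c, d = history.count("C"), history.count("D")
    return (c - d) ** 2 / (c + d)


turns = 30

# Alternator ignores its opponent: C, D, C, D, ...
alternator = ["C" if t % 2 == 0 else "D" for t in range(turns)]

# Two Tit-for-Tat style players facing each other: both open with C and
# then copy the other's previous move, so neither ever defects.
tester, tit_for_tat = [], []
for _ in range(turns):
    tester_move = tit_for_tat[-1] if tit_for_tat else "C"
    tft_move = tester[-1] if tester else "C"
    tester.append(tester_move)
    tit_for_tat.append(tft_move)

print(statistic(alternator))   # 0.0: indistinguishable from a 50/50 split
print(statistic(tit_for_tat))  # 30.0: far from a 50/50 split
```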

drvinceknight commented 7 years ago

Some simple ZD ones to implement from the literature. From @marcharper on #1041:

I have found some other concrete ZD examples in case we want to add more examples from the literature:

~~(11/13, 1/2, 7/26, 0) from Press and Dyson, ZDmischief (0.8, 0.6, 0.1, 0) and ZDextortion (0.64, 0.18, 0.28, 0) from this paper: https://arxiv.org/pdf/1308.2576.pdf~~ ~~There's a memory-two generalization in this paper on page 21~~, as well as the memory-one (15/16, 1/2, 7/16, 1/16): http://math.uchicago.edu/~may/REU2014/REUPapers/Li,Siwei.pdf

Looks like maybe one or two more in this paper PZDR (1.0, 0.35, 0.75, 0.1) (but looks like donation game matrix): https://pdfs.semanticscholar.org/824a/2123e1de5aa2e971fa9b1bf167b8ff246aa5.pdf Some in this paper, see the caption for Fig 3: http://web.evolbio.mpg.de/~hilbe/Research_files/Hilbe%20et%20al%20(GEB%202015)%20Partners%20or%20rivals.pdf
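For readers new to the notation, each four-vector lists the probability of cooperating after each possible outcome of the previous round. A minimal sketch, assuming the usual Press and Dyson ordering (CC, CD, DC, DD) with the player's own move first; the function name is hypothetical and this is not the library's MemoryOne implementation.

```python
import random
from fractions import Fraction

# (own move, opponent's move) in the assumed Press and Dyson ordering.
OUTCOMES = ("CC", "CD", "DC", "DD")


def memory_one_move(four_vector, last_round, rng=random):
    """Cooperate with the probability the vector assigns to the last outcome."""
    p_cooperate = four_vector[OUTCOMES.index(last_round)]
    return "C" if rng.random() < p_cooperate else "D"


# The first example quoted above, attributed to Press and Dyson.
extort = (Fraction(11, 13), Fraction(1, 2), Fraction(7, 26), 0)
print(memory_one_move(extort, "DD"))  # always "D": the last entry is 0
```

Deterministic strategies fit the same shape: under this ordering (1, 0, 0, 1) cooperates exactly after CC and DD.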

drvinceknight commented 7 years ago

Have edited the above list with a pointer to the fortran strategies.

souravsingh commented 7 years ago

@drvinceknight The link to Mensa strategy shows a license which says that- "Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer."

Should we be working on adding the strategies to the library, considering the license is incompatible?

meatballs commented 7 years ago

The licence applies to the source code itself - not to the idea which is captured by that code. We would be in breach of the licence if we took the code and incorporated it into our library, but we are perfectly ok to code the strategy ourselves.

JCodyA commented 3 years ago

Hello all! What a cool project, I just discovered this a few days ago. I'd love to contribute some new strategies. Has anyone implemented a Perlin style random strategy?

marcharper commented 3 years ago

Hi @JCodyA, you can check the list of references in the documentation to see if there's a matching source. If you are still unsure, please post a source and I should be able to tell if there's already a matching strategy in the library.

JCodyA commented 3 years ago

@marcharper I had a look at docs/reference/all_strategies.rst and the only strategy listed that was similar was the rand.py strategy, but not a perlin one. I'm thinking of two possible variations of a perlin strategy: a perlin cooperator and a perlin defector. One will cooperate on a semi-random basis similar to natural randomness (i.e. raindrops), and the other will defect on a semi-random basis.

marcharper commented 3 years ago

@JCodyA I don't think there's anything quite like that -- I presume you mean that a player will, say, defect with some distribution other than a Bernoulli. There are several strategies that behave like, say, TFT and then randomly defect or otherwise act randomly or noisily, but that doesn't sound like quite the same thing. See TrickyCooperator and RandomTitForTat for examples.

Axelrod-Python / Axelrod

Desired New strategies #379