carykh / PrisonersDilemmaTournament

Watch This Place's awesome video about iterated Prisoner's Dilemma for context! https://www.youtube.com/watch?v=BOvAbjfJ0x0
MIT License

Agents can have priori knowledge of expected game length #11

Open pdcook opened 3 years ago

pdcook commented 3 years ago

Emphasis on expected.

Since the game length is defined to be int(200-40*np.log(random.random())), which has an expected value of 240, agents can make use of this in undesirable (?) ways, e.g. by using a probabilistic approach to try to deceive the opponent at the end of the game with no consequences.
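For what it's worth, the 240 figure checks out numerically, since E[-ln U] = 1 for U uniform on (0, 1). A quick sketch (my own, using numpy's generator rather than the tournament's random.random):

```python
import numpy as np

# E[-ln U] = 1 for U uniform on (0, 1), so E[200 - 40*ln(U)] = 240
rng = np.random.default_rng(0)
u = rng.random(10_000_000)
lengths = (200 - 40 * np.log(u)).astype(int)

print(lengths.mean())  # close to 240 (slightly under, due to int truncation)
```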

I think this is an unintended issue since

peterHoburg commented 3 years ago

It also has a max value of 1765 due to random.random() returning a float with limited decimals.

random.random() also uses a deterministic pseudorandom number generator and a single module-level instance of it, so you can actually figure out the number of games based on the calls to the underlying generator.
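To illustrate that determinism (standard library behavior; just a sketch, not tournament code): seeding the module-level generator replays the exact same sequence, which is why observing earlier outputs can, in principle, reveal later ones:

```python
import random

random.seed(12345)          # the module keeps ONE hidden Random instance
first = [random.random() for _ in range(5)]

random.seed(12345)          # re-seeding replays the identical stream
second = [random.random() for _ in range(5)]

print(first == second)  # True: the sequence is fully determined by the seed
```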

But wouldn't that fall under cheating and get your code kicked out?

nobody5050 commented 3 years ago

Even if it's not going to get your code kicked out, it's still a pretty scummy thing to try and predict the number of games via the random function.

peterHoburg commented 3 years ago

Even if it's not going to get your code kicked out, it's still a pretty scummy thing to try and predict the number of games via the random function.

100%. It defeats the entire point.

That being said, it is really fun to try and figure out how to break things. Just don't do it.

astropingo commented 3 years ago

I think it is fair to use probability to try to optimize for the expected value. I don't think it is fair to use information about the library implementation to gain magical knowledge (that is, to know the output after getting the random seed somehow) and use that as an advantage.

seba-eng commented 3 years ago

He mentioned the rule "don't cheat the system" in his video (here), so I guess we are expected not to use this information. But it would still be interesting to see the benefit of a strategy using it.

Bip901 commented 3 years ago

Another problem with this game length approach is that if random.random() returns 0 (which is possible), the program crashes as log(0) is undefined.
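One possible fix (just a sketch, not necessarily what the repo ended up doing) is to feed log() the value 1 - random.random(), which lies in (0, 1] and can never be 0:

```python
import math
import random

def game_length():
    # 1 - random.random() is in (0, 1], so math.log never sees 0
    return int(200 - 40 * math.log(1 - random.random()))

print(game_length())
```

In the edge case random.random() == 0, this now yields log(1) = 0 and a length of exactly 200 instead of crashing.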

carykh commented 3 years ago

Hey guys! Thanks for the feedback. I have a feeling that there's almost no way to prevent some sort of "end-of-game" strategizing. Aside from having games go on forever, is there any way to prevent agents from knowing the expected game length at all? (I suppose I could just change the length during the official tournament run, but not tell people which length.)

In Axelrod's 1980 experiment, he just ended the game at 200 turns, which is the easiest to strategize because you could just defect on the 200th turn, consequence-free. But the only participants were "well-known game theorists", so I'm guessing they all trusted each other not to do that.

As for gaming the output of random.random(), @mvpetri is right, that shouldn't be allowed. I think I can fix that by just changing the seed to something unknown before the final run.

TimStanglow commented 3 years ago

End-of-game strategizing is already impossible, because the hard limit of 1765 will never actually be reached. I just ran int(200-40*np.log(random.random())) 10 million times, and the maximum ever reached was 852. The only real knowledge that is relevant is that past 200 turns, the expected number of turns remaining is 40. And it will stay 40 at every turn after that, never decreasing until at least 1500+, but that will never occur; it's just too unlikely.
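That "always 40 turns remaining" property is just the memorylessness of the exponential tail; here's a quick sketch (same formula, numpy generator) conditioning on games that reach turn 300:

```python
import numpy as np

rng = np.random.default_rng(1)
lengths = (200 - 40 * np.log(rng.random(10_000_000))).astype(int)

reached = lengths[lengths >= 300]   # games that survived to turn 300
print(reached.mean() - 300)         # still ~40 turns remaining on average
```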

nobody5050 commented 3 years ago

End-of-game strategizing is already impossible, because the hard limit of 1765 will never actually be reached. I just ran int(200-40*np.log(random.random())) 10 million times, and the maximum ever reached was 852. The only real knowledge that is relevant is that past 200 turns, the expected number of turns remaining is 40. And it will stay 40 at every turn after that, never decreasing until at least 1500+, but that will never occur; it's just too unlikely.

Sure, it's statistically impossible, but it doesn't hurt to add a quiet little check to your script which will always defect on that round. It probably won't ever run, but if you got super lucky it would really affect your score.

duckboycool commented 3 years ago

End-of-game strategizing is already impossible, because the hard limit of 1765 will never actually be reached. I just ran int(200-40*np.log(random.random())) 10 million times, and the maximum ever reached was 852. The only real knowledge that is relevant is that past 200 turns, the expected number of turns remaining is 40. And it will stay 40 at every turn after that, never decreasing until at least 1500+, but that will never occur; it's just too unlikely.

Sure, it's statistically impossible, but it doesn't hurt to add a quiet little check to your script which will always defect on that round. It probably won't ever run, but if you got super lucky it would really affect your score.

I'm assuming you meant to say that it's unlikely but could still have a good impact on score, but that's really not true. I tried running it a billion times (with numba, but I think the random results are the same) and got a max of 1102. It would probably take a huge number of runs to get it to hit the max even once, and the potential +2 net points once against one opponent really won't change anything, especially since the final results will apparently be run with multiple iterations.

nobody5050 commented 3 years ago

Yeah, I understand that. What I was meaning to say is that it doesn’t hurt to add it, even if it’s astronomically unlikely, it’s still technically a point gain

TimStanglow commented 3 years ago

I don't think you understand just how astronomically unlikely it is. At that point, the energy cost to run that extra line of code is probably a trillion times greater than the expected increase in the prize money you would win.

duckboycool commented 3 years ago

At that point, the energy cost to run that extra line of code is probably a trillion times greater than the expected increase in the prize money you would win.

Just for fun, the random.random function returns multiples of 2^-53, so I think there is a 1 in 2^53 chance of returning the minimum nonzero value (which I got as giving a round length of 1669 actually, but it should be pretty close either way). I found that it takes about 70 ns to do a comparison in python, so we'll say 50 ns. If your CPU runs at 10 W, this would make the amount of energy used for the calculation by the CPU about 5 × 10^-7 J, or 1.39 × 10^-13 kWh. At a low energy cost of 5 cents per kWh, this makes your final cost 6.94 × 10^-13 cents per run, and to run it for a round of 200 turns and about 10 opponents, it will be 1.39 × 10^-9 cents for running your strategy once.
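The 1669 figure is easy to verify, assuming random.random() really does return multiples of 2^-53:

```python
import math

smallest = 2 ** -53  # smallest nonzero value random.random() can return
print(int(200 - 40 * math.log(smallest)))  # 1669
```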

Using 2^53, if we say there's 1500 other entrants, and that having the max round length only once against any entrant is enough to bring you from no prize to the first place $1000, then the odds it brings you to first should be about 1.66 × 10^-13, and your final expected gain is 1.6 × 10^-8 cents.

So according to this, it would be marginally worthwhile to do the check if you only were testing your strategy once, but the assumptions were fairly kind for the check, so it's probably much lower in actuality.
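For anyone who wants to re-run the arithmetic, here it is step by step (same assumed figures: 50 ns per check, a 10 W CPU, 5 cents/kWh, 200 turns against ~10 opponents):

```python
# assumed figures from the estimate above
joules = 50e-9 * 10                        # 50 ns at 10 W   -> 5e-7 J
kwh = joules / 3.6e6                       # 1 kWh = 3.6e6 J -> ~1.39e-13 kWh
cents_per_run = kwh * 5                    # at 5 cents/kWh  -> ~6.94e-13 cents
cents_per_game = cents_per_run * 200 * 10  # 200 turns x 10 opponents
print(cents_per_run, cents_per_game)       # ~6.94e-13, ~1.39e-9
```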

Also I think the round length change to use 1 - random will make the possible values different anyway.

This was definitely worth doing.

nobody5050 commented 3 years ago

This was definitely worth doing.

Uselessly calculated math is the best math!

colonelwatch commented 3 years ago

There's no such thing, really, as "expected game length" in this tournament. The distribution of possible game lengths is a log curve going from 200 to infinity. Finding the average value of that kind of curve is meaningless.

In fact, I ran a Monte-Carlo simulation for the odds of the game ending at a given turn. After turn 200, it goes from 0 to 2.5% and stays there. This is exactly as carykh says: a low but equal probability that the game will end.
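That plateau also has a closed form: past turn 200 the remaining length is exponential with mean 40 turns, so each turn ends the game with constant probability 1 - e^(-1/40):

```python
import math

# constant per-turn ending probability past turn 200
print(1 - math.exp(-1 / 40))  # ~0.0247, matching the ~2.5% plateau
```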

[Figure 1: Chance of Game Ending on Given Round]

The samples get too sparse after 500 rounds, give or take, but the odds of reaching 500 are only 0.056% anyway. If anyone is curious or wants to check me (please correct me if you see something), here's the code:

import numpy as np
from matplotlib import pyplot as plt

# Sample all possibilities
lengths = np.random.rand(10000000)
lengths = 200-40*np.log(lengths)
lengths = lengths.astype(int) # convert to integers, since this is a round count
lengths_temp = lengths.copy() # working copy

# Progressively calculate odds of game ending on this turn, then lop off the possibility curve
game_end_odds = []
for i in range(600):
    if(len(lengths_temp) != 0):
        game_end_odds.append(len(lengths_temp[lengths_temp == i])/len(lengths_temp))
        lengths_temp = lengths_temp[lengths_temp > i]
    else:
        game_end_odds.append(0)

# Convert odds to percent chance
game_end_odds = np.array(game_end_odds)
game_end_odds *= 100

plt.plot(game_end_odds)
plt.xlabel('Rounds')
plt.ylabel('Chance of Game Ending (%)')
plt.title('Chance of Game Ending on Given Round')
plt.show()

print("Chance of reaching 500 rounds:", 100*len(lengths[lengths >= 500])/len(lengths), "%")

colonelwatch commented 3 years ago

Sorry, I realized the first paragraph didn't logically follow from the rest of my comment. I figured out that the integral over the distribution converges to 240, so 240 is indeed the expected value. This does follow from the rest, since each additional round after 200 gets increasingly unlikely.