carykh / PrisonersDilemmaTournament

Watch This Place's awesome video about iterated Prisoner's Dilemma for context! https://www.youtube.com/watch?v=BOvAbjfJ0x0

Will ALL duplicates be removed, including random? #82

Open nekiwo opened 3 years ago

nekiwo commented 3 years ago

I'm curious if ALL duplicates will be removed. How close does a duplicate have to be for it to be removed? Will countless randoms be removed as well?

NuppeNoo commented 3 years ago

I've been operating under the assumption that multiple randoms are allowed, since cary said something like "you have a non-zero chance of winning if you submit random". The way duplicates are judged could easily change the winner. Should probably have asked this myself like 5 days ago.

duckboycool commented 3 years ago

I'm pretty sure cary ended up deciding to allow duplicates.

> Q: What if people submit exactly the same strategy? A: They'll all still be entered into the roster separately.

EFHIII commented 3 years ago

That's new

redtachyon2098 commented 3 years ago

(EDIT: I only saw the first two posts when I typed this.) I really don't like the idea of removing duplicates, although I do get that it could be the only possible choice. What if you've submitted a strategy that does horribly against itself, but you don't care because duplicates are guaranteed gone? Or what if you've submitted the best strategy ever, but you didn't get anything because it got removed for being a duplicate? (Unlikely, but still possible!) I don't think Cary would shell out thousands of dollars to everyone who's submitted duplicates of the top 3 strategies. (Again, it is unlikely, but possible.)

EFHIII commented 3 years ago

Plot twist: one of the random strategies wins, so everyone who submitted random wins $1000 and Cary goes bankrupt.

redtachyon2098 commented 3 years ago

That would be so bad....

carykh commented 3 years ago

Yeah, after talking about this with friends and my teacher, I think it makes the most sense to not remove any duplicates. Basically, we'll consider each .py strategy submitted to be its own independent player.

If there's some strategy that is strategically the best, and multiple people submitted it, then one of them would have to end up on top due to the random chance of playing with random opponents. So, no matter what happens, I'll still only award the 1st-place prize money to 1 person. That would still kinda suck though because it would turn this tournament into a pseudo-lottery, but it's better than having to pay arbitrarily many people!

I haven't run the tournament code yet, but based on how diverse the submissions are so far, I doubt "random" (or any other strategy simple enough that multiple people would pick the same thing) will win. But I can't guarantee it yet until I run the code!

redtachyon2098 commented 3 years ago

Indeed. But are you sure you can run a thousand submissions within a week?

carykh commented 3 years ago

Haha, I am actually not sure! The runtime is going to be O(n^2), so with 1,600 submissions, it will actually take... 10,000 times longer than the dummy 16-strat test I have run. And that's not even including the idea of running it 100+ times and averaging the results to smooth out randomness...

I might need to buy computing time from other people, lol
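
For scale, here's the back-of-the-envelope arithmetic behind that estimate (just a sketch, not tournament code):

# an all-pairs round robin grows with n^2, so 100x the entries means 10,000x the matchups
test_size, submission_count = 16, 1600
print((submission_count / test_size) ** 2)  # 10000.0 -- before any repeated runs for averaging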

redtachyon2098 commented 3 years ago

I wish you luck. And processing power.

ThatXliner commented 3 years ago

> Haha, I am actually not sure! The runtime is going to be O(n^2), so with 1,600 submissions, it will actually take... 10,000 times longer than the dummy 16-strat test I have run. And that's not even including the idea of running it 100+ times and averaging the results to smooth out randomness...
>
> I might need to buy computing time from other people, lol

You might want to use the multi-threaded version from the enjoyer's repo

duckboycool commented 3 years ago

> Haha, I am actually not sure! The runtime is going to be O(n^2), so with 1,600 submissions, it will actually take... 10,000 times longer than the dummy 16-strat test I have run. And that's not even including the idea of running it 100+ times and averaging the results to smooth out randomness...
>
> I might need to buy computing time from other people, lol

Have you looked at #37 and the caching mentioned in #17 for possibly speeding up the software side of it as well? If you can make sure that you mark which strategies are random while going through them, you could potentially speed that up a lot. (As well as multiprocessing which you already mentioned looking into.)
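
Something along those lines could look like this (hypothetical names, just a sketch of the caching idea rather than the actual tournament code):

deterministicCache = {}

def cachedMatch(nameA, nameB, isDeterministic, runMatch):
    # runMatch(nameA, nameB) plays one full match and returns the scores (assumed helper);
    # isDeterministic marks which strategies never use randomness
    if isDeterministic[nameA] and isDeterministic[nameB]:
        key = (nameA, nameB)
        if key not in deterministicCache:
            deterministicCache[key] = runMatch(nameA, nameB)  # same result every repetition, so run it once
        return deterministicCache[key]
    return runMatch(nameA, nameB)  # at least one side is random, so rerun it each repetition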

carykh commented 3 years ago

Ooh, both of those techniques look really promising! Especially the one about caching non-random strategies. It's interesting that if two strategies are deterministic, you only need to run their matchup once, since it'll be the same every time.

Thanks for the suggestions, everyone!

EFHIII commented 3 years ago

That's not entirely true; even if they're deterministic, the length of the match isn't, so there can still be noticeable variation, especially if the match is chaotic. For example, if a match by sheer chance (unlikely, but possible) ends up being 500 steps long, and the two strategies both cooperate for the most part from turns 1-200 but get stuck always defecting after that, the result will be very different than if the match had only lasted 200 steps.

One thing that I personally do on my branch that I think helps a bit: if both sides Cooperate the entire time, then you know that'll (probably) be the case every time. That's the most important case, since a lot of strategies are "nice", so a good chunk of the games end that way.
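
A tiny sketch of that check (hypothetical names and move encoding, not my branch's actual code):

def isAllCooperation(history, cooperate=1):
    # history[player][turn] holds each move; `cooperate=1` is an assumed encoding
    return all(move == cooperate for row in history for move in row)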

I also multi-thread the runs so that I'm using all of my CPU cores. That alone makes a huge difference.
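
A minimal sketch of the parallel part using Python's multiprocessing (hypothetical names, not my fork's actual code):

from itertools import combinations
from multiprocessing import Pool

def runAllPairings(strategies, runMatch, processes=None):
    # runMatch((stratA, stratB)) plays one pairing and returns its scores (assumed helper)
    pairings = list(combinations(strategies, 2))
    with Pool(processes) as pool:               # defaults to one worker per CPU core
        results = pool.map(runMatch, pairings)
    return dict(zip(pairings, results))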

One optimization I haven't looked into but might be worth considering: if the two strategies are deterministic, you can't know from one run what the score would average out to, but you could sample what the score would be at each possible length from 200-400 rounds and take a weighted average based on how likely the game was to end at that many rounds. That makes it go from running the game potentially hundreds of times to running it once plus some math, for what's probably a statistically more accurate number.

Also, if you have a lot of duplicates, you could keep just one copy and then apply weights as if there were several of them. The main concern there would be that if one of them won, you'd no longer have a way to say which of the submitters won, but if it reduces the pool significantly, it'd probably be worth it. (I've implemented that in my fork, as well as a way to test different densities of certain strategies.)
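
The weighting part is roughly this shape (assumed data layout, not my fork's exact code):

from collections import Counter

def collapseDuplicates(submissions):
    # submissions: {filename: source_text}; counts how many identical copies of each unique source exist
    return Counter(submissions.values())

def weightedAverageScore(strategy, pairScores, counts):
    # pairScores[(a, b)] = average per-round score `a` earned against `b` (assumed shape);
    # each opponent counts as many times as it was submitted (glossing over the self-play detail)
    total = sum(pairScores[(strategy, opponent)] * n for opponent, n in counts.items())
    return total / sum(counts.values())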

EFHIII commented 3 years ago

Update: I've implemented only running deterministic stuff once. It's about 4 times faster than before on the Prisoners-Dilemma-Enjoyers fork when running each strategy 100 times.

carykh commented 3 years ago

Don't worry, EFHIII, I am aware of that! I realize the game lengths are random/nondeterministic, and the outcome can be very different if the game goes on for 200 rounds versus 1,000 rounds. I knew that, and my internal thought process was to randomly choose the ~100 game lengths, run one head-to-head match for the longest of those 100, and then truncate that history to get the results for all the shorter lengths. So don't worry, whatever I end up doing will still be functionally identical to the correct implementation!

(I just didn't word that all out because I didn't want my comment to get too wordy)
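
Something like this, roughly (hypothetical names, not the actual code I'll run):

def deterministicMatchResults(playMatch, scorePrefix, sampledLengths):
    # playMatch(length) plays one head-to-head match of that length and returns its history;
    # scorePrefix(history, length) scores just the first `length` turns (both assumed helpers)
    history = playMatch(max(sampledLengths))   # one run, at the longest sampled length
    return [scorePrefix(history, length) for length in sampledLengths]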

EFHIII commented 3 years ago

The core of my implementation is pre-computing the chance of the game ending at each possible number of turns:

turnChances = []

# turnChance(x) is the chance the game lasts exactly 200 + x turns (from turn 200 on,
# each turn has a 1/40 chance of being the last); with summing=True it returns the
# cumulative chance of the game having ended by then
def turnChance(x, summing=False):
    if x == 0:
        return 1 / 40
    if summing:
        S = turnChance(x - 1, True)
        return (1 - S) / 40 + S
    return (1 - turnChance(x - 1, True)) / 40

# DETERMINISTIC_TURNS is the fork's cap on how long a deterministic match is played out
for i in range(DETERMINISTIC_TURNS - 199):
    turnChances.append(turnChance(i))

normalizing them to sum to 1

# this is so that deterministic algorithms still get 3 points for always Coop, instead of 2.999
chancesSum = sum(turnChances)
turnChances = [i/chancesSum for i in turnChances]

and iterating through each, multiplying by the normalized chances

    # . . .
    totals = [0, 0]
    scores = [0, 0]

    # the first 199 turns always happen, so just accumulate both players' scores
    for turn in range(199):
        scores[0] += pointsArray[history[0, turn]][history[1, turn]]
        scores[1] += pointsArray[history[1, turn]][history[0, turn]]

    # from the 200th turn onward the game could end at any point, so add the
    # per-round average up to each possible ending, weighted by the chance of ending there
    for turn in range(199, DETERMINISTIC_TURNS):
        scores[0] += pointsArray[history[0, turn]][history[1, turn]]
        scores[1] += pointsArray[history[1, turn]][history[0, turn]]

        totals[0] += scores[0] / (turn + 1) * turnChances[turn - 199]
        totals[1] += scores[1] / (turn + 1) * turnChances[turn - 199]

    return totals, history
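
In effect, by the end of the second loop, totals holds the expected per-round score for the pair: the sum, over every possible game length L from 200 up to DETERMINISTIC_TURNS, of the score through L turns divided by L, weighted by the chance the game ends at exactly L turns. So a deterministic pairing only ever needs one simulated history.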