Open hand-burger opened 3 years ago
I've been able to consistently get second and third, but never once have I gotten first.
I wouldn't worry about it too much. I doubt that all that many final strategies will be based off of it, and there's not really a strategy you can take against it aside from deciding whether you think it'll be better to attempt to mostly cooperate or investigatively defect and risk having to defect for the rest of the round. That's a tradeoff and you'll have to guess whether it's worthwhile.
If you want grimtrigger's points, you have to forgo taking advantage of alwaysCooperate. That might be worthwhile, if you think there are going to be more grimtriggers than alwaysCooperates in the final pool.
Just realized one of my older tries was affecting the results, which really shows how the actual run will be very different.
As the others have mentioned, it's worth not taking advantage of alwaysCooperate to avoid getting hurt by grimtriggers, assuming there are a lot more grimtriggers than alwaysCooperates. However, I wonder if waiting until later rounds (after round 200, let's say) to do testing could be effective. I'm guessing most grimtriggers will be forgiving enough to work with detectives but harsh enough to collect from randoms. Delaying the detective work until the end could help reap from cooperates and minimize damage from forgiving grimtriggers/tit-for-tats.
Probably hard to optimize without seeing what everyone else has submitted, but might be worth trying after seeing all the results.
What do you mean, a counter to grimTrigger?
The existence of grimTrigger (aka the Friedman or Grudge) really defines the meta. You cannot tell it apart from titForTat, alwaysCooperate, ftft, and any other "nice" strategy that never defects first. That is, unless you defect first.
If you defect unprovoked on turn N with M turns remaining, the best you can get from grimTrigger is M points in the next M turns (from D/D), so your profit (vs. just cooperating for the rest of that match) is 2-2M. If you defect unprovoked on the same turn against alwaysCooperate, you profit 2+2M, so between the two of alwaysCooperate and grimTrigger your unprovoked defection actually nets you 4 points. You might lose 3 or more of those points, though, if you (like the detective) have any cooperation after your first defection: C/D vs. the grimTrigger will cost you 1 vs. a D/D, and C/C vs. alwaysCooperate will cost you 2 vs. a D/C. So if your strategy is clever enough to attempt to resume cooperation at some point, and it will have to be if it wants to avoid the D/D loops with the tit-for-tats, you come out just 1 point ahead from the unprovoked defection against both alwaysCooperate and grimTrigger.
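To make the 2-2M and 2+2M arithmetic concrete, here is a minimal sketch. It assumes the payoffs used throughout this thread (C/C = 3, D/C = 5, C/D = 0, D/D = 1, written as my move/their move); the function names are made up for illustration.

```python
# Per-turn payoff to me, keyed by (my move, their move), as quoted in this thread.
PAYOFF = {("C", "C"): 3, ("D", "C"): 5, ("C", "D"): 0, ("D", "D"): 1}

def profit_vs_grim(m):
    """Best-case gain from one unprovoked defection vs. grimTrigger with m turns left."""
    defect_line = PAYOFF[("D", "C")] + m * PAYOFF[("D", "D")]   # one D/C, then D/D forever
    cooperate_line = (m + 1) * PAYOFF[("C", "C")]               # what C/C would have paid
    return defect_line - cooperate_line                         # simplifies to 2 - 2m

def profit_vs_always_cooperate(m):
    """Gain from defecting on this turn and every later turn vs. alwaysCooperate."""
    defect_line = (m + 1) * PAYOFF[("D", "C")]
    cooperate_line = (m + 1) * PAYOFF[("C", "C")]
    return defect_line - cooperate_line                         # simplifies to 2 + 2m

for m in (1, 10, 100):
    grim, coop = profit_vs_grim(m), profit_vs_always_cooperate(m)
    print(f"M={m}: grimTrigger {grim:+d}, alwaysCooperate {coop:+d}, combined {grim + coop:+d}")
```

The combined figure stays at +4 no matter how large M is, and that is before subtracting the cost of any attempt to resume cooperation.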
At the same time, against most tit-for-tat variants, you may lose out from the unprovoked defection. A defection now gets returned with a defection later, so at best you come out even (D/C nets you +5, D/D gets you +1), but more likely you will either have a surprise defection for which you cooperate (C/D, netting you 0) or a series of D/D's (costing you up to 2M points vs. just enjoying the C/C strategy). So the +1 effective point you got from alwaysCooperate & grimTrigger probably goes away when you attempt the same strategy with a tit-for-tat opponent, and as you have more tit-for-tat variants in the meta your unprovoked defection will at minimum cost you 1 per tit-for-tat and you get into the red. And that's assuming the best case where your D/C gets a C/D in response and you return to cordial C/C's afterwards; if you wind up having any D/D's, each one costs you 2 points relative to a C/C. A single tit-for-tat will probably be enough to counter your benefit from an unprovoked defection.
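A quick way to check these numbers is to score a few fixed move sequences against a plain tit-for-tat. This is only a sketch: it assumes a strict tit-for-tat that opens with C and then copies my previous move, plus the payoffs quoted in this thread. `score_vs_tit_for_tat` is a hypothetical helper, not something from the tournament code.

```python
# Per-turn payoff to me, keyed by (my move, their move), as quoted in this thread.
PAYOFF = {("C", "C"): 3, ("D", "C"): 5, ("C", "D"): 0, ("D", "D"): 1}

def score_vs_tit_for_tat(my_moves):
    """Total score for a scripted sequence of my moves against strict tit-for-tat."""
    total = 0
    their_move = "C"                 # tit-for-tat opens by cooperating
    for mine in my_moves:
        total += PAYOFF[(mine, their_move)]
        their_move = mine            # next turn it plays whatever I just played
    return total

print(score_vs_tit_for_tat("CCCC"))  # 12: never defect
print(score_vs_tit_for_tat("DCCC"))  # 11: defect once, absorb the C/D, back to C/C
print(score_vs_tit_for_tat("DDCC"))  # 9: one D/D before making peace
print(score_vs_tit_for_tat("DDDD"))  # 8: never back down
```

Under these assumptions, every scripted line that starts with a defection ends up behind plain cooperation, and each extra D/D widens the gap.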
All in all, here's my hot take: if your strategy has an unprovoked defection, it will not win. joss and even detective variants risk giving up a fair number of points by defecting first. grimTrigger and alwaysCooperate may be a wash, but you wind up in a tricky position: you can either never back down from the defection pattern (costing you a bunch of points because D/D is worth 2 less than C/C) or you have to risk the 1 point hit from C/D (vs. D/D). D/C followed by C/D is +5, while C/C followed by C/C is +6. In a meta where your past defections are held against you in any way and your opponents aren't absurdly forgiving (letting you squeeze them for points), you overall lose points by being an unprovoked defector. grimTrigger just cancels out the freebie points from alwaysCooperate, but even outside that, falling into a defection loop with the tit-for-tat variants gets costly fast (especially since tit-for-tat variants are probably going to be well-represented in the submission pool).
From all that, imo this problem reduces to two things once you take the "nice guy" pill and stop messing with unprovoked defections:
Determining how to react to your opponent's unprovoked defection in a way that costs you the least points. (Tit for tat is good at this, it takes the free D/C on the turn your opponent goes back to cooperating and then keeps up cooperation, losing only 1 point to the unprovoked defection in this happy path). I call this a response rule.
Breaking out of D/D loops (or avoiding them entirely). Unless you're an alwaysCooperate pinata, you are probably going to be defecting at some point if your opponent defects. This can get very costly if you are D/Ding with opponents with whom you can C/C instead. I call this a forgiveness strategy.
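As an illustration of how a response rule and a forgiveness strategy can fit together, here is a minimal sketch. It is not anyone's actual submission: the function signature, the `patience` parameter, and the "offer peace after a run of D/D" rule are all assumptions chosen to keep the example small.

```python
def forgiving_tit_for_tat(my_history, their_history, patience=2):
    """Hypothetical nice strategy: tit-for-tat plus a simple way out of D/D loops.

    my_history / their_history are lists of "C"/"D" moves so far, oldest first.
    """
    if not their_history:
        return "C"                                  # never defect first
    # Forgiveness strategy: after `patience` mutual defections in a row, offer peace.
    recent = list(zip(my_history, their_history))[-patience:]
    if len(recent) == patience and all(pair == ("D", "D") for pair in recent):
        return "C"
    # Response rule: otherwise mirror the opponent's last move.
    return their_history[-1]

print(forgiving_tit_for_tat([], []))                                  # C
print(forgiving_tit_for_tat(["C"], ["D"]))                            # D
print(forgiving_tit_for_tat(["C", "D", "D"], ["D", "D", "D"]))        # C
```

The tradeoff is visible right away: against an opponent who never cooperates, this rule hands over a C/D every third turn, so `patience` decides how much you pay to find out whether a D/D loop can be broken.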
You could try doing something fancy that opportunistically defects during the late turns, but remember that the probability of the game ending on that turn is the same for every turn once you hit 200. You don't know what the late turn is. You could be opportunistically mean if the game has gone on longer than usual, but you don't know whether you'll be doing this with a grimTrigger, alwaysCooperate, or just another "nice" strategy. The expected value will remain net-negative.
My best-performing strategies actually lose most of their head-to-head matchups. joss farms them for points. But they have high-scoring losses, and that works out well for them. It's worth keeping in mind that this Iterated Prisoner's Dilemma problem isn't zero-sum. Indeed, as is often the case in life, it's worth giving others second chances even if they disappoint you from time to time, because the benefits from when they hold up their end of the bargain can more than compensate for the letdowns.
In general, one way of thinking about it is to compare two scores: what you'd get if you knew your opponent from the start and made the best possible decisions, and what your proposed strategy actually gets. In this case the proposed strategy is any one that involves defecting first.
For grimTrigger, if you knew you were against grimTrigger you'd always cooperate for +3 per turn. If you defect at all, though, then you're better off always defecting and that maximum becomes +1 per turn. So you lose 2 points per turn against grimTrigger for any strategy that defects first.
Against alwaysCooperate you'd ideally always defect for +5 per turn. If you never defect against them, that becomes +3. So you lose 2 points per turn compared to the ideal against alwaysCooperate for never defecting.
Against TFT it actually doesn't matter that much. One defection will have a very slight effect on your score and the best thing to do is just cooperate for slightly less than +3.
Against FTFT, though, it might be best if you defect first. TFT strategies played against each other will get stuck in a loop after one defection where they take turns defecting every other round, netting +2.5 per turn for each player instead of the ideal +3. To stop this, FTFT forgives some defections so the two can go back to always cooperating. You can abuse this in some circumstances, though: taking advantage of FTFT's forgiveness makes it possible to do better than +3 per turn if it forgives too often.
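The +2.5 figure for the tit-for-tat loop can be reproduced with a tiny simulation. This is a sketch under stated assumptions: both players are strict tit-for-tat, player A is forced to defect once on the first turn, the match length is fixed at 100 turns (the real tournament length is not fixed), and the payoffs are the ones quoted in this thread.

```python
# Per-turn payoff to the first player, keyed by (own move, opponent's move).
PAYOFF = {("C", "C"): 3, ("D", "C"): 5, ("C", "D"): 0, ("D", "D"): 1}

def tit_for_tat(their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return their_history[-1] if their_history else "C"

def echo_match(turns):
    """Two tit-for-tats, except A makes a single unprovoked defection on turn 0."""
    a_hist, b_hist = [], []
    a_score = b_score = 0
    for t in range(turns):
        a = "D" if t == 0 else tit_for_tat(b_hist)
        b = tit_for_tat(a_hist)
        a_score += PAYOFF[(a, b)]
        b_score += PAYOFF[(b, a)]
        a_hist.append(a)
        b_hist.append(b)
    return a_score, b_score, "".join(a_hist), "".join(b_hist)

a_score, b_score, a_moves, b_moves = echo_match(100)
print(a_moves[:6], b_moves[:6])        # DCDCDC CDCDCD
print(a_score / 100, b_score / 100)    # 2.5 2.5
```

The single defection never dies out: it bounces back and forth as alternating D/C and C/D, which is exactly the loop a forgiving variant is trying to break.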
Most of the other exploitable strategies in the pool of 9, like random, will defect first, meaning you don't have to risk defecting yourself. One special case is detective: it is exploitable for the first 4 turns if you defect first, and after that it's not.
You have to think about the pool of players you'll be against and consider each strategy on its own, while remembering that you're not necessarily looking to beat your opponent; you're looking to do better than the average against all opponents. Always defect will score equal to or higher than every opponent it faces, but it will lose because it does worse than the average. grimTrigger will win individual rounds against strategies that defect first, but that doesn't necessarily mean it will score highly overall. It all comes down to what strategies you think other players will be using.
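Here is a small round-robin sketch of that point. Everything about the setup is an assumption for illustration: a three-strategy pool (always defect, tit-for-tat, grimTrigger), fixed 100-turn matches, no self-play, and the payoffs quoted in this thread. The function names are hypothetical and this is not the tournament's actual runner.

```python
# Per-turn payoff to the first player, keyed by (own move, opponent's move).
PAYOFF = {("C", "C"): 3, ("D", "C"): 5, ("C", "D"): 0, ("D", "D"): 1}

def always_defect(mine, theirs):
    return "D"

def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"

def grim_trigger(mine, theirs):
    return "D" if "D" in theirs else "C"   # one defection and it never forgives

def match(a, b, turns=100):
    """Play a against b for a fixed number of turns and return both totals."""
    a_hist, b_hist, a_score, b_score = [], [], 0, 0
    for _ in range(turns):
        a_move, b_move = a(a_hist, b_hist), b(b_hist, a_hist)
        a_score += PAYOFF[(a_move, b_move)]
        b_score += PAYOFF[(b_move, a_move)]
        a_hist.append(a_move)
        b_hist.append(b_move)
    return a_score, b_score

pool = {"alwaysDefect": always_defect, "titForTat": tit_for_tat, "grimTrigger": grim_trigger}
names = list(pool)
totals = {name: 0 for name in names}
for i, x in enumerate(names):
    for y in names[i + 1:]:
        x_score, y_score = match(pool[x], pool[y])
        totals[x] += x_score
        totals[y] += y_score

print(match(always_defect, tit_for_tat))   # (104, 99): wins the head-to-head
print(totals)                              # alwaysDefect 208, the other two 399 each
```

Always defect wins both of its matches 104 to 99 and still finishes last on total points. The outcome depends heavily on the pool, though: adding enough unconditional cooperators to this toy setup would put always defect back on top.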
whoa it's really cool how much strategy discussion there is here! I suppose I shouldn't reveal too much about my own intuitions, but I think it's nearly impossible to satisfy GrimTrigger. You'd have to always cooperate, which gives you very little room to toy with your opponent. However, you'd end up in a game filled with mostly Defect-Defects, which sucks for both of you, so I dunno.
I do have one piece of perspective that might help. There are currently about 400 submitted strategies (I haven't checked any of them yet, so maybe some are bots), and there probably aren't that many people who re-submitted GrimTrigger. I guess we'd have to ask ourselves, how many participants will submit forgiving-vs-unforgiving strategies? Which is hard to figure out!
@carykh Yeah, that's what I was thinking: there is no good counter to Grim Trigger, but that only matters in the practice experiments. So it'll be very interesting to see the actual run. I doubt anyone will be able to guess the outcome. Can't wait!
If we assume most strategies will be Tit-for-tat variants (which is pretty safe to say), trying to exploit exploitable strategies (like alwaysCooperate or forgiving tft) with your algorithm isn't the best strategy, as you'd get punished more often than not. It's better to aim for 100% cooperate-cooperate games as your baseline.
The real kicker, and what I think will determine the winner, is how strategies deal with more erratic opponents, mainly how accurately they forgive. It's beneficial to your total points to try to forgive a usually cooperative opponent that tries to exploit you, but be too forgiving and you will come out behind. Your forgiveness is what will make or break your strategy.
I’d love to see an analysis into what the threshold is in the final meta for naivety, and how that affected individual scores. Actually... @carykh, could you release the submitted strategies and results.txt once the contest is over? It could be really fun to analyze why certain strategies failed.
@aapedro with a slip-up against a tit-for-tat variant, you're looking at a constant loss of a few points (1 or 2 if your unprovoked defector is really good). At the same time, if you find an exploit against just one ftft variant in the pool, you're gaining ~200. Contrary to what I said in my earlier comment, I now think unprovoked defectors are viable if they're able to minimize the damage of unprovoked defection and if they're able to maximize the information gain from unprovoked defection to counterplay exploitable opponents against whom defection has a long-run positive net expected value. That's a big "if" for sure, but the outsize rewards of successfully counterplaying even a few exploitable opponents imo can make up for it with a sufficiently good detective, and at some point, someone is going to create one (maybe not this tournament). I have detective variants like this that do really well in my personal, tit-for-tat-dominated meta. They're probably still brittle enough that they'll get destroyed in the actual meta, but the promise is there.
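For a rough sense of where a figure like ~200 can come from, here is a sketch. It assumes ftft behaves like "tit for two tats" (defects only after seeing two defections in a row), a fixed 200-turn match, and the payoffs quoted in this thread. The alternating D, C script is just one possible exploit, and `score_against_ftft` is a made-up helper.

```python
# Per-turn payoff to me, keyed by (my move, their move), as quoted in this thread.
PAYOFF = {("C", "C"): 3, ("D", "C"): 5, ("C", "D"): 0, ("D", "D"): 1}

def ftft(opponent_moves):
    """Assumed ftft rule: retaliate only after two consecutive defections."""
    return "D" if opponent_moves[-2:] == ["D", "D"] else "C"

def score_against_ftft(my_moves):
    """Total score for a scripted sequence of my moves against the assumed ftft."""
    seen, total = [], 0
    for mine in my_moves:
        total += PAYOFF[(mine, ftft(seen))]
        seen.append(mine)
    return total

turns = 200
alternate = ["D" if t % 2 == 0 else "C" for t in range(turns)]
print(score_against_ftft(["C"] * turns))   # 600: plain cooperation
print(score_against_ftft(alternate))       # 800: never two defections in a row
print(score_against_ftft(["D"] * turns))   # 208: greed triggers retaliation
```

Alternating stays just under the retaliation threshold and gains 200 points over plain cooperation in this setup, while always defecting does far worse.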
Detection+counterplay could have an impact on performance at about the same size as forgiveness, so the truly optimal strategy will find a way to solve both.
@l4vr0v Detection+counterplay is just the other side of the forgiveness coin; a well-tempered forgiveness algorithm will also avoid being exploited. I'd go into more detail but I don't want to spill the beans too much ;)
@aapedro Yes, but there's a big difference here in that the detective-style detection+counterplay I'm discussing forces an unprovoked defection (which immediately sets you back). Otherwise, you cannot differentiate between, e.g., alwaysCooperates, grimTrigger, titForTat, and ftft. The forgiveness strategy of de-escalating your way out of D/D loops doesn't quite capture this style of intel-gathering, which has a massive initial cost that you have to then recoup by getting very good at building counter-play patterns. The latter is very non-trivial, and that's why detective-style strategies have been lagging behind simple tit-for-tat for so long.
So, you just shouldn't defect first?
Correct, well, for the sample strategies, since that almost guarantees a win for Grim Trigger or Tit for Tat.
From the title I thought you were talking about a variant of grimTrigger that I have called "Mercedes".
It's grimTrigger first, but instead of cooperating when not triggered it defects randomly.
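Just for fun, a minimal sketch of what that could look like. The description above doesn't say how often it defects "randomly", so the `defect_chance` parameter (and the function signature) are assumptions, not the actual strat.

```python
import random

def mercedes(my_history, their_history, defect_chance=0.5, rng=random):
    """Hypothetical 'Mercedes': grimTrigger when triggered, random defections otherwise."""
    if "D" in their_history:
        return "D"                                   # triggered: defect forever
    # Not triggered: defect at random instead of cooperating reliably.
    return "D" if rng.random() < defect_chance else "C"

print(mercedes(["C", "C"], ["C", "D"]))              # D (opponent has defected before)
print(mercedes([], [], defect_chance=0.0))           # C (random defection switched off)
```

With `defect_chance=0.0` this collapses back to plain grimTrigger.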
@JoelMahon and how does that perform against the sample strats?
@hand-burger No idea, I'm certain it's terrible, just a fun name for a strat that embodies Mercedes driver energy.
I realize that when all the strategies are put together in the end, grimTrigger may not be the best, but with the example strats, has anyone found a strategy which can consistently counter grimTrigger? If so, what was your strategy?