sublee / trueskill

An implementation of the TrueSkill rating system for Python
https://trueskill.org/

Does TrueSkill support playing on both teams? #57

Open lgnashold opened 10 months ago

lgnashold commented 10 months ago

Let's say I have player A and player B. In my games, team 1 is [A, B] and team 2 is [B, B]. Player B is an algorithm, so there's no need to worry about information leaking between team 1 and team 2: all instances of player B make decisions independently, using the same decision-making process.

Does TrueSkill support this case, or is there a way to modify the algorithm to support it? What would be the correct behavior in this circumstance?

Would it be sufficient to just not update player B's rating in this case, and only update player A's?

bernd-wechner commented 10 months ago

This is not a TrueSkill question, it's a sanity question. In other words, yes and no ;-).

Yes: TrueSkill doesn't know or care about player identity in any way, shape or form. It cares only about each player's current rating entering the game. So you can run the algorithm on [A's rating, B's rating] vs [B's rating, B's rating]. Nothing stops you from doing that.

No: The results will look like this: [A's new rating, B's first new rating], [B's second new rating, B's third new rating]. The second and third will be essentially the same, as the members of the second team share one result and the gains or losses to ratings are shared by all of them. B's first new rating will be markedly different. Which will you use, and why? Answer: none of them, as this is an absurd situation that TrueSkill neither cares about nor models.
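To make the three conflicting ratings for B concrete, here is a rough sketch in plain Python. It is not the library's factor-graph code: it uses the well-known closed-form update for two teams, ignores draws, and assumes the commonly quoted default parameters (mu 25, sigma 25/3, beta sigma/2, tau sigma/100). The function name rate_two_teams and A's starting rating are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Assumed defaults (commonly quoted TrueSkill values); draws are ignored.
MU, SIGMA = 25.0, 25.0 / 3
BETA, TAU = SIGMA / 2, SIGMA / 100

_STD = NormalDist()  # standard normal, for pdf and cdf


def rate_two_teams(winners, losers):
    """Closed-form two-team update; each rating is a (mu, sigma) tuple.

    Hypothetical helper for illustration, not part of any library.
    """
    players = winners + losers
    # Inflate each prior variance by the dynamics factor tau.
    var = [s * s + TAU * TAU for _, s in players]
    # Total uncertainty of the team performance difference.
    c = sqrt(sum(var) + len(players) * BETA * BETA)
    t = (sum(m for m, _ in winners) - sum(m for m, _ in losers)) / c
    v = _STD.pdf(t) / _STD.cdf(t)  # mean shift factor
    w = v * (v + t)                # variance shrink factor
    updated = []
    for i, (mu, _) in enumerate(players):
        sign = 1 if i < len(winners) else -1
        new_mu = mu + sign * var[i] / c * v
        new_sigma = sqrt(var[i] * (1 - var[i] / (c * c) * w))
        updated.append((new_mu, new_sigma))
    return updated[:len(winners)], updated[len(winners):]


a = (30.0, 4.0)    # illustrative rating for A
b = (MU, SIGMA)    # the single rating B brings into every seat

# Team 1 [A, B] beats team 2 [B, B].
(new_a, b_first), (b_second, b_third) = rate_two_teams([a, b], [b, b])
print("A:", new_a)
print("B on team 1:", b_first)
print("B on team 2:", b_second, b_third)
```

Under these assumptions the two copies of B on the losing team come out identical, while the copy on the winning team moves the other way, so there is no single "new rating for B" to keep.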

The issue stems not from an oversight in the design so much as from its simplicity. TrueSkill focusses entirely on the Bayesian rating shift that a given ranking produces. The ranking you supply is between two teams, and the players in those teams are not identified in any way. There is no notion of duplicates; there is simply each player's rating at the outset of the game (described by a mean and standard deviation pair).

I wonder, what does it mean to you that B played twice on one team and once on the other? Is this a two-team game with 4 characters in it, where B is playing three of them and A one? I daresay, if so, it's not exactly the kind of game play conducive to skill tracking. I mean, how does the evidence of B beating B contribute to a reassessment of B's skill? This looks like it's mostly being played solitaire, and solitaire games have no measured result that TrueSkill (or any rating system based on observed outcomes) can work with. It is possible to model solitaire games if one models the game as an opposing player, gives it a rating and tracks that. Then record player losses as wins for the Game and player wins as wins for that player. You could even give the Game an unvarying constant rating, ignoring updates to its assessment.
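The "Game as opponent" idea can be sketched the same way. This is only an illustration in plain Python using the closed-form 1v1 update without draws and without the dynamics factor; the names GAME and rate_vs_game, and the rating chosen for the Game, are assumptions rather than anything the library provides.

```python
from math import sqrt
from statistics import NormalDist

BETA = 25.0 / 6            # assumed performance spread
GAME = (25.0, 25.0 / 3)    # hypothetical fixed rating for the Game

_STD = NormalDist()


def rate_vs_game(player, player_won):
    """Update only the player's (mu, sigma); the Game's rating never changes."""
    mu, sigma = player
    g_mu, g_sigma = GAME
    c = sqrt(sigma ** 2 + g_sigma ** 2 + 2 * BETA ** 2)
    # t is the winner's mean minus the loser's mean, scaled by c.
    t = (mu - g_mu) / c if player_won else (g_mu - mu) / c
    v = _STD.pdf(t) / _STD.cdf(t)
    w = v * (v + t)
    sign = 1 if player_won else -1
    new_mu = mu + sign * sigma ** 2 / c * v
    new_sigma = sigma * sqrt(1 - sigma ** 2 / c ** 2 * w)
    return new_mu, new_sigma


player = (25.0, 25.0 / 3)
# Two wins against the Game, then one loss to it.
for won in [True, True, False]:
    player = rate_vs_game(player, won)
print(player)
```

Keeping GAME constant means every player is measured against the same yardstick; letting it update instead would track how hard the game appears to be.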