sublee / trueskill

An implementation of the TrueSkill rating system for Python
https://trueskill.org/
Other
742 stars 112 forks source link

TrueSkill Through Time #14

Open MarkLodato opened 7 years ago

MarkLodato commented 7 years ago

It would be nice if this module implemented TrueSkill Through Time (TTT), which fixes a few issues in the original TrueSkill algorithm, namely:

downhiller commented 7 years ago

I assume you didn't implement this yourself anywhere @MarkLodato ? Have you found an implementation of it anywhere? Thank you!

MarkLodato commented 7 years ago

Unfortunately no, I haven't found an implementation anywhere.

downhiller commented 7 years ago

It's surprising, given how old the paper is. I wish I could do it myself but it's probably a bit beyond my abilities, and I doubt I'd have the time even if I felt able to tackle it.

MarkLodato commented 7 years ago

No problem. If someone reads this and has the inclination, you'd be open to a pull request, I assume?

downhiller commented 7 years ago

I'm not the original author Mark, I just found your issue and commented on it.

MarkLodato commented 7 years ago

Oh :)

sublee commented 7 years ago

Sorry for my late response. Currently, I don't have a plan to implement TTT. But I will read the paper tonight to estimate the cost. Anyway, this project is open to your contributions. You guys can contribute with TTT.

EvanZ commented 6 years ago

At some point if you're going to calculate ratings going "backwards" in time, doesn't that negate the point of Elo in the first place? I mean, why not just use logistic regression at that point?

lucasmaystre commented 6 years ago

@EvanZ logistic regression assumes that there's a single, static skill parameter for every player. TTT assumes that the player's skill can change over time. This is orthogonal to the fact that in TTT, information can also flow backwards.

bernd-wechner commented 6 years ago

Having just read the paper I admit I don't see TTT is not something that "can be implemented" meaningfully without some significant qualification and work.

The qualification is that it's not a replacement for the existing code at all, but a different body of code. TrueSkill is what we're using to asses skill today based on history, and form leaderboards as does the Xbox community.

What TTT offers is a revision, of the time history. That is, as new data arrives, you could go and perform updates on all the historic ratings for a player. This presupposes a complete new data structure for ta module like this, which can provide it with the whole history of game interactions.

It's worth noting this is a potential nightmare and certainly not trivial, as TTT concerns itself specifically with chess a two player game. In our scenario for an N player game, and players mixing and matching you end up with what is an influence tree. That is in assessing the skill of player A you perforce end up reassessing the skill of all players that A has played with in their history, and here it gets fun, recursively ... that is if B played with A in the past and B needs updating, we need also to do this whole schmozzle for B.

In short order what it boils down to is you need to run a TTT update on a whole data set tat is assumed closed. That is has only internal references ... all players that are playing have their whole (assesed) histories recorded and all these is what TTT walks through. So suddenly TTT is not a python module that just takes N players skill ratings and a game result to produce an update, but it needs access to a whole history of all those players (and all the players they played with and so on) if it's going to propagate the new information backwards.

Now it would be an interesting thing to do, indeed, but non-trivial, be running in the background and presumably consume a pile of computing resource on every run so might depending on traffic (update) volume be a nightly run or such, to provide updated estimates of historic skills.

Any robust database would choose to store the original estimates (basic TrueSkill rankings seen on the way here) and store the latest update beside it, so that a comparison is possible (we could draw a graph of the players Skill evolution as it happened, and the latest full TTT improved history estimate). It would indeed be awesomely cool to see how they differed.

lucasmaystre commented 6 years ago

In case anyone's interested, I have forked the original TTT code (written in F#) with some instructions on how to run it on a modern Linux system:

https://github.com/lucasmaystre/chessanalysis

bernd-wechner commented 6 years ago

Very awesome. Without the time to study it now, do you think:

1) it generalizes to N players in M teams easily. Or is is fairly heavily build on the 1 on 1 assumption (which the paper suggests)?

2) it would port to Python easily? Probably hard to say without giving it a shot.

bernd-wechner commented 6 years ago

Hmmm, just reading about TrueSkill 2. Interesting ... a significant upgrade of sorts, with a detailed paper, but can't find any code on offer yet. Paper only published in March 2018. Of note though is that they generate the code using Infer.NET, apparently a package that lets you specify the factor graph in an abstract way, and will generate code that solves the factor graph. Sounds neat. I see a future possibly for more easily tuned Baysian skill models. Who knows?

bernd-wechner commented 6 years ago

Just got confirmation from Minka that there are no plans to release TrueSkill 2 code the way TTT code was released. To wit if the community wants to leverage learnings Microsoft Research are sharing in the TrueSkill 2 paper it seems for now it demands a reimplementation. But the pointers to Infer.NET and the way these factor graphs are created is enough to get someone with time and motivation started. On my todo list but somewhere down at that level that might be done in retirement ;-).

lucasmaystre commented 4 years ago

Hi all, I have just publicly released a Python package for estimating time-varying skills based on the entire history of matches. It can be thought of as an extension of TTT (the time dynamics are more general). You can find it here: https://github.com/lucasmaystre/kickscore

It's based on a paper I'll present later this summer at the KDD conference: https://arxiv.org/abs/1903.07746

Note that at the moment it only supports pairwise comparisons (either 1-vs-1, or many-vs-many).

mrkvicka22 commented 3 years ago

Just got confirmation from Minka that there are no plans to release TrueSkill 2 code the way TTT code was released. To wit if the community wants to leverage learnings Microsoft Research are sharing in the TrueSkill 2 paper it seems for now it demands a reimplementation. But the pointers to Infer.NET and the way these factor graphs are created is enough to get someone with time and motivation started. On my todo list but somewhere down at that level that might be done in retirement ;-).

Hey do you have any idea whethter there is any implementation of TrueSkill 2 available?

glandfried commented 2 years ago

Hi all,

We have implemented [1] and documented [2] the first TrueSkill Through Time packages for Python, Julia and R. I wish to express my gratitude to Heungsub Lee for having published the basic TrueSkill model in Python. I tried to keep the same interface until I was forced to change it (some changes I made could have been avoided). I remain at your disposal to collaborate in case you wish to update the original package based on the TrueSkill Through Time model that we have implemented.

python3 -m pip install trueskillthroughtime

[1, 2] https://github.com/glandfried/TrueSkillThroughTime [2] https://github.com/glandfried/TrueSkillThroughTime/releases/download/doc/landfried-learning.pdf