plot_scores function returns 0 for all but first and last value #8

markustoivonen opened 3 years ago

I am running kickscore on some data, and when plotting the scores with the plot_scores function, the predict method returns mean 0 and variance 1 for all data points except the first and last timestamps. For the first and last timestamps, the predictions are the same as the values in the Item's scores attribute. There are some other anomalies in the score data, but this seems to be by far the most common one.


Here are 3 players plotted, and they all exhibit the same behaviour.

Also, why does the plot_scores function calculate the ms vector with the predict method, rather than just taking the stored values from the scores attribute of an Item?

Another observation: I have players who have only won matches, yet their score at the end is almost the same as at the beginning. Here is a picture of a player who has won roughly 20 matches and lost 0. Shouldn't their trajectory be monotonically increasing, even if the opponents are weak? The data below is from the scores attribute of the Item.


I am using the BinaryModel, with exponential kernel (var=1, lscale=1) and recursive fitter.

All help is much appreciated!

Hi @markustoivonen

It's hard to say with certainty without looking at the data & code, but here are a few thoughts that might help you:

  1. In general, you might want to combine a dynamic kernel (e.g. Exponential) with a static one (Constant). The constant kernel captures a player's baseline score, whereas the dynamic one captures fluctuations around that baseline over time. If you don't, the score will revert to zero (the prior mean) during long stretches without observations.
  2. Regarding your first plot: the x-axis spans a very large range (on the order of 1e9), but the length scale of the exponential kernel is 1 (lscale=1). This is why the score drops to zero (the prior mean) almost immediately outside the precise moments at which comparisons happened. lscale tells the model how far apart in time two scores should still be correlated. Given the timescale of your data, the model treats the scores at any two distinct time points as essentially uncorrelated, hence the score is zero most of the time.
  3. Regarding your 2nd plot: kickscore works differently from Elo and similar systems, where players gain or lose rating points after every game. kickscore infers the score over time given the entire history of matches. This means that even if a player wins all the time, their score might fluctuate up and down, depending on 1) the strength of their opponents, 2) the interval between successive matches, and 3) prior beliefs about the score and its evolution over time (given by the kernel).
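To make points 1 and 2 concrete, here is a minimal sketch in plain Python (it does not call kickscore). It assumes the usual form of the exponential kernel, var * exp(-|t1 - t2| / lscale), and assumes the timestamps are Unix seconds, which would be consistent with an x-axis around 1e9. The helper names and the 30-day length scale are made up for illustration, not recommendations.

```python
import math

def exponential_kernel(t1, t2, var=1.0, lscale=1.0):
    """Exponential kernel, assumed form: var * exp(-|t1 - t2| / lscale)."""
    return var * math.exp(-abs(t1 - t2) / lscale)

def constant_plus_exponential(t1, t2, const_var=1.0, var=1.0, lscale=1.0):
    """Sum of a constant kernel (baseline) and an exponential kernel (dynamics)."""
    return const_var + exponential_kernel(t1, t2, var=var, lscale=lscale)

def correlation(kernel, t1, t2, **params):
    """Prior correlation between the scores at times t1 and t2."""
    cov = kernel(t1, t2, **params)
    return cov / math.sqrt(kernel(t1, t1, **params) * kernel(t2, t2, **params))

DAY = 86_400.0   # seconds per day (assumes Unix-second timestamps)
t0 = 1.6e9       # hypothetical timestamp of one match
t1 = t0 + DAY    # a second match, one day later

# lscale=1 second: scores one day apart are effectively uncorrelated.
short = correlation(exponential_kernel, t0, t1, lscale=1.0)

# lscale=30 days (hypothetical value): scores one day apart stay strongly correlated.
long_ = correlation(exponential_kernel, t0, t1, lscale=30 * DAY)

# Adding a constant kernel keeps a baseline correlation even when lscale is tiny.
combined = correlation(constant_plus_exponential, t0, t1, lscale=1.0)

print(short, long_, combined)
```

With lscale=1 the correlation underflows to zero, with a 30-day length scale it stays close to 1, and with the added constant kernel it cannot fall below const_var / (const_var + var), which is 0.5 here.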

Bottom line:

Regarding your question

Also, why in the plot_scores function, we calculate the ms vector with the predict method, but not rather just take the stored values from the scores attributes of an Item?

The scores attribute contains the mean & variance of the score only at times where the player played a game (you can check this by inspecting ts). When plotting it looks better if you show the score time-series using regularly spaced time intervals, not necessarily matching the timestamps of games a user played, hence the call to predict.
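The difference between stored scores and predictions on a regular grid can be illustrated with a toy example. This is not kickscore's implementation: it is a noise-free Gaussian-process interpolation between exactly two stored (time, mean) pairs, using an exponential kernel, and the names toy_predict and regular_grid are made up for this sketch.

```python
import math

def kernel(t1, t2, var=1.0, lscale=1.0):
    """Exponential kernel, assumed form: var * exp(-|t1 - t2| / lscale)."""
    return var * math.exp(-abs(t1 - t2) / lscale)

def toy_predict(ts_obs, ms_obs, ts_new, lscale):
    """Noise-free GP posterior mean given exactly two observed (time, mean) pairs."""
    (ta, tb), (ma, mb) = ts_obs, ms_obs
    v = kernel(ta, ta, lscale=lscale)   # prior variance
    c = kernel(ta, tb, lscale=lscale)   # prior covariance between the two times
    det = v * v - c * c
    # Weights K^{-1} m for the 2x2 kernel matrix K = [[v, c], [c, v]].
    wa = (v * ma - c * mb) / det
    wb = (v * mb - c * ma) / det
    return [
        kernel(t, ta, lscale=lscale) * wa + kernel(t, tb, lscale=lscale) * wb
        for t in ts_new
    ]

def regular_grid(start, stop, num):
    """Evenly spaced time points from start to stop, inclusive."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

# Stored scores exist only at the two times the player had a game.
ts_obs, ms_obs = (0.0, 10.0), (1.0, -0.5)

# Plotting uses a regular grid that covers the same interval.
grid = regular_grid(ts_obs[0], ts_obs[1], 5)

short = toy_predict(ts_obs, ms_obs, grid, lscale=0.1)    # lscale far below the game spacing
smooth = toy_predict(ts_obs, ms_obs, grid, lscale=10.0)  # lscale comparable to the game spacing

print(short)
print(smooth)
```

In this toy setting, the short length scale reproduces the stored means at the first and last grid points and is essentially zero in between, which is the pattern described in the issue. The longer length scale gives a curve that passes through the same two points but stays away from zero between them.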

Hope this helps!

Hi @lucasmaystre! Thank you for taking the time to give such a thorough answer, I truly appreciate it. :)

I was able to fit the predict curve to the data points, so now the plotted curves make more sense.

I have a few more follow-up questions; hopefully they are not too strenuous.

The model assumes an unbiased random shift of abilities, so your assumption about the ability does not fall into that category. A simple model that would fit your criteria is plain Elo, or, if you want something more sophisticated, Glicko2 or TrueSkill are good candidates. Keep in mind that the graph would not be smooth: you only get point estimates at the times the exercises were done.