lichess-org / database

Public exports of all rated games, puzzles, and computer evaluations.
https://database.lichess.org
GNU Affero General Public License v3.0

Missing Volatility Data in Chess Puzzle Dataset #56

Open katoue opened 4 months ago

katoue commented 4 months ago

In Glicko-2, calculating a player’s new rating, rating deviation (RD), and volatility after a match requires all three current values: rating, RD, and volatility. I've been trying to simulate some new matches with the chess puzzle dataset. The absence of volatility data can lead to inaccurate calculations of the new puzzle ratings and RDs. I read the lila code, only to discover that the default volatility is 0.09 and the maximum is 0.1. Is there any way to access this data?
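To give a rough sense of how much an assumed volatility matters for the RD alone, here is a minimal sketch of the step in Glickman's Glicko-2 paper where volatility is added to the deviation, φ* = sqrt(φ² + σ²). It treats the volatility as fixed (it skips the volatility update itself), the function name `inflated_rd` is hypothetical, and the RD of 75 is only an illustrative value, not taken from the dataset.

```python
import math

SCALE = 173.7178  # Glicko-2 factor converting between the rating scale and the internal scale


def inflated_rd(rd, sigma):
    """Return the RD after the volatility term is added, before any result is applied.

    Assumption: sigma is used as-is; the real algorithm first updates it.
    """
    phi = rd / SCALE  # deviation on the internal Glicko-2 scale
    return SCALE * math.sqrt(phi ** 2 + sigma ** 2)


if __name__ == "__main__":
    # Compare an illustrative lower volatility with the 0.09 default and the 0.1 cap
    # mentioned above.
    for sigma in (0.06, 0.09, 0.1):
        print(sigma, inflated_rd(75.0, sigma))
```

Whether the resulting difference is significant depends on the simulation; the sketch only isolates where the missing value enters.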

ornicar commented 4 months ago

It didn't seem useful to add volatility, and I'm still not sure it is, as I don't quite understand why you need it. To "simulate some new matches"?

katoue commented 4 months ago
"Every player in the Glicko-2 system has a rating, r, a rating deviation, RD, and a rating volatility σ. The volatility measure indicates the degree of expected fluctuation in a player’s rating. The volatility measure is high when a player has erratic performances (e.g., when the player has had exceptionally strong results after a period of stability), and the volatility measure is low when the player performs at a consistent level."
http://www.glicko.net/glicko/glicko2.pdf

Volatility can reflect whether a puzzle's difficulty is steady across players of different levels. It is also one of the inputs used to calculate the puzzle's new rating when a player solves or fails it.
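For reference, the update being discussed can be sketched as follows. This is a minimal Python transcription of the steps in Glickman's paper linked above, not lila's actual code. The system constant `TAU = 0.5`, the function names, and the 1500/500 starting values in the usage lines are assumptions for illustration; only the 0.09 volatility comes from this thread. A puzzle attempt is modelled as a single game between the player and the puzzle.

```python
import math

SCALE = 173.7178  # Glicko-2 factor converting between the rating scale and the internal scale
TAU = 0.5         # assumed system constant; the paper suggests values between 0.3 and 1.2


def g(phi):
    """Weight that reduces the impact of opponents with a large deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def glicko2_update(rating, rd, sigma, results, tau=TAU, eps=1e-6):
    """Return (rating, rd, sigma) after one rating period.

    results: non-empty list of (opponent_rating, opponent_rd, score), score in {0, 0.5, 1}.
    """
    mu = (rating - 1500.0) / SCALE
    phi = rd / SCALE

    v_inv = 0.0      # accumulates the inverse of the estimated variance v
    score_sum = 0.0  # accumulates g_j * (score_j - expected_j)
    for opp_rating, opp_rd, score in results:
        mu_j = (opp_rating - 1500.0) / SCALE
        g_j = g(opp_rd / SCALE)
        e_j = 1.0 / (1.0 + math.exp(-g_j * (mu - mu_j)))
        v_inv += g_j ** 2 * e_j * (1.0 - e_j)
        score_sum += g_j * (score - e_j)
    v = 1.0 / v_inv
    delta = v * score_sum

    # New volatility: solve f(x) = 0 with the Illinois variant of regula falsi.
    a = math.log(sigma ** 2)

    def f(x):
        ex = math.exp(x)
        num = ex * (delta ** 2 - phi ** 2 - v - ex)
        return num / (2.0 * (phi ** 2 + v + ex) ** 2) - (x - a) / tau ** 2

    A = a
    if delta ** 2 > phi ** 2 + v:
        B = math.log(delta ** 2 - phi ** 2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        B = a - k * tau
    fA, fB = f(A), f(B)
    while abs(B - A) > eps:
        C = A + (A - B) * fA / (fB - fA)
        fC = f(C)
        if fC * fB <= 0:
            A, fA = B, fB
        else:
            fA /= 2.0
        B, fB = C, fC
    new_sigma = math.exp(A / 2.0)

    # The new volatility inflates the deviation before the results shrink it again.
    phi_star = math.sqrt(phi ** 2 + new_sigma ** 2)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    new_mu = mu + new_phi ** 2 * score_sum
    return SCALE * new_mu + 1500.0, SCALE * new_phi, new_sigma


if __name__ == "__main__":
    # Hypothetical values: a new player (assumed 1500 / 500 / 0.09) solves a puzzle
    # whose volatility is unknown and therefore guessed to be 0.09.
    player = (1500.0, 500.0, 0.09)
    puzzle = (1600.0, 80.0, 0.09)
    print(glicko2_update(*player, [(puzzle[0], puzzle[1], 1.0)]))
    print(glicko2_update(*puzzle, [(player[0], player[1], 0.0)]))
```

Both sides of the attempt take a volatility as input, so for puzzles the value currently has to be guessed; rerunning the last call with different guesses shows how sensitive a given simulation is to that choice.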

Dboingue commented 1 month ago

It would indicate how much the puzzle has been played and the confidence level of the individual puzzle's rating. The volatility is just the mechanism by which Glicko-2 manages the dynamics of the pairings in the pool and their influence on the confidence of the rating estimate. This might provide population-scale data analysis power for more fine-grained questions, which is also not part of the insufficient open data effort that the unknown dates of the puzzle database have left us wanting, for a long time. It seems that the individual-tool premise is dominating the reasoning, from my limited individual point of view, eager as I am to understand why things are as they are, and have been for such a while.

The subjective theme voting data and the ratings are all population-based and dynamic. I am less interested in the contribution of volatility to the amplitude of the change at each pairing, and more interested in the uncertainty information it also controls more directly: a sort of memory of the past level of pool "exposure" over time.

For the dating issue: https://github.com/lichess-org/database/issues/58#issue-2568780635

katoue commented 1 month ago

When you need to determine a puzzle rating for a newly introduced player, such as a bot, you will need volatility to calculate it. http://www.glicko.net/glicko/glicko2.pdf