Make the sharpness limit in WDLRescale configurable, and fix the Elo --> Contempt calculation

Naphthalin commented 7 months ago

While implementing #1791, we used a preliminary version in TCEC Swiss 4 together with a net with a mixed training set, and found that the Contempt effect was sometimes exaggerated. To address this, a number of measures were taken:

limit the Contempt taken from UCI_RatingAdv via a hidden setting ContemptMaxValue
limit the WDL-derived "sharpness" (called s for scale in the WDLRescale function, following the logistic distribution nomenclature) to a hardcoded value of 1.4 (approx. twice the value of startpos)
redefine how the contempt effect is calculated for high rating differences, also making the accuracy Elo-dependent, thus making high Contempt values safe and reasonable to use
ultimately, use only networks with "pure datasets" when running higher contempt values.

While these measures together fixed the problem for good, any two of the four combined would likely already have helped, and it turns out that the hardcoded limit of 1.4 is a bit too conservative. This is especially noticeable when using it for material odds, as in https://lczero.org/blog/2023/11/play-with-knight-odds-against-lc0-on-lichess/. This PR allows increasing the limit, thus addressing the original comment on the hardcoded constant.
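As a rough illustration of what the sharpness limit does, here is a minimal Python sketch (not the lc0 C++ code). It assumes the logistic model in which the win and loss probabilities are logistic functions of (-1 + mu) / s and (-1 - mu) / s; the function name wdl_to_mu_s and the parameter max_s are hypothetical and only stand in for the configurable limit.

```python
import math

def wdl_to_mu_s(w, l, max_s=1.4):
    """Map win/loss probabilities to logistic (mu, s), capping s at max_s.

    Assumes w = logistic((-1 + mu) / s) and l = logistic((-1 - mu) / s).
    max_s is a hypothetical stand-in for the configurable sharpness limit.
    """
    if not (w > 0.0 and l > 0.0 and w + l < 1.0):
        raise ValueError("need w > 0, l > 0 and a nonzero draw probability")
    a = math.log(1.0 / l - 1.0)  # equals (1 + mu) / s under the model
    b = math.log(1.0 / w - 1.0)  # equals (1 - mu) / s under the model
    s = 2.0 / (a + b)
    mu = (a - b) / (a + b)
    return mu, min(s, max_s)

# A drawish, balanced position stays below the cap; a very sharp one is clamped.
print(wdl_to_mu_s(0.2, 0.2))
print(wdl_to_mu_s(0.4, 0.4))
print(wdl_to_mu_s(0.4, 0.4, max_s=3.0))
```

Raising max_s lets very sharp evaluations, such as those in material-odds games, keep more of their original scale instead of being clamped at 1.4.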

Naphthalin commented 7 months ago

While testing this PR, I found an actual bug in the way the diff parameter is calculated from WDLCalibrationElo and Contempt: the division by WDLDrawRateReference^2 was missing, which effectively reduced the contempt effect for the bigger nets by a factor of up to 2, while using contempt without calibration Elo was unaffected. This PR now fixes this unintended behaviour, so both ways of specifying contempt introduced in #1791 now work as intended. Note, however, that WDLCalibrationElo still refers to game pair Elo, which differs from regular Elo by up to a factor of 2 below 2600. For example, a difference of 100 Elo with WDLCalibrationElo: 1800 basically means 200 regular Elo difference. To counteract this, you can simply use "WDLContemptAttenuation": 0.5 and enter the real Elo difference nonetheless. Around 2400, the correct value is probably around 0.8; I will attempt to fix this properly, together with updating the Elo-dependent draw rate curve, at some point in the future.
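The attenuation workaround can be illustrated with a small piece of arithmetic. This sketch assumes the attenuation acts as a plain linear multiplier on the Elo difference, which is a simplification of the actual calculation; the function name effective_game_pair_diff is hypothetical.

```python
def effective_game_pair_diff(real_elo_diff, attenuation):
    """Approximate game pair Elo difference implied by a regular Elo difference.

    Assumption: the attenuation factor scales the Elo difference linearly.
    """
    return real_elo_diff * attenuation

# Around 1800, 200 regular Elo corresponds to about 100 game pair Elo.
print(effective_game_pair_diff(200, 0.5))
# Around 2400, the suggested factor is roughly 0.8.
print(effective_game_pair_diff(100, 0.8))
```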

Naphthalin commented 7 months ago

The last two commits added a conversion formula that translates regular Elo (as defined by the expected outcome following a logistic curve) to the internally used Elo (derived from game pair ratio, equivalent at higher levels to UHO Elo and, more importantly, UHO game pair level). Regular Elo is also what is used with the alternative Contempt settings, where WDLDrawRateTarget is set instead of WDLCalibrationElo.
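For reference, "regular Elo" in the sense used here follows the standard logistic expected-score relation. The sketch below shows only that standard relation and its inverse; it does not reproduce the conversion formula added by the commits, and the function names are chosen for illustration.

```python
import math

def expected_score(elo_diff):
    """Expected score for a player rated elo_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_from_score(score):
    """Inverse relation: Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

print(expected_score(100))
print(elo_from_score(expected_score(100)))
```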

It is still supposed to represent (relatively fast) rapid Elo, so to get classical Elo, add something between 40 and 70 Elo per time doubling.
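The time-control adjustment amounts to the following arithmetic. The reference rapid time control is not specified here, so time_ratio (how many times longer the target time control is) is a hypothetical input, as is the function name.

```python
import math

def classical_elo_range(rapid_elo, time_ratio, per_doubling=(40.0, 70.0)):
    """Return a (low, high) Elo estimate after scaling thinking time by time_ratio.

    Uses the 40 to 70 Elo per time doubling rule of thumb; time_ratio is an
    assumed input, since the reference time control is not specified.
    """
    doublings = math.log2(time_ratio)
    low, high = per_doubling
    return rapid_elo + low * doublings, rapid_elo + high * doublings

# Four times the thinking time is two doublings.
print(classical_elo_range(2400, 4))
```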

The conversion formula is an approximation to the model prediction for regular Elo from +1.00 openings, which itself is based on Stockfish-level selfplay data to estimate the approximate draw rate and WDL sharpness, using https://github.com/official-stockfish/Stockfish/pull/4341.

[Figure: Elo_approximation2]

LeelaChessZero / lc0

Make the sharpness limit in WDLRescale configurable, and fix the Elo --> Contempt calculation #1941