official-stockfish / fishtest

The Stockfish testing framework
https://tests.stockfishchess.org/tests

SPRT Elo #1924

Open gahtan-syarif opened 3 months ago

gahtan-syarif commented 3 months ago

Right now the Elo shown for SPRT tests is derived from equation 6.1 of this document: https://hardy.uhasselt.be/Fishtest/brownian_approximation.pdf (so basically the SPRT Elo implemented in fishtest is calculated from the results of the GSPRT). As a tester I have an issue with this, because the resulting Elo can behave quite weirdly, for example coming out negative when there are more wins than losses. It would be preferable to show the regular pentanomial Elo, which is independent of the SPRT bounds, as is already done for fixed-games tests; that would give an accurate representation of the tested engine's strength. One solution is to display both values: the GSPRT-approximated Elo labelled "SPRT Elo" or "Adjusted Elo", and the regular Elo labelled just "Elo".
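To illustrate what "regular pentanomial Elo" could mean here, below is a minimal sketch, not fishtest's actual code. It assumes the pentanomial counts are ordered from worst to best game-pair result (pair scores 0, 0.5, 1, 1.5, 2) and that the Elo is the plain logistic Elo of the mean per-game score. The function name `pentanomial_elo` and the sample counts are made up for the example.

```python
import math


def pentanomial_elo(counts):
    """Logistic Elo from pentanomial game-pair counts.

    `counts` is assumed to hold five frequencies, ordered from the worst
    pair result (pair score 0) to the best (pair score 2).
    No SPRT bounds are involved anywhere in this calculation.
    """
    if len(counts) != 5:
        raise ValueError("expected exactly five pentanomial counts")

    # Per-game score for each pair outcome: pair score divided by 2 games.
    scores = (0.0, 0.25, 0.5, 0.75, 1.0)

    total_pairs = sum(counts)
    if total_pairs == 0:
        raise ValueError("no game pairs recorded")

    mean_score = sum(c * s for c, s in zip(counts, scores)) / total_pairs
    if not 0.0 < mean_score < 1.0:
        raise ValueError("Elo is unbounded for a 0% or 100% score")

    # Standard logistic Elo model: score = 1 / (1 + 10 ** (-elo / 400)).
    return -400.0 * math.log10(1.0 / mean_score - 1.0)


# Hypothetical counts, for illustration only.
balanced = pentanomial_elo([10, 20, 40, 20, 10])
ahead = pentanomial_elo([5, 20, 40, 25, 10])
print(f"balanced: {balanced:+.2f} Elo, ahead: {ahead:+.2f} Elo")
```

With this definition the sign of the Elo always matches the sign of (mean score - 0.5), so a score above 50% can never produce a negative value, regardless of the SPRT bounds chosen for the test.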

cj5716 commented 2 months ago

IMO we can replace the LOS "speedometer" with this