kiudee / chess-tuning-tools

A collection of scripts aimed at efficiently tuning chess engine parameters.
https://chess-tuning-tools.readthedocs.io/en/latest/

Observations with normalize_y True versus False #117

Open Claes1981 opened 3 years ago

Claes1981 commented 3 years ago

Description

I have experimented with changing the "gp_kwargs" parameter "normalize_y" between True and False in cli.py (first by manually editing the code locally in my virtual environment, and later by implementing a command-line option).
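
For context, this is roughly the kind of option I mean (a hypothetical sketch with made-up flag and function names; the actual wiring in cli.py is more involved):

```python
import click

@click.command()
@click.option(
    "--normalize-y/--no-normalize-y",
    default=True,
    help="Standardize the target values before fitting the Gaussian process.",
)
def tune(normalize_y):
    # The flag would then simply be forwarded to the optimizer's GP:
    gp_kwargs = dict(normalize_y=normalize_y)
    click.echo(f"gp_kwargs: {gp_kwargs}")

if __name__ == "__main__":
    tune()
```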

You changed this normalize_y parameter from True to False in your code on September 20, "Due to a bug in scikit-learn 0.23.2". Is this the issue you were referring to?

In https://github.com/scikit-optimize/scikit-optimize/blob/master/skopt/learning/gaussian_process/gpr.py it is written under normalize_y: "...This parameter should be set to True if the target values' mean is expected to differ considerable from zero. ..." I am not sure what counts as "considerable" here, but in my experiments I set the time control of engine2 much lower than that of engine1 to shorten the running time (while still tuning engine1 at the longer time control). I therefore expect the Elo of the optimum to be greater than zero.
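
To illustrate what the normalization does (a minimal sketch with made-up Elo values; in recent scikit-learn versions normalize_y=True standardizes the targets before fitting):

```python
import numpy as np

# Made-up Elo scores whose mean is clearly not zero, as in my setup where
# engine2 plays at a much shorter time control than engine1.
y = np.array([18.0, 25.0, 22.0, 30.0])

# What normalize_y=True does internally (recent scikit-learn versions):
# standardize the targets so the GP's zero-mean prior matches the data mean.
y_train = (y - y.mean()) / y.std()
print(y_train.mean(), y_train.std())  # ~0.0 and 1.0

# With normalize_y=False the GP is fitted on the raw values instead, and its
# zero-mean prior then corresponds to 0 Elo.
```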

In my few specific experiments the tuner has at least not crashed so far with normalize_y set to True. The estimated Elo of the optimum after 953 iterations/1906 games is 24.2045 +- 5.1086 when resuming with normalize_y=True, but "only" 21.4418 +- 11.3574 when resuming at the same iteration with normalize_y=False.

Output

normalize_y=True:

2021-01-02 11:46:54,099 DEBUG    Got the following tuning settings:
{'engine1_tc': '15+15', 'engine2_tc': '1+1', 'rounds': 1}
2021-01-02 11:46:54,099 DEBUG    Acquisition function: ei, Acquisition function samples: 1, GP burnin: 75, GP samples: 300, GP initial burnin: 200, GP initial samples: 300, Normalize_y: True, Initial points: 16, Next points: 500, Random seed: 0
2021-01-02 11:46:54,111 INFO     Importing 953 existing datapoints. This could take a while...
2021-01-02 12:13:23,069 INFO     Importing finished.
2021-01-02 12:13:23,069 INFO     Starting iteration 953
2021-01-02 12:13:49,921 INFO     Current optimum:
{'Threads': 6, 'Hash': 6772, 'SyzygyProbeLimit': 1, 'SyzygyProbeDepth': 79, 'Slow Mover': 817, 'Move Overhead': 0}
2021-01-02 12:13:49,921 INFO     Estimated Elo: 24.2045 +- 5.1086
2021-01-02 12:13:49,921 INFO     80.0% confidence interval of the Elo value: (17.6576, 30.7514)
2021-01-02 12:13:50,188 INFO     80.0% confidence intervals of the parameters:
Parameter         Lower bound  Upper bound
------------------------------------------
Threads                     3            6
Hash                     1125         8191
SyzygyProbeLimit            0            6
SyzygyProbeDepth            2           73
Slow Mover                115          919
Move Overhead              20         4252

2021-01-02 12:13:50,188 DEBUG    Starting to compute the next plot.
2021-01-02 12:16:18,188 INFO     Saving a plot to Cfish_20100303_S200723-1134/20210102-121350-953.png.

[Plot: 20210102-121350-953.png]

normalize_y=False:

2021-01-02 12:16:43,175 DEBUG    Got the following tuning settings:
{'engine1_tc': '15+15', 'engine2_tc': '1+1', 'rounds': 1}
2021-01-02 12:16:43,175 DEBUG    Acquisition function: ei, Acquisition function samples: 1, GP burnin: 75, GP samples: 300, GP initial burnin: 200, GP initial samples: 300, Normalize_y: False, Initial points: 16, Next points: 500, Random seed: 0
2021-01-02 12:16:43,186 INFO     Importing 953 existing datapoints. This could take a while...
2021-01-02 12:40:37,007 INFO     Importing finished.
2021-01-02 12:40:37,012 INFO     Starting iteration 953
2021-01-02 12:40:59,907 INFO     Current optimum:
{'Threads': 4, 'Hash': 4490, 'SyzygyProbeLimit': 3, 'SyzygyProbeDepth': 50, 'Slow Mover': 435, 'Move Overhead': 3725}
2021-01-02 12:40:59,907 INFO     Estimated Elo: 21.4418 +- 11.3574
2021-01-02 12:40:59,907 INFO     80.0% confidence interval of the Elo value: (6.8867, 35.997)
2021-01-02 12:41:00,246 INFO     80.0% confidence intervals of the parameters:
Parameter         Lower bound  Upper bound
------------------------------------------
Threads                     1            5
Hash                      190         6769
SyzygyProbeLimit            1            6
SyzygyProbeDepth           12           91
Slow Mover                212          998
Move Overhead             780         4913

2021-01-02 12:41:00,247 DEBUG    Starting to compute the next plot.
2021-01-02 12:43:11,296 INFO     Saving a plot to Cfish_20100303_S200723-1134/20210102-124101-953.png.

[Plot: 20210102-124101-953.png]

Do you have any comments on the results? Do you also think that the optimum found with normalize_y=True is more likely to be superior to the optimum found with normalize_y=False than the other way around?

Full log: Cfish_20100303_S200723-1134.log

Data: Cfish_20100303_S200723-1134.npz.zip

Config file: Cfish_20100303_S200723-1134.json.zip

kiudee commented 3 years ago

In general I would set normalize_y to True. Due to the upstream issue you mentioned, a division-by-zero error can occur if the standard deviation of the target values is still zero. It looks like the fix was merged in November: https://github.com/scikit-learn/scikit-learn/pull/18831
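
For illustration, this is the failure mode (a minimal sketch with made-up values, not the actual library code):

```python
import numpy as np

# If every observed score so far is identical (e.g. all game pairs drawn),
# the target standard deviation is zero and standardization divides by zero.
y = np.array([0.0, 0.0, 0.0])
with np.errstate(divide="ignore", invalid="ignore"):
    y_norm = (y - y.mean()) / y.std()  # y.std() is 0.0 here
print(y_norm)  # [nan nan nan] -- the GP fit then fails downstream
```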

I will investigate how easy it would be to update everything to scikit-learn 0.24 (the new version).

In your second plot, you can see that the playing strength becomes worse towards the edges everywhere. That could be an indication that the Gaussian process is reverting to the mean (which is zero) there.
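
A minimal sketch of that effect (made-up one-dimensional data and plain scikit-learn, not the tuner's actual model):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# All observations lie around +20 "Elo"; x_far is a point far away from the
# data, playing the role of the edge of the parameter space.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([19.0, 21.0, 20.0])
x_far = np.array([[100.0]])

for normalize in (False, True):
    gp = GaussianProcessRegressor(normalize_y=normalize).fit(X, y)
    print(normalize, gp.predict(x_far))
# normalize_y=False reverts to the prior mean of 0 far from the data,
# normalize_y=True to the empirical mean of ~20.
```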

Claes1981 commented 3 years ago

Thanks, nice to hear that you confirm my suspicion that it is good to set normalize_y=True (as long as no error appears).