fsmosca / Optuna-Game-Parameter-Tuner

A game search and evaluation parameter tuner using the Optuna framework
MIT License

Base engine with default values #1

Closed. joergoster closed this issue 4 years ago.

joergoster commented 4 years ago

@fsmosca Hi, can you please make a version with a non-changing base engine, always initialized with the default values? I would like to give this a try instead of the current 'moving target'.

fsmosca commented 4 years ago

Yes I will do that.

Sample study session:

Optuna Game Parameter Tuner v0.3.0

python -u tuner.py --engine .\engines\deuterium\deuterium.exe --hash 128 --opening-file .\start_opening\ogpt_chess_startpos.epd --games-per-trial 50 --concurrency 6 --plot --study-name pv10 --fix-base-param --base-time-sec 10
trials: 1000, games_per_trial: 50
input param: OrderedDict([('PawnValueEn', {'default': 92, 'min': 90, 'max': 120, 'step': 2}), ('BishopValueOp', {'default': 350, 'min': 300, 'max': 360, 'step': 3}), ('BishopValueEn', {'default': 350, 'min': 300, 'max': 360, 'step': 3}), ('RookValueEn', {'default': 525, 'min': 490, 'max': 550, 'step': 5}), ('QueenValueOp', {'default': 985, 'min': 975, 'max': 1050, 'step': 5}), ('MobilityWeight', {'default': 100, 'min': 50, 'max': 150, 'step': 4})])

[I 2020-09-20 11:09:19,142] A new study created in RDB with name: pv10
Warning, best value from previous trial is not found!
study best value: 0.0
Warning, best param from previous trial is not found!.
study best param: {}

starting trial: 0 ...
suggested param for test engine: {'PawnValueEn': 102, 'BishopValueOp': 333, 'BishopValueEn': 342, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 142}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {}
study best value: 0.0
Actual match result: 0.48, point of view: optimizer suggested values
[I 2020-09-20 11:12:11,851] Trial 0 finished with value: 0.48 and parameters: {'PawnValueEn': 102, 'BishopValueOp': 333, 'BishopValueEn': 342, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 142}. Best is trial 0 with value: 0.48.

starting trial: 1 ...
suggested param for test engine: {'PawnValueEn': 118, 'BishopValueOp': 360, 'BishopValueEn': 357, 'RookValueEn': 505, 'QueenValueOp': 990, 'MobilityWeight': 90}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'PawnValueEn': 102, 'BishopValueOp': 333, 'BishopValueEn': 342, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 142}
study best value: 0.48
Actual match result: 0.54, point of view: optimizer suggested values
[I 2020-09-20 11:15:19,208] Trial 1 finished with value: 0.54 and parameters: {'PawnValueEn': 118, 'BishopValueOp': 360, 'BishopValueEn': 357, 'RookValueEn': 505, 'QueenValueOp': 990, 'MobilityWeight': 90}. Best is trial 1 with value: 0.54.

starting trial: 2 ...
suggested param for test engine: {'PawnValueEn': 98, 'BishopValueOp': 318, 'BishopValueEn': 327, 'RookValueEn': 495, 'QueenValueOp': 990, 'MobilityWeight': 66}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'PawnValueEn': 118, 'BishopValueOp': 360, 'BishopValueEn': 357, 'RookValueEn': 505, 'QueenValueOp': 990, 'MobilityWeight': 90}
study best value: 0.54
Actual match result: 0.52, point of view: optimizer suggested values
[I 2020-09-20 11:18:15,684] Trial 2 finished with value: 0.52 and parameters: {'PawnValueEn': 98, 'BishopValueOp': 318, 'BishopValueEn': 327, 'RookValueEn': 495, 'QueenValueOp': 990, 'MobilityWeight': 66}. Best is trial 1 with value: 0.54.

starting trial: 3 ...

...

starting trial: 9 ...
suggested param for test engine: {'PawnValueEn': 114, 'BishopValueOp': 357, 'BishopValueEn': 339, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 62}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'PawnValueEn': 118, 'BishopValueOp': 360, 'BishopValueEn': 357, 'RookValueEn': 505, 'QueenValueOp': 990, 'MobilityWeight': 90}
study best value: 0.54
Actual match result: 0.57, point of view: optimizer suggested values
[I 2020-09-20 11:39:47,624] Trial 9 finished with value: 0.57 and parameters: {'PawnValueEn': 114, 'BishopValueOp': 357, 'BishopValueEn': 339, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 62}. Best is trial 9 with value: 0.57.
Saving plots ...
Done saving plots.
 number  value  params_BishopValueEn  params_BishopValueOp  params_MobilityWeight  params_PawnValueEn  params_QueenValueOp  params_RookValueEn     state
      0   0.48                   342                   333                    142                 102                 1000                 520  COMPLETE
      1   0.54                   357                   360                     90                 118                  990                 505  COMPLETE
      2   0.52                   327                   318                     66                  98                  990                 495  COMPLETE
      3   0.37                   315                   321                    134                 110                 1045                 495  COMPLETE
      4   0.44                   315                   330                    146                 112                  985                 520  COMPLETE
      5   0.42                   315                   330                    134                 112                  980                 545  COMPLETE
      6   0.50                   348                   333                    118                 120                 1020                 515  COMPLETE
      7   0.52                   333                   300                     54                 104                  995                 510  COMPLETE
      8   0.46                   360                   309                    122                 110                 1015                 490  COMPLETE
      9   0.57                   339                   357                     62                 114                 1000                 520  COMPLETE

study best param: {'BishopValueEn': 339, 'BishopValueOp': 357, 'MobilityWeight': 62, 'PawnValueEn': 114, 'QueenValueOp': 1000, 'RookValueEn': 520}
study best value: 0.57
study best trial number: 9
[I 2020-09-20 11:39:55,027] Using an existing study with name 'pv10' instead of creating a new one.
study best value: 0.57
study best param: {'BishopValueEn': 339, 'BishopValueOp': 357, 'MobilityWeight': 62, 'PawnValueEn': 114, 'QueenValueOp': 1000, 'RookValueEn': 520}

starting trial: 10 ...
suggested param for test engine: {'PawnValueEn': 94, 'BishopValueOp': 357, 'BishopValueEn': 300, 'RookValueEn': 535, 'QueenValueOp': 1030, 'MobilityWeight': 82}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'BishopValueEn': 339, 'BishopValueOp': 357, 'MobilityWeight': 62, 'PawnValueEn': 114, 'QueenValueOp': 1000, 'RookValueEn': 520}
study best value: 0.57
Actual match result: 0.49, point of view: optimizer suggested values
[I 2020-09-20 11:43:02,186] Trial 10 finished with value: 0.49 and parameters: {'PawnValueEn': 94, 'BishopValueOp': 357, 'BishopValueEn': 300, 'RookValueEn': 535, 'QueenValueOp': 1030, 'MobilityWeight': 82}. Best is trial 9 with value: 0.57.

starting trial: 11 ...

...

starting trial: 24 ...
suggested param for test engine: {'PawnValueEn': 114, 'BishopValueOp': 354, 'BishopValueEn': 348, 'RookValueEn': 525, 'QueenValueOp': 1005, 'MobilityWeight': 78}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'BishopValueEn': 339, 'BishopValueOp': 357, 'MobilityWeight': 62, 'PawnValueEn': 114, 'QueenValueOp': 1000, 'RookValueEn': 520}
study best value: 0.57
Actual match result: 0.62, point of view: optimizer suggested values
[I 2020-09-20 12:24:51,024] Trial 24 finished with value: 0.62 and parameters: {'PawnValueEn': 114, 'BishopValueOp': 354, 'BishopValueEn': 348, 'RookValueEn': 525, 'QueenValueOp': 1005, 'MobilityWeight': 78}. Best is trial 24 with value: 0.62.
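As an aside on the mechanics visible in this log: below is a minimal pure-Python sketch (no Optuna dependency; the helper `suggest` is invented for illustration) of how each `min`/`max`/`step` entry constrains a suggested value, which is the same guarantee Optuna's `trial.suggest_int(name, low, high, step=step)` provides.

```python
import random
from collections import OrderedDict

# Two of the ranges from the study above; every suggestion must land on
# the grid min, min+step, ..., and never exceed max.
input_param = OrderedDict([
    ('PawnValueEn',    {'default': 92,  'min': 90, 'max': 120, 'step': 2}),
    ('MobilityWeight', {'default': 100, 'min': 50, 'max': 150, 'step': 4}),
])

def suggest(spec, rng):
    """Pick a value from the grid spec['min'], +step, ... <= spec['max']."""
    n_steps = (spec['max'] - spec['min']) // spec['step']
    return spec['min'] + rng.randint(0, n_steps) * spec['step']

rng = random.Random(0)
suggested = {name: suggest(spec, rng) for name, spec in input_param.items()}
for name, spec in input_param.items():
    assert spec['min'] <= suggested[name] <= spec['max']
    assert (suggested[name] - spec['min']) % spec['step'] == 0
print(suggested)
```

With `--fix-base-param` the base engine never receives such a suggestion; it always plays with the defaults, which is why its line in the log never changes.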

Game test

The optimizer values won by a small margin. TC=15s+100ms.

Score of deuterium_pv10_trial_24 vs deuterium_default: 103 - 78 - 219  [0.531] 400
...      deuterium_pv10_trial_24 playing White: 54 - 37 - 109  [0.542] 200
...      deuterium_pv10_trial_24 playing Black: 49 - 41 - 110  [0.520] 200
...      White vs Black: 95 - 86 - 219  [0.511] 400
Elo difference: 21.7 +/- 22.9, LOS: 96.8 %, DrawRatio: 54.8 %
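The Elo difference quoted by the match runner follows from the average score alone via the usual logistic mapping elo = -400 * log10(1/score - 1); a quick sanity check against the result above:

```python
import math

def elo_from_score(score):
    """Elo difference implied by an average match score in (0, 1)."""
    return -400 * math.log10(1 / score - 1)

# 103 wins and 219 draws out of 400 games -> score 212.5/400 = 0.53125
print(round(elo_from_score(212.5 / 400), 1))  # 21.7, as reported
```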
joergoster commented 4 years ago

Thank you!

So it seems this already worked for you. I will now start a tuning run for SF's KingAttackWeights. I will let you know the result.

There is one problem remaining, though. Just like all BO implementations I know of, Optuna doesn't handle noisy evaluations. Maybe I'll open an issue about it on their GitHub page.

One way to deal with this is to keep track of the best evaluated points and return their mean instead of simply the single best point. I already mentioned this at https://github.com/thomasahle/fastchess/issues/24 back then ...
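The mean-of-best-points idea can be sketched in plain Python as follows (the trial data here is invented for illustration, not taken from the runs above):

```python
# Instead of reporting only the single best trial, average the parameter
# vectors of the top-k trials to smooth out noisy match results.
trials = [
    {'value': 0.57, 'params': {'PawnValueEn': 114, 'MobilityWeight': 62}},
    {'value': 0.54, 'params': {'PawnValueEn': 118, 'MobilityWeight': 90}},
    {'value': 0.52, 'params': {'PawnValueEn': 98,  'MobilityWeight': 66}},
    {'value': 0.48, 'params': {'PawnValueEn': 102, 'MobilityWeight': 142}},
]

def mean_of_top_k(trials, k):
    """Average each parameter over the k best trials (rounded to int)."""
    top = sorted(trials, key=lambda t: t['value'], reverse=True)[:k]
    return {n: round(sum(t['params'][n] for t in top) / k)
            for n in top[0]['params']}

print(mean_of_top_k(trials, 3))  # {'PawnValueEn': 110, 'MobilityWeight': 73}
```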

fsmosca commented 4 years ago

Tuning

I tuned the KingAttackWeights, but only 4 options, at a fast TC of 2s+50ms with 100 games per trial.

python -u tuner7.py --engine .\engines\stockfish-modern\stockfish.exe --hash 128 --opening-file .\start_opening\ogpt_chess_startpos.epd --games-per-trial 100 --concurrency 7 --plot --study-name sf_eval_kaw5 --fix-base-param --base-time-sec 2 --pgn-output sf_eval_kaw5.pgn
trials: 1000, games_per_trial: 100
input param: OrderedDict([('KingAttackWeights[2]', {'default': 81, 'min': 30, 'max': 130, 'step': 4}), ('KingAttackWeights[3]', {'default': 52, 'min': 10, 'max': 110, 'step': 4}), ('KingAttackWeights[4]', {'default': 44, 'min': 10, 'max': 110, 'step': 4}), ('KingAttackWeights[5]', {'default': 10, 'min': 0, 'max': 60, 'step': 4})])

[I 2020-09-20 20:33:10,339] A new study created in RDB with name: sf_eval_kaw5
Warning, best value from previous trial is not found!
study best value: 0.0
Warning, best param from previous trial is not found!.
study best param: {}

starting trial: 0 ...
suggested param for test engine: {'KingAttackWeights[2]': 70, 'KingAttackWeights[3]': 78, 'KingAttackWeights[4]': 102, 'KingAttackWeights[5]': 12}
param for base engine          : {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init param: {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init value: 0.5
study best param: {}
study best value: 0.0
Actual match result: 0.45, point of view: optimizer suggested values
[I 2020-09-20 20:35:34,609] Trial 0 finished with value: 0.45 and parameters: {'KingAttackWeights[2]': 70, 'KingAttackWeights[3]': 78, 'KingAttackWeights[4]': 102, 'KingAttackWeights[5]': 12}. Best is trial 0 with value: 0.45.

starting trial: 1 ...
suggested param for test engine: {'KingAttackWeights[2]': 38, 'KingAttackWeights[3]': 70, 'KingAttackWeights[4]': 46, 'KingAttackWeights[5]': 32}
param for base engine          : {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init param: {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init value: 0.5
study best param: {'KingAttackWeights[2]': 70, 'KingAttackWeights[3]': 78, 'KingAttackWeights[4]': 102, 'KingAttackWeights[5]': 12}
study best value: 0.45
Actual match result: 0.535, point of view: optimizer suggested values
[I 2020-09-20 20:37:54,483] Trial 1 finished with value: 0.535 and parameters: {'KingAttackWeights[2]': 38, 'KingAttackWeights[3]': 70, 'KingAttackWeights[4]': 46, 'KingAttackWeights[5]': 32}. Best is trial 1 with value: 0.535.

...

starting trial: 10 ...
suggested param for test engine: {'KingAttackWeights[2]': 130, 'KingAttackWeights[3]': 14, 'KingAttackWeights[4]': 18, 'KingAttackWeights[5]': 16}
param for base engine          : {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init param: {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init value: 0.5
study best param: {'KingAttackWeights[2]': 38, 'KingAttackWeights[3]': 70, 'KingAttackWeights[4]': 46, 'KingAttackWeights[5]': 32}
study best value: 0.535
Actual match result: 0.56, point of view: optimizer suggested values
[I 2020-09-20 21:00:01,173] Trial 10 finished with value: 0.56 and parameters: {'KingAttackWeights[2]': 130, 'KingAttackWeights[3]': 14, 'KingAttackWeights[4]': 18, 'KingAttackWeights[5]': 16}. Best is trial 10 with value: 0.56.

Game test

At trial 10 it scored 0.56 in a 100-game trial, so I checked it with a game test at 5s+100ms. It looks OK to me since the result does not drop after 1000 games. I have limited resources, so I cannot test this further.

Score of sf_kaw5_trial_10 vs sf_default: 234 - 230 - 536  [0.502] 1000
...      sf_kaw5_trial_10 playing White: 125 - 105 - 270  [0.520] 500
...      sf_kaw5_trial_10 playing Black: 109 - 125 - 266  [0.484] 500
...      White vs Black: 250 - 214 - 536  [0.518] 1000
Elo difference: 1.4 +/- 14.7, LOS: 57.4 %, DrawRatio: 53.6 %

Looking forward to your test results.

joergoster commented 4 years ago

My tuning run has finished as well.

python tuner-kingattackweights.py --fix-base-param --engine ./engines/stockfish.exe --opening-file ./start_opening/ogpt_chess_startpos.epd --trials 400 --games-per-trial 20 --concurrency 3 --base-time-sec 2 --inc-time-sec 0.02 --study-name KingAttackWeights2 --plot

The result:

study best param: {'KingAttackWeights[2]': 20, 'KingAttackWeights[3]': 0, 'KingAttackWeights[4]': 0, 'KingAttackWeights[5]': 74}
study best value: 0.775
study best trial number: 217

Not sure I will even validate those values since they look way off.

fsmosca commented 4 years ago

Perhaps this one is too small: --games-per-trial 20

I was using 100.

BTW what parameter range and steps did you use? I used:

input param: OrderedDict([('KingAttackWeights[2]', {'default': 81, 'min': 30, 'max': 130, 'step': 4}), ('KingAttackWeights[3]', {'default': 52, 'min': 10, 'max': 110, 'step': 4}), ('KingAttackWeights[4]', {'default': 44, 'min': 10, 'max': 110, 'step': 4}), ('KingAttackWeights[5]', {'default': 10, 'min': 0, 'max': 60, 'step': 4})])
joergoster commented 4 years ago

This is what I used:

    input_param.update({'KingAttackWeights[2]': {'default': 2, 'min': 0, 'max': 200, 'step': 2}})
    input_param.update({'KingAttackWeights[3]': {'default': 2, 'min': 0, 'max': 200, 'step': 2}})
    input_param.update({'KingAttackWeights[4]': {'default': 2, 'min': 0, 'max': 200, 'step': 2}})
    input_param.update({'KingAttackWeights[5]': {'default': 2, 'min': 0, 'max': 200, 'step': 2}})

And yes, 20 games per trial is very likely simply not enough.
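A rough way to quantify "not enough": for an evenly matched pair (true score 0.5) with draw ratio d, the per-game score variance is (1 - d)/4, so the standard error of an n-game match score is sqrt((1 - d)/(4n)). A sketch, assuming a ~50% draw rate like the matches above:

```python
import math

def score_se(n_games, draw_ratio=0.5):
    """Standard error of an n-game match score for an even pairing."""
    return math.sqrt((1 - draw_ratio) / (4 * n_games))

print(round(score_se(20), 3))   # 0.079: one-sigma noise of ~8% per trial
print(round(score_se(100), 3))  # 0.035: noticeably tighter at 100 games
```

At 20 games per trial, one standard deviation is nearly +/-0.08, which swamps most of the real differences the optimizer is trying to detect.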

joergoster commented 4 years ago

I started another run for the KingAttackWeights, this time with a step size of 4 for each parameter, 200 trials and with 160 games per trial. Will post the result when it is finished.

Btw., do you know this interesting discussion https://github.com/glinscott/fishtest/issues/774?

fsmosca commented 4 years ago

I started another run for the KingAttackWeights, this time with a step size of 4 for each parameter, 200 trials and with 160 games per trial. Will post the result when it is finished.

Ok nice.

Btw., do you know this interesting discussion glinscott/fishtest#774?

No, I just got to know it now. Thanks.

joergoster commented 4 years ago

Tuning finished.

[I 2020-09-22 09:53:20,011] Trial 199 finished with value: 0.512 and parameters: {'KingAttackWeights[2]': 36, 'KingAttackWeights[3]': 40, 'KingAttackWeights[4]': 20, 'KingAttackWeights[5]': 52}. Best is trial 119 with value: 0.556.
Saving plots ...
Done saving plots.
number  value  params_KingAttackWeights[2]  params_KingAttackWeights[3]  params_KingAttackWeights[4]  params_KingAttackWeights[5]  state
  0  0.225   32  172  112  120  COMPLETE
  1  0.362    0  172   88   72  COMPLETE
  2  0.234   28  172  104   96  COMPLETE
  3  0.263   80  136  144  104  COMPLETE
  4  0.519   24   72   60  104  COMPLETE
  5  0.306   88  176  124   32  COMPLETE
  6  0.447  128   88   76   28  COMPLETE
  7  0.362  152   84  184   48  COMPLETE
  8  0.447   60   88    0  148  COMPLETE
  9  0.275    8  128   96  200  COMPLETE
 10  0.350  192   12   28  172  COMPLETE
 11  0.491  136   40   56    8  COMPLETE
 12  0.456  136   36   44    4  COMPLETE
 13  0.356  180   48   52  136  COMPLETE
 14  0.459  112    0    0   80  COMPLETE
 15  0.447  164   52   64   68  COMPLETE
 16  0.472   60   64   20    0  COMPLETE
 17  0.394  112   16   32  148  COMPLETE
 18  0.322  148  108   68  180  COMPLETE
 19  0.506   60   32    8  116  COMPLETE
 20  0.478   40   68   12  120  COMPLETE
 21  0.509   56   32   48   96  COMPLETE
 22  0.494   60   24   40   96  COMPLETE
 23  0.475   16   64    0  116  COMPLETE
 24  0.484   44   24   76   84  COMPLETE
 25  0.509   72    0   16   56  COMPLETE
 26  0.534   76    4   28   56  COMPLETE
 27  0.491   84    0   24   56  COMPLETE
 28  0.459  100    4   32   40  COMPLETE
 29  0.272   24  196  120   16  COMPLETE
 30  0.453   72  120   16   60  COMPLETE
 31  0.512   48   12   52   76  COMPLETE
 32  0.438   40   20   48   80  COMPLETE
 33  0.431    4   52   60  100  COMPLETE
 34  0.500   44   12   88   88  COMPLETE
 35  0.456   24   76   80   72  COMPLETE
 36  0.459   52  104   40  108  COMPLETE
 37  0.463   76    8  140   48  COMPLETE
 38  0.225   96  152  200   24  COMPLETE
 39  0.506   32    0  104   64  COMPLETE
 40  0.531   68   44   32   48  COMPLETE
 41  0.491   68   44   36   40  COMPLETE
 42  0.494   92   60    8   48  COMPLETE
 43  0.503   32   32   52   88  COMPLETE
 44  0.409   52   80   64  128  COMPLETE
 45  0.397   84   92   88   72  COMPLETE
 46  0.472   68   24   24   36  COMPLETE
 47  0.512   16   36   72   92  COMPLETE
 48  0.403   12   56   72  108  COMPLETE
 49  0.456    0   72   84   76  COMPLETE
 50  0.466   20   44   56   24  COMPLETE
 51  0.500   48   36   96   92  COMPLETE
 52  0.503  108   12   28   52  COMPLETE
 53  0.528   32   16   16   64  COMPLETE
 54  0.491   32   24   44   64  COMPLETE
 55  0.512    8   40   68   68  COMPLETE
 56  0.459    8   48   36   68  COMPLETE
 57  0.534   36   32   56   80  COMPLETE
 58  0.425    0   92   60   44  COMPLETE
 59  0.478   36   56    4   32  COMPLETE
 60  0.491   20   32   24  108  COMPLETE
 61  0.438   12   40   68   84  COMPLETE
 62  0.500   28   28   68   60  COMPLETE
 63  0.506   60   16   48   76  COMPLETE
 64  0.503   40    8   56   56  COMPLETE
 65  0.422   52   16   32  100  COMPLETE
 66  0.519   64   44   40   68  COMPLETE
 67  0.503   64   68   20   80  COMPLETE
 68  0.500   80   52   44   92  COMPLETE
 69  0.503   76   44   12   68  COMPLETE
 70  0.503   48   20   40   76  COMPLETE
 71  0.428   24   36   76   84  COMPLETE
 72  0.509   16   40   60   60  COMPLETE
 73  0.469   40   20   48   52  COMPLETE
 74  0.438   64    8   36   92  COMPLETE
 75  0.506    8   60   80   64  COMPLETE
 76  0.466   56   28   52  100  COMPLETE
 77  0.509   44    4   16  112  COMPLETE
 78  0.512   88   28   28  128  COMPLETE
 79  0.484   36   48   36   44  COMPLETE
 80  0.459   92   28   28  140  COMPLETE
 81  0.475   80   12   20  164  COMPLETE
 82  0.428  100   36   44  124  COMPLETE
 83  0.509   68    4   32   72  COMPLETE
 84  0.431  124   40   96   52  COMPLETE
 85  0.506   84   20   52  132  COMPLETE
 86  0.412   24   64   64   68  COMPLETE
 87  0.466   28   52   72   80  COMPLETE
 88  0.453   16   16   60   88  COMPLETE
 89  0.434   48   44  112   60  COMPLETE
 90  0.397   56   36   72  104  COMPLETE
 91  0.356   76  116   28  152  COMPLETE
 92  0.438    4   32   40   76  COMPLETE
 93  0.456   36   28    8  120  COMPLETE
 94  0.516   88   56   56   72  COMPLETE
 95  0.459   88   56   56   64  COMPLETE
 96  0.463   20   80   64   56  COMPLETE
 97  0.463  104   48   40   96  COMPLETE
 98  0.478   92   68   24   72  COMPLETE
 99  0.528   72   56   48   44  COMPLETE
100  0.469   72   60   48   36  COMPLETE
101  0.522   68   24   56   44  COMPLETE
102  0.506   64   48   56   40  COMPLETE
103  0.484   56   56   52   48  COMPLETE
104  0.463   80   44   68   32  COMPLETE
105  0.541   72   40   80   24  COMPLETE
106  0.494   68   40   80   16  COMPLETE
107  0.500   72   72   44   16  COMPLETE
108  0.497   76   52   64   44  COMPLETE
109  0.359   84   64  168   24  COMPLETE
110  0.494   88   24   32   28  COMPLETE
111  0.463   60   28   48   52  COMPLETE
112  0.497   68   32   80    4  COMPLETE
113  0.472   72    8   12   60  COMPLETE
114  0.506   52   16   56   84  COMPLETE
115  0.494   96   24   20   48  COMPLETE
116  0.547   80   36   36   36  COMPLETE
117  0.531   80   36   36   32  COMPLETE
118  0.512   80   44   36   36  COMPLETE
119  0.556   96   36   40   28  COMPLETE
120  0.494  112   36   40   28  COMPLETE
121  0.512   84   48   44   12  COMPLETE
122  0.506   96   40   32   20  COMPLETE
123  0.469   64   52   36   32  COMPLETE
124  0.444   76   32   24   40  COMPLETE
125  0.472   92   60   52   44  COMPLETE
126  0.472   80   32   16   20  COMPLETE
127  0.481  104   36   44   28  COMPLETE
128  0.506   72   44   28   36  COMPLETE
129  0.434   64   20   60   56  COMPLETE
130  0.519   84   56   48   48  COMPLETE
131  0.503   88   56   48   48  COMPLETE
132  0.509   76   52   40   40  COMPLETE
133  0.500   84   72   36   32  COMPLETE
134  0.472  100   48   52   44  COMPLETE
135  0.466   68   40   32   56  COMPLETE
136  0.444  200   24   44   36  COMPLETE
137  0.472   80   88   56   52  COMPLETE
138  0.522   88   64   60   24  COMPLETE
139  0.478   72   64   40   20  COMPLETE
140  0.463   92   32   48   24  COMPLETE
141  0.466   88   76   60   64  COMPLETE
142  0.519   96   56   60   12  COMPLETE
143  0.503  104   64   64    4  COMPLETE
144  0.512   84   60   24   12  COMPLETE
145  0.388  108  140   88   12  COMPLETE
146  0.487   96   44   32   28  COMPLETE
147  0.500   76   48   52   40  COMPLETE
148  0.506  116   36   68    0  COMPLETE
149  0.500   96    0   44   32  COMPLETE
150  0.481   32   68   72   24  COMPLETE
151  0.487   80   56   60   48  COMPLETE
152  0.472   92   56   56   68  COMPLETE
153  0.316   68  200   48   20  COMPLETE
154  0.441   84   40   52  196  COMPLETE
155  0.547   76   28   40   60  COMPLETE
156  0.544   72   28   36   56  COMPLETE
157  0.519   60   28   36   60  COMPLETE
158  0.503   60   28   28   56  COMPLETE
159  0.487   76   16   40   52  COMPLETE
160  0.509   64   20   36   64  COMPLETE
161  0.456   72   24   36   60  COMPLETE
162  0.481   64   36   28   44  COMPLETE
163  0.481   72   32   20   60  COMPLETE
164  0.466   68   24   32   52  COMPLETE
165  0.528   56   28   40   68  COMPLETE
166  0.481   56   28   40   64  COMPLETE
167  0.500   76   12   44   36  COMPLETE
168  0.444   80   40   48   48  COMPLETE
169  0.475   60   20   36   44  COMPLETE
170  0.491   28   36   24   72  COMPLETE
171  0.453   64   32   40   56  COMPLETE
172  0.509   72   28   32   28  COMPLETE
173  0.472   52   44   36   68  COMPLETE
174  0.537   60   28   44   56  COMPLETE
175  0.547   56   28   44   60  COMPLETE
176  0.472   44   24   44   52  COMPLETE
177  0.553   60   20   28   60  COMPLETE
178  0.487   56   20   28   64  COMPLETE
179  0.487   52   12   40   56  COMPLETE
180  0.478   68   16   44   48  COMPLETE
181  0.506   60   28   32   60  COMPLETE
182  0.544   48   32   40   68  COMPLETE
183  0.466   48   36   12   68  COMPLETE
184  0.544   40   32   48   40  COMPLETE
185  0.534   36   32   52   40  COMPLETE
186  0.528   40   32   48   32  COMPLETE
187  0.506   44   32   44   36  COMPLETE
188  0.509   36   32   48   40  COMPLETE
189  0.509   48   28   36   36  COMPLETE
190  0.463   40   36   24   32  COMPLETE
191  0.509   40   24   52   44  COMPLETE
192  0.503   32   32   48   28  COMPLETE
193  0.484   36   40   40   24  COMPLETE
194  0.497   44   20    4   40  COMPLETE
195  0.547   36   36   32   56  COMPLETE
196  0.466   40   32   32   52  COMPLETE
197  0.494   28   28   44   60  COMPLETE
198  0.509   40   36   32   64  COMPLETE
199  0.512   36   40   20   52  COMPLETE

study best param: {'KingAttackWeights[2]': 96, 'KingAttackWeights[3]': 36, 'KingAttackWeights[4]': 40, 'KingAttackWeights[5]': 28}
study best value: 0.556
study best trial number: 119

Submitted a test on fishtest https://tests.stockfishchess.org/tests/view/5f69bae9938ba4977fe04f41

Edit: Mean of points with a score >54%: 62, 33, 45, 49

fsmosca commented 4 years ago

Thanks for the test. What time control did you use in this optimization?

I can implement the mean approach, but it looks like TPE already takes the history of trials into account. There are also interesting parameters under the TPE sampler. I will look into it later.

BTW, this is the description of the optimization method used by Optuna that I found in the code. TPE is the default, so it would be interesting to try the other Optuna samplers.

        """Optimize an objective function.

        Optimization is done by choosing a suitable set of hyperparameter values from a given
        range. Uses a sampler which implements the task of value suggestion based on a specified
        distribution. The sampler is specified in :func:`~optuna.study.create_study` and the
        default choice for the sampler is TPE.
        See also :class:`~optuna.samplers.TPESampler` for more details on 'TPE'.

        Example:

            .. testcode::

                import optuna

                def objective(trial):
                    x = trial.suggest_uniform("x", -1, 1)
                    return x ** 2

                study = optuna.create_study()
                study.optimize(objective, n_trials=3)

        Args:
            func:
                A callable that implements objective function.
            n_trials:
                The number of trials. If this argument is set to :obj:`None`, there is no
                limitation on the number of trials. If :obj:`timeout` is also set to :obj:`None`,
                the study continues to create trials until it receives a termination signal such
                as Ctrl+C or SIGTERM.
            timeout:
                Stop study after the given number of second(s). If this argument is set to
                :obj:`None`, the study is executed without time limitation. If :obj:`n_trials` is
                also set to :obj:`None`, the study continues to create trials until it receives a
                termination signal such as Ctrl+C or SIGTERM.
            n_jobs:
                The number of parallel jobs. If this argument is set to :obj:`-1`, the number is
                set to CPU count.
            catch:
                A study continues to run even when a trial raises one of the exceptions specified
                in this argument. Default is an empty tuple, i.e. the study will stop for any
                exception except for :class:`~optuna.exceptions.TrialPruned`.
            callbacks:
                List of callback functions that are invoked at the end of each trial. Each function
                must accept two parameters with the following types in this order:
                :class:`~optuna.study.Study` and :class:`~optuna.FrozenTrial`.
            gc_after_trial:
                Flag to determine whether to automatically run garbage collection after each trial.
                Set to :obj:`True` to run the garbage collection, :obj:`False` otherwise.
                When it runs, it runs a full collection by internally calling :func:`gc.collect`.
                If you see an increase in memory consumption over several trials, try setting this
                flag to :obj:`True`.

                .. seealso::

                    :ref:`out-of-memory-gc-collect`

            show_progress_bar:
                Flag to show progress bars or not. To disable progress bar, set this ``False``.
                Currently, progress bar is experimental feature and disabled
                when ``n_jobs`` :math:`\\ne 1`.
        """

Some references on Optuna and TPE:

joergoster commented 4 years ago

Time control was the same as in the previous study, 2+0.02. Thank you for the links.

I still find the idea of re-evaluating the best point so far every n-th iteration very interesting.
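That re-evaluation idea can be sketched in plain Python (the noisy objective and the every-5th-trial schedule below are invented for illustration): every n-th trial the incumbent best point is replayed and its running mean updated, so a single lucky match result cannot dominate.

```python
import random

def noisy_match(x, rng):
    """Simulated noisy match score with its true peak at x = 100."""
    true = 0.5 + (50 - abs(x - 100)) / 500.0
    return min(1.0, max(0.0, true + rng.gauss(0.0, 0.05)))

rng = random.Random(7)
scores = {}  # candidate value -> all scores observed for it
mean = lambda xs: sum(xs) / len(xs)

best = None
for trial in range(40):
    x = rng.randrange(50, 151, 2)  # a fresh random candidate
    scores.setdefault(x, []).append(noisy_match(x, rng))
    if best is not None and trial % 5 == 4:
        # every 5th trial: re-run the incumbent and fold the new score
        # into its running mean, damping one-off lucky results
        scores[best].append(noisy_match(best, rng))
    best = max(scores, key=lambda v: mean(scores[v]))

print(best, round(mean(scores[best]), 3))
```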