Base engine with default values

joergoster commented 4 years ago

@fsmosca Hi, can you please make a version with a non-changing base engine, always initialized with the default values? I would like to give this a try instead of the current 'moving target'.

fsmosca commented 4 years ago

Yes I will do that.
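
A minimal sketch of the difference between a fixed base engine and the "moving target", assuming a hypothetical helper `select_base_param` (the name and logic are illustrative and not taken from tuner.py):

```python
def select_base_param(default_param, best_param, fix_base_param):
    """Return the parameters the base engine uses in the next trial.

    Hypothetical helper for illustration only; the real tuner may differ.
    """
    if fix_base_param or not best_param:
        # Fixed base: always the defaults, so every trial is measured
        # against the same opponent.
        return dict(default_param)
    # Moving target: the base engine adopts the best parameters found so far.
    return dict(best_param)


defaults = {'PawnValueEn': 92, 'MobilityWeight': 100}
best_so_far = {'PawnValueEn': 114, 'MobilityWeight': 62}

print(select_base_param(defaults, best_so_far, fix_base_param=True))
print(select_base_param(defaults, best_so_far, fix_base_param=False))
```

With the flag set, the base parameters never change between trials, which is what the "param for base engine" lines in the session below show.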

Sample study session:

Optuna Game Parameter Tuner v0.3.0

python -u tuner.py --engine .\engines\deuterium\deuterium.exe --hash 128 --opening-file .\start_opening\ogpt_chess_startpos.epd --games-per-trial 50 --concurrency 6 --plot --study-name pv10 --fix-base-param --base-time-sec 10
trials: 1000, games_per_trial: 50
input param: OrderedDict([('PawnValueEn', {'default': 92, 'min': 90, 'max': 120, 'step': 2}), ('BishopValueOp', {'default': 350, 'min': 300, 'max': 360, 'step': 3}), ('BishopValueEn', {'default': 350, 'min': 300, 'max': 360, 'step': 3}), ('RookValueEn', {'default': 525, 'min': 490, 'max': 550, 'step': 5}), ('QueenValueOp', {'default': 985, 'min': 975, 'max': 1050, 'step': 5}), ('MobilityWeight', {'default': 100, 'min': 50, 'max': 150, 'step': 4})])

[I 2020-09-20 11:09:19,142] A new study created in RDB with name: pv10
Warning, best value from previous trial is not found!
study best value: 0.0
Warning, best param from previous trial is not found!.
study best param: {}

starting trial: 0 ...
suggested param for test engine: {'PawnValueEn': 102, 'BishopValueOp': 333, 'BishopValueEn': 342, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 142}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {}
study best value: 0.0
Actual match result: 0.48, point of view: optimizer suggested values
[I 2020-09-20 11:12:11,851] Trial 0 finished with value: 0.48 and parameters: {'PawnValueEn': 102, 'BishopValueOp': 333, 'BishopValueEn': 342, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 142}. Best is trial 0 with value: 0.48.

starting trial: 1 ...
suggested param for test engine: {'PawnValueEn': 118, 'BishopValueOp': 360, 'BishopValueEn': 357, 'RookValueEn': 505, 'QueenValueOp': 990, 'MobilityWeight': 90}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'PawnValueEn': 102, 'BishopValueOp': 333, 'BishopValueEn': 342, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 142}
study best value: 0.48
Actual match result: 0.54, point of view: optimizer suggested values
[I 2020-09-20 11:15:19,208] Trial 1 finished with value: 0.54 and parameters: {'PawnValueEn': 118, 'BishopValueOp': 360, 'BishopValueEn': 357, 'RookValueEn': 505, 'QueenValueOp': 990, 'MobilityWeight': 90}. Best is trial 1 with value: 0.54.

starting trial: 2 ...
suggested param for test engine: {'PawnValueEn': 98, 'BishopValueOp': 318, 'BishopValueEn': 327, 'RookValueEn': 495, 'QueenValueOp': 990, 'MobilityWeight': 66}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'PawnValueEn': 118, 'BishopValueOp': 360, 'BishopValueEn': 357, 'RookValueEn': 505, 'QueenValueOp': 990, 'MobilityWeight': 90}
study best value: 0.54
Actual match result: 0.52, point of view: optimizer suggested values
[I 2020-09-20 11:18:15,684] Trial 2 finished with value: 0.52 and parameters: {'PawnValueEn': 98, 'BishopValueOp': 318, 'BishopValueEn': 327, 'RookValueEn': 495, 'QueenValueOp': 990, 'MobilityWeight': 66}. Best is trial 1 with value: 0.54.

starting trial: 3 ...

...

starting trial: 9 ...
suggested param for test engine: {'PawnValueEn': 114, 'BishopValueOp': 357, 'BishopValueEn': 339, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 62}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'PawnValueEn': 118, 'BishopValueOp': 360, 'BishopValueEn': 357, 'RookValueEn': 505, 'QueenValueOp': 990, 'MobilityWeight': 90}
study best value: 0.54
Actual match result: 0.57, point of view: optimizer suggested values
[I 2020-09-20 11:39:47,624] Trial 9 finished with value: 0.57 and parameters: {'PawnValueEn': 114, 'BishopValueOp': 357, 'BishopValueEn': 339, 'RookValueEn': 520, 'QueenValueOp': 1000, 'MobilityWeight': 62}. Best is trial 9 with value: 0.57.
Saving plots ...
Done saving plots.
 number  value  params_BishopValueEn  params_BishopValueOp  params_MobilityWeight  params_PawnValueEn  params_QueenValueOp  params_RookValueEn     state
      0   0.48                   342                   333                    142                 102                 1000                 520  COMPLETE
      1   0.54                   357                   360                     90                 118                  990                 505  COMPLETE
      2   0.52                   327                   318                     66                  98                  990                 495  COMPLETE
      3   0.37                   315                   321                    134                 110                 1045                 495  COMPLETE
      4   0.44                   315                   330                    146                 112                  985                 520  COMPLETE
      5   0.42                   315                   330                    134                 112                  980                 545  COMPLETE
      6   0.50                   348                   333                    118                 120                 1020                 515  COMPLETE
      7   0.52                   333                   300                     54                 104                  995                 510  COMPLETE
      8   0.46                   360                   309                    122                 110                 1015                 490  COMPLETE
      9   0.57                   339                   357                     62                 114                 1000                 520  COMPLETE

study best param: {'BishopValueEn': 339, 'BishopValueOp': 357, 'MobilityWeight': 62, 'PawnValueEn': 114, 'QueenValueOp': 1000, 'RookValueEn': 520}
study best value: 0.57
study best trial number: 9
[I 2020-09-20 11:39:55,027] Using an existing study with name 'pv10' instead of creating a new one.
study best value: 0.57
study best param: {'BishopValueEn': 339, 'BishopValueOp': 357, 'MobilityWeight': 62, 'PawnValueEn': 114, 'QueenValueOp': 1000, 'RookValueEn': 520}

starting trial: 10 ...
suggested param for test engine: {'PawnValueEn': 94, 'BishopValueOp': 357, 'BishopValueEn': 300, 'RookValueEn': 535, 'QueenValueOp': 1030, 'MobilityWeight': 82}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'BishopValueEn': 339, 'BishopValueOp': 357, 'MobilityWeight': 62, 'PawnValueEn': 114, 'QueenValueOp': 1000, 'RookValueEn': 520}
study best value: 0.57
Actual match result: 0.49, point of view: optimizer suggested values
[I 2020-09-20 11:43:02,186] Trial 10 finished with value: 0.49 and parameters: {'PawnValueEn': 94, 'BishopValueOp': 357, 'BishopValueEn': 300, 'RookValueEn': 535, 'QueenValueOp': 1030, 'MobilityWeight': 82}. Best is trial 9 with value: 0.57.

starting trial: 11 ...

...

starting trial: 24 ...
suggested param for test engine: {'PawnValueEn': 114, 'BishopValueOp': 354, 'BishopValueEn': 348, 'RookValueEn': 525, 'QueenValueOp': 1005, 'MobilityWeight': 78}
param for base engine          : {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init param: {'PawnValueEn': 92, 'BishopValueOp': 350, 'BishopValueEn': 350, 'RookValueEn': 525, 'QueenValueOp': 985, 'MobilityWeight': 100}
init value: 0.5
study best param: {'BishopValueEn': 339, 'BishopValueOp': 357, 'MobilityWeight': 62, 'PawnValueEn': 114, 'QueenValueOp': 1000, 'RookValueEn': 520}
study best value: 0.57
Actual match result: 0.62, point of view: optimizer suggested values
[I 2020-09-20 12:24:51,024] Trial 24 finished with value: 0.62 and parameters: {'PawnValueEn': 114, 'BishopValueOp': 354, 'BishopValueEn': 348, 'RookValueEn': 525, 'QueenValueOp': 1005, 'MobilityWeight': 78}. Best is trial 24 with value: 0.62.

Game test

The optimizer values won by a small margin. TC=15s+100ms.

Score of deuterium_pv10_trial_24 vs deuterium_default: 103 - 78 - 219  [0.531] 400
...      deuterium_pv10_trial_24 playing White: 54 - 37 - 109  [0.542] 200
...      deuterium_pv10_trial_24 playing Black: 49 - 41 - 110  [0.520] 200
...      White vs Black: 95 - 86 - 219  [0.511] 400
Elo difference: 21.7 +/- 22.9, LOS: 96.8 %, DrawRatio: 54.8 %
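
As a cross-check of the summary line above, here is a small sketch that derives the score and the Elo difference from the win/loss/draw counts using the standard logistic Elo formula. The function name `score_and_elo` is made up for this example, and the error margin and LOS are not computed here.

```python
import math


def score_and_elo(wins, losses, draws):
    """Return (score, elo_difference) from the first engine's point of view."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Logistic Elo model: score = 1 / (1 + 10 ** (-elo / 400)).
    # Only defined for 0 < score < 1.
    elo = -400.0 * math.log10(1.0 / score - 1.0)
    return score, elo


score, elo = score_and_elo(wins=103, losses=78, draws=219)
print(score, round(elo, 1))
```

For the 103 - 78 - 219 result this gives a score of about 0.531 and an Elo difference of about 21.7, in line with the reported figures.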

joergoster commented 4 years ago

Thank you!

So it seems this already worked for you. I will now start a tuning run for SF's KingAttackWeights. I will let you know the result.

There is one problem remaining, though. Like all Bayesian optimization (BO) implementations I know, Optuna doesn't handle noisy evaluations. Maybe I'll open an issue about it on their GitHub page.

One way to deal with this is to keep track of the best evaluated points and return the mean of them instead of simply the best point. I already mentioned it here https://github.com/thomasahle/fastchess/issues/24 back then ...
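
A rough sketch of that idea, assuming the trial history is available as a list of `(value, params)` pairs. The helper `mean_of_best` and the choice of `top_k` are assumptions made for illustration, not part of Optuna or the tuner. The sample data uses two of the six parameters from the first four rows of the pv10 table above.

```python
def mean_of_best(trials, top_k=3):
    """Average each parameter over the top_k trials with the highest value.

    trials: list of (value, params_dict) pairs.
    """
    if not trials:
        raise ValueError("no trials to average")
    best = sorted(trials, key=lambda t: t[0], reverse=True)[:top_k]
    names = best[0][1].keys()
    return {name: sum(params[name] for _, params in best) / len(best)
            for name in names}


# Subset of the pv10 study (only two parameters shown for brevity).
history = [
    (0.48, {'PawnValueEn': 102, 'MobilityWeight': 142}),
    (0.54, {'PawnValueEn': 118, 'MobilityWeight': 90}),
    (0.52, {'PawnValueEn': 98, 'MobilityWeight': 66}),
    (0.37, {'PawnValueEn': 110, 'MobilityWeight': 134}),
]

print(mean_of_best(history, top_k=2))
```

Averaging trades a single lucky result for a point supported by several good trials. The averaged values may fall off the step grid, so they would need rounding before use.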

fsmosca commented 4 years ago

Tuning

I tuned the KingAttackWeights, but only 4 of the options, at a fast TC of 2s+50ms and 100 games per trial.

python -u tuner7.py --engine .\engines\stockfish-modern\stockfish.exe --hash 128 --opening-file .\start_opening\ogpt_chess_startpos.epd --games-per-trial 100 --concurrency 7 --plot --study-name sf_eval_kaw5 --fix-base-param --base-time-sec 2 --pgn-output sf_eval_kaw5.pgn
trials: 1000, games_per_trial: 100
input param: OrderedDict([('KingAttackWeights[2]', {'default': 81, 'min': 30, 'max': 130, 'step': 4}), ('KingAttackWeights[3]', {'default': 52, 'min': 10, 'max': 110, 'step': 4}), ('KingAttackWeights[4]', {'default': 44, 'min': 10, 'max': 110, 'step': 4}), ('KingAttackWeights[5]', {'default': 10, 'min': 0, 'max': 60, 'step': 4})])

[I 2020-09-20 20:33:10,339] A new study created in RDB with name: sf_eval_kaw5
Warning, best value from previous trial is not found!
study best value: 0.0
Warning, best param from previous trial is not found!.
study best param: {}

starting trial: 0 ...
suggested param for test engine: {'KingAttackWeights[2]': 70, 'KingAttackWeights[3]': 78, 'KingAttackWeights[4]': 102, 'KingAttackWeights[5]': 12}
param for base engine          : {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init param: {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init value: 0.5
study best param: {}
study best value: 0.0
Actual match result: 0.45, point of view: optimizer suggested values
[I 2020-09-20 20:35:34,609] Trial 0 finished with value: 0.45 and parameters: {'KingAttackWeights[2]': 70, 'KingAttackWeights[3]': 78, 'KingAttackWeights[4]': 102, 'KingAttackWeights[5]': 12}. Best is trial 0 with value: 0.45.

starting trial: 1 ...
suggested param for test engine: {'KingAttackWeights[2]': 38, 'KingAttackWeights[3]': 70, 'KingAttackWeights[4]': 46, 'KingAttackWeights[5]': 32}
param for base engine          : {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init param: {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init value: 0.5
study best param: {'KingAttackWeights[2]': 70, 'KingAttackWeights[3]': 78, 'KingAttackWeights[4]': 102, 'KingAttackWeights[5]': 12}
study best value: 0.45
Actual match result: 0.535, point of view: optimizer suggested values
[I 2020-09-20 20:37:54,483] Trial 1 finished with value: 0.535 and parameters: {'KingAttackWeights[2]': 38, 'KingAttackWeights[3]': 70, 'KingAttackWeights[4]': 46, 'KingAttackWeights[5]': 32}. Best is trial 1 with value: 0.535.

...

starting trial: 10 ...
suggested param for test engine: {'KingAttackWeights[2]': 130, 'KingAttackWeights[3]': 14, 'KingAttackWeights[4]': 18, 'KingAttackWeights[5]': 16}
param for base engine          : {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init param: {'KingAttackWeights[2]': 81, 'KingAttackWeights[3]': 52, 'KingAttackWeights[4]': 44, 'KingAttackWeights[5]': 10}
init value: 0.5
study best param: {'KingAttackWeights[2]': 38, 'KingAttackWeights[3]': 70, 'KingAttackWeights[4]': 46, 'KingAttackWeights[5]': 32}
study best value: 0.535
Actual match result: 0.56, point of view: optimizer suggested values
[I 2020-09-20 21:00:01,173] Trial 10 finished with value: 0.56 and parameters: {'KingAttackWeights[2]': 130, 'KingAttackWeights[3]': 14, 'KingAttackWeights[4]': 18, 'KingAttackWeights[5]': 16}. Best is trial 10 with value: 0.56.

Game test

At trial 10 it scored 0.56 in a 100-game trial, so I tested those values in a game test at 5s+100ms. It looks OK to me: after 1000 games it is not behind the default. I have limited resources, so I cannot test this further.

Score of sf_kaw5_trial_10 vs sf_default: 234 - 230 - 536  [0.502] 1000
...      sf_kaw5_trial_10 playing White: 125 - 105 - 270  [0.520] 500
...      sf_kaw5_trial_10 playing Black: 109 - 125 - 266  [0.484] 500
...      White vs Black: 250 - 214 - 536  [0.518] 1000
Elo difference: 1.4 +/- 14.7, LOS: 57.4 %, DrawRatio: 53.6 %

Looking forward to your test results.

joergoster commented 4 years ago

My tuning try has finished as well.

python tuner-kingattackweights.py --fix-base-param --engine ./engines/stockfish.exe --opening-file ./start_opening/ogpt_chess_startpos.epd --trials 400 --games-per-trial 20 --concurrency 3 --base-time-sec 2 --inc-time-sec 0.02 --study-name KingAttackWeights2 --plot

The result:

study best param: {'KingAttackWeights[2]': 20, 'KingAttackWeights[3]': 0, 'KingAttackWeights[4]': 0, 'KingAttackWeights[5]': 74}
study best value: 0.775
study best trial number: 217

Not sure I will even validate those values since they look way off.

fsmosca commented 4 years ago

Perhaps this one is too small: --games-per-trial 20

I was using 100.
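
To put numbers on this, here is a back-of-the-envelope sketch of the standard error of a match score. It assumes independent games, evenly matched engines, and a draw ratio of about 54% (roughly what the game tests in this thread report). The function `score_std_error` is an illustrative name.

```python
import math


def score_std_error(games, draw_ratio=0.54, score=0.5):
    """Standard error of the mean score over `games` independent games.

    Each game yields 1 (win), 0.5 (draw) or 0 (loss).
    """
    win = score - draw_ratio / 2.0
    if win < 0 or win + draw_ratio > 1:
        raise ValueError("inconsistent score and draw ratio")
    # Var[x] = E[x^2] - E[x]^2 with E[x^2] = win * 1 + draw_ratio * 0.25
    variance = win + 0.25 * draw_ratio - score ** 2
    return math.sqrt(variance / games)


for n in (20, 100):
    print(n, round(score_std_error(n), 3))
```

Under these assumptions one standard error is about 0.076 for 20 games and about 0.034 for 100 games, so a 20-game trial can easily land far from 0.5 through noise alone.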

BTW what parameter range and steps did you use? I used:

input param: OrderedDict([('KingAttackWeights[2]', {'default': 81, 'min': 30, 'max': 130, 'step': 4}), ('KingAttackWeights[3]', {'default': 52, 'min': 10, 'max': 110, 'step': 4}), ('KingAttackWeights[4]', {'default': 44, 'min': 10, 'max': 110, 'step': 4}), ('KingAttackWeights[5]', {'default': 10, 'min': 0, 'max': 60, 'step': 4})])

joergoster commented 4 years ago

This is what I used:

    input_param.update({'KingAttackWeights[2]': {'default': 2, 'min': 0, 'max': 200, 'step': 2}})
    input_param.update({'KingAttackWeights[3]': {'default': 2, 'min': 0, 'max': 200, 'step': 2}})
    input_param.update({'KingAttackWeights[4]': {'default': 2, 'min': 0, 'max': 200, 'step': 2}})
    input_param.update({'KingAttackWeights[5]': {'default': 2, 'min': 0, 'max': 200, 'step': 2}})
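
For a rough comparison of the two setups, the sketch below counts how many distinct points each min/max/step grid contains. `grid_size` is a hypothetical helper, and the count is only a measure of search-space size, since the sampler does not enumerate the grid.

```python
def grid_size(input_param):
    """Number of distinct parameter combinations on the min/max/step grid."""
    size = 1
    for spec in input_param.values():
        size *= (spec['max'] - spec['min']) // spec['step'] + 1
    return size


# Ranges posted by fsmosca.
narrow = {
    'KingAttackWeights[2]': {'default': 81, 'min': 30, 'max': 130, 'step': 4},
    'KingAttackWeights[3]': {'default': 52, 'min': 10, 'max': 110, 'step': 4},
    'KingAttackWeights[4]': {'default': 44, 'min': 10, 'max': 110, 'step': 4},
    'KingAttackWeights[5]': {'default': 10, 'min': 0, 'max': 60, 'step': 4},
}

# Ranges posted by joergoster: the same spec for all four weights.
wide = {
    'KingAttackWeights[%d]' % i: {'default': 2, 'min': 0, 'max': 200, 'step': 2}
    for i in range(2, 6)
}

print(grid_size(narrow), grid_size(wide))
```

The narrow grid has 281,216 points and the wide one 104,060,401, so 400 trials cover a far smaller fraction of the wide search space.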

And yes, 20 games per trial is very likely simply not enough.

joergoster commented 4 years ago

I started another run for the KingAttackWeights, this time with a step size of 4 for each parameter, 200 trials and with 160 games per trial. Will post the result when it is finished.

Btw., do you know this interesting discussion https://github.com/glinscott/fishtest/issues/774?

fsmosca commented 4 years ago

> I started another run for the KingAttackWeights, this time with a step size of 4 for each parameter, 200 trials and with 160 games per trial. Will post the result when it is finished.

Ok nice.

> Btw., do you know this interesting discussion glinscott/fishtest#774?

No, I only learned of it just now. Thanks.

joergoster commented 4 years ago

Tuning finished.

[I 2020-09-22 09:53:20,011] Trial Saving plots ... Done saving plots. number value params_KingAttackWeights[2] 0 0.225 32 1 0.362 0 2 0.234 28 3 0.263 80 4 0.519 24 5 0.306 88 6 0.447 128 7 0.362 152 8 0.447 60 9 0.275 8 10 0.350 192 11 0.491 136 12 0.456 136 13 0.356 180 14 0.459 112 15 0.447 164 16 0.472 60 17 0.394 112 18 0.322 148 19 0.506 60 20 0.478 40 21 0.509 56 22 0.494 60 23 0.475 16 24 0.484 44 25 0.509 72 26 0.534 76 27 0.491 84 28 0.459 100 29 0.272 24 30 0.453 72 31 0.512 48 32 0.438 40 33 0.431 4 34 0.500 44 35 0.456 24 36 0.459 52 37 0.463 76 38 0.225 96 39 0.506 32 40 0.531 68 41 0.491 68 42 0.494 92 43 0.503 32 44 0.409 52 45 0.397 84 46 0.472 68 47 0.512 16 48 0.403 12 49 0.456 0 50 0.466 20 51 0.500 48 52 0.503 108 53 0.528 32 54 0.491 32 55 0.512 8 56 0.459 8 57 0.534 36 58 0.425 0 59 0.478 36 60 0.491 20 61 0.438 12 62 0.500 28 63 0.506 60 64 0.503 40 65 0.422 52 66 0.519 64 67 0.503 64 68 0.500 80 69 0.503 76 70 0.503 48 71 0.428 24 72 0.509 16 73 0.469 40 74 0.438 64 75 0.506 8 76 0.466 56 77 0.509 44 78 0.512 88 79 0.484 36 80 0.459 92 81 0.475 80 82 0.428 100 83 0.509 68 84 0.431 124 85 0.506 84 86 0.412 24 87 0.466 28 88 0.453 16 89 0.434 48 90 0.397 56 91 0.356 76 92 0.438 4 93 0.456 36 94 0.516 88 95 0.459 88 96 0.463 20 97 0.463 104 98 0.478 92 99 0.528 72 100 0.469 72 101 0.522 68 102 0.506 64 103 0.484 56 104 0.463 80 105 0.541 72 106 0.494 68 107 0.500 72 108 0.497 76 109 0.359 84 110 0.494 88 111 0.463 60 112 0.497 68 113 0.472 72 114 0.506 52 115 0.494 96 116 0.547 80 117 0.531 80 118 0.512 80 119 0.556 96 120 0.494 112 121 0.512 84 122 0.506 96 123 0.469 64 124 0.444 76 125 0.472 92 126 0.472 80 127 0.481 104 128 0.506 72 129 0.434 64 130 0.519 84 131 0.503 88 132 0.509 76 133 0.500 84 134 0.472 100 135 0.466 68 136 0.444 200 137 0.472 80 138 0.522 88 139 0.478 72 140 0.463 92 141 0.466 88 142 0.519 96 143 0.503 104 144 0.512 84 145 0.388 108 146 0.487 96 147 0.500 76 148 0.506 116 149 0.500 96 150 0.481 32 151 0.487 80 152 0.472 
92 153 0.316 68 154 0.441 84 155 0.547 76 156 0.544 72 157 0.519 60 158 0.503 60 159 0.487 76 160 0.509 64 161 0.456 72 162 0.481 64 163 0.481 72 164 0.466 68 165 0.528 56 166 0.481 56 167 0.500 76 168 0.444 80 169 0.475 60 170 0.491 28 171 0.453 64 172 0.509 72 173 0.472 52 174 0.537 60 175 0.547 56 176 0.472 44 177 0.553 60 178 0.487 56 179 0.487 52 180 0.478 68 181 0.506 60 182 0.544 48 183 0.466 48 184 0.544 40 185 0.534 36 186 0.528 40 187 0.506 44 188 0.509 36 189 0.509 48 190 0.463 40 191 0.509 40 192 0.503 32 193 0.484 36 194 0.497 44 195 0.547 36 196 0.466 40 197 0.494 28 198 0.509 40 199 0.512 36 199 finished with value: 0.512 and parameters: {'KingAttackWeights[2]': 36, 'KingAttackWeights[3]': 40, 'KingAttackWeights[4]': 20, 'KingAttackWeights[5]': 52}. Best is trial 119 with value: 0.556. params_KingAttackWeights[3] params_KingAttackWeights[4] params_KingAttackWeights[5] state 172 112 120 COMPLETE 172 88 72 COMPLETE 172 104 96 COMPLETE 136 144 104 COMPLETE 72 60 104 COMPLETE 176 124 32 COMPLETE 88 76 28 COMPLETE 84 184 48 COMPLETE 88 0 148 COMPLETE 128 96 200 COMPLETE 12 28 172 COMPLETE 40 56 8 COMPLETE 36 44 4 COMPLETE 48 52 136 COMPLETE 0 0 80 COMPLETE 52 64 68 COMPLETE 64 20 0 COMPLETE 16 32 148 COMPLETE 108 68 180 COMPLETE 32 8 116 COMPLETE 68 12 120 COMPLETE 32 48 96 COMPLETE 24 40 96 COMPLETE 64 0 116 COMPLETE 24 76 84 COMPLETE 0 16 56 COMPLETE 4 28 56 COMPLETE 0 24 56 COMPLETE 4 32 40 COMPLETE 196 120 16 COMPLETE 120 16 60 COMPLETE 12 52 76 COMPLETE 20 48 80 COMPLETE 52 60 100 COMPLETE 12 88 88 COMPLETE 76 80 72 COMPLETE 104 40 108 COMPLETE 8 140 48 COMPLETE 152 200 24 COMPLETE 0 104 64 COMPLETE 44 32 48 COMPLETE 44 36 40 COMPLETE 60 8 48 COMPLETE 32 52 88 COMPLETE 80 64 128 COMPLETE 92 88 72 COMPLETE 24 24 36 COMPLETE 36 72 92 COMPLETE 56 72 108 COMPLETE 72 84 76 COMPLETE 44 56 24 COMPLETE 36 96 92 COMPLETE 12 28 52 COMPLETE 16 16 64 COMPLETE 24 44 64 COMPLETE 40 68 68 COMPLETE 48 36 68 COMPLETE 32 56 80 COMPLETE 92 60 44 COMPLETE 56 4 32 
COMPLETE 32 24 108 COMPLETE 40 68 84 COMPLETE 28 68 60 COMPLETE 16 48 76 COMPLETE 8 56 56 COMPLETE 16 32 100 COMPLETE 44 40 68 COMPLETE 68 20 80 COMPLETE 52 44 92 COMPLETE 44 12 68 COMPLETE 20 40 76 COMPLETE 36 76 84 COMPLETE 40 60 60 COMPLETE 20 48 52 COMPLETE 8 36 92 COMPLETE 60 80 64 COMPLETE 28 52 100 COMPLETE 4 16 112 COMPLETE 28 28 128 COMPLETE 48 36 44 COMPLETE 28 28 140 COMPLETE 12 20 164 COMPLETE 36 44 124 COMPLETE 4 32 72 COMPLETE 40 96 52 COMPLETE 20 52 132 COMPLETE 64 64 68 COMPLETE 52 72 80 COMPLETE 16 60 88 COMPLETE 44 112 60 COMPLETE 36 72 104 COMPLETE 116 28 152 COMPLETE 32 40 76 COMPLETE 28 8 120 COMPLETE 56 56 72 COMPLETE 56 56 64 COMPLETE 80 64 56 COMPLETE 48 40 96 COMPLETE 68 24 72 COMPLETE 56 48 44 COMPLETE 60 48 36 COMPLETE 24 56 44 COMPLETE 48 56 40 COMPLETE 56 52 48 COMPLETE 44 68 32 COMPLETE 40 80 24 COMPLETE 40 80 16 COMPLETE 72 44 16 COMPLETE 52 64 44 COMPLETE 64 168 24 COMPLETE 24 32 28 COMPLETE 28 48 52 COMPLETE 32 80 4 COMPLETE 8 12 60 COMPLETE 16 56 84 COMPLETE 24 20 48 COMPLETE 36 36 36 COMPLETE 36 36 32 COMPLETE 44 36 36 COMPLETE 36 40 28 COMPLETE 36 40 28 COMPLETE 48 44 12 COMPLETE 40 32 20 COMPLETE 52 36 32 COMPLETE 32 24 40 COMPLETE 60 52 44 COMPLETE 32 16 20 COMPLETE 36 44 28 COMPLETE 44 28 36 COMPLETE 20 60 56 COMPLETE 56 48 48 COMPLETE 56 48 48 COMPLETE 52 40 40 COMPLETE 72 36 32 COMPLETE 48 52 44 COMPLETE 40 32 56 COMPLETE 24 44 36 COMPLETE 88 56 52 COMPLETE 64 60 24 COMPLETE 64 40 20 COMPLETE 32 48 24 COMPLETE 76 60 64 COMPLETE 56 60 12 COMPLETE 64 64 4 COMPLETE 60 24 12 COMPLETE 140 88 12 COMPLETE 44 32 28 COMPLETE 48 52 40 COMPLETE 36 68 0 COMPLETE 0 44 32 COMPLETE 68 72 24 COMPLETE 56 60 48 COMPLETE 56 56 68 COMPLETE 200 48 20 COMPLETE 40 52 196 COMPLETE 28 40 60 COMPLETE 28 36 56 COMPLETE 28 36 60 COMPLETE 28 28 56 COMPLETE 16 40 52 COMPLETE 20 36 64 COMPLETE 24 36 60 COMPLETE 36 28 44 COMPLETE 32 20 60 COMPLETE 24 32 52 COMPLETE 28 40 68 COMPLETE 28 40 64 COMPLETE 12 44 36 COMPLETE 40 48 48 COMPLETE 20 36 44 COMPLETE 36 
24 72 COMPLETE 32 40 56 COMPLETE 28 32 28 COMPLETE 44 36 68 COMPLETE 28 44 56 COMPLETE 28 44 60 COMPLETE 24 44 52 COMPLETE 20 28 60 COMPLETE 20 28 64 COMPLETE 12 40 56 COMPLETE 16 44 48 COMPLETE 28 32 60 COMPLETE 32 40 68 COMPLETE 36 12 68 COMPLETE 32 48 40 COMPLETE 32 52 40 COMPLETE 32 48 32 COMPLETE 32 44 36 COMPLETE 32 48 40 COMPLETE 28 36 36 COMPLETE 36 24 32 COMPLETE 24 52 44 COMPLETE 32 48 28 COMPLETE 40 40 24 COMPLETE 20 4 40 COMPLETE 36 32 56 COMPLETE 32 32 52 COMPLETE 28 44 60 COMPLETE 36 32 64 COMPLETE 40 20 52 COMPLETE

study best param: {'KingAttackWeights[2]': 96, 'KingAttackWeights[3]': 36, 'KingAttackWeights[4]': 40, 'KingAttackWeights[5]': 28}
study best value: 0.556
study best trial number: 119

Submitted a test on fishtest https://tests.stockfishchess.org/tests/view/5f69bae9938ba4977fe04f41

Edit: Mean of points with a score >54%: 62, 33, 45, 49

fsmosca commented 4 years ago

Thanks for the test. What time control did you use in this optimization?

I can implement the mean approach, but it looks like TPE already takes the history of trials into account. There are also some interesting parameters in the TPE sampler. I will look into it later.

BTW, this is the description of the optimization method used by Optuna that I found in the code. TPE is the default, so it would be interesting to try the other Optuna samplers.

        """Optimize an objective function.

        Optimization is done by choosing a suitable set of hyperparameter values from a given
        range. Uses a sampler which implements the task of value suggestion based on a specified
        distribution. The sampler is specified in :func:`~optuna.study.create_study` and the
        default choice for the sampler is TPE.
        See also :class:`~optuna.samplers.TPESampler` for more details on 'TPE'.

        Example:

            .. testcode::

                import optuna

                def objective(trial):
                    x = trial.suggest_uniform("x", -1, 1)
                    return x ** 2

                study = optuna.create_study()
                study.optimize(objective, n_trials=3)

        Args:
            func:
                A callable that implements objective function.
            n_trials:
                The number of trials. If this argument is set to :obj:`None`, there is no
                limitation on the number of trials. If :obj:`timeout` is also set to :obj:`None`,
                the study continues to create trials until it receives a termination signal such
                as Ctrl+C or SIGTERM.
            timeout:
                Stop study after the given number of second(s). If this argument is set to
                :obj:`None`, the study is executed without time limitation. If :obj:`n_trials` is
                also set to :obj:`None`, the study continues to create trials until it receives a
                termination signal such as Ctrl+C or SIGTERM.
            n_jobs:
                The number of parallel jobs. If this argument is set to :obj:`-1`, the number is
                set to CPU count.
            catch:
                A study continues to run even when a trial raises one of the exceptions specified
                in this argument. Default is an empty tuple, i.e. the study will stop for any
                exception except for :class:`~optuna.exceptions.TrialPruned`.
            callbacks:
                List of callback functions that are invoked at the end of each trial. Each function
                must accept two parameters with the following types in this order:
                :class:`~optuna.study.Study` and :class:`~optuna.FrozenTrial`.
            gc_after_trial:
                Flag to determine whether to automatically run garbage collection after each trial.
                Set to :obj:`True` to run the garbage collection, :obj:`False` otherwise.
                When it runs, it runs a full collection by internally calling :func:`gc.collect`.
                If you see an increase in memory consumption over several trials, try setting this
                flag to :obj:`True`.

                .. seealso::

                    :ref:`out-of-memory-gc-collect`

            show_progress_bar:
                Flag to show progress bars or not. To disable progress bar, set this ``False``.
                Currently, progress bar is experimental feature and disabled
                when ``n_jobs`` :math:`\\ne 1`.
        """

Some ref. on Optuna and TPE:

joergoster commented 4 years ago

Time control was the same as in the previous study, 2+0.02. Thank you for the links.

I still find the idea of re-evaluating the best point so far every n-th iteration very interesting.
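
One possible shape of that idea, as a self-contained toy. It is not an Optuna feature and not how the tuner works: `tune_with_reevaluation`, the random sampler, and the noisy `play_match` stand-in are all assumptions made up for this sketch.

```python
import random


def tune_with_reevaluation(sample_params, play_match, trials, reeval_every):
    """Track the best point and replay it every `reeval_every` trials.

    All results of the current best point are averaged, so a single lucky
    match result is gradually corrected.
    """
    best_params, best_results = None, []
    for i in range(1, trials + 1):
        if best_params is not None and i % reeval_every == 0:
            # Spend this trial on the current best point instead of a new one.
            best_results.append(play_match(best_params))
            continue
        params = sample_params()
        value = play_match(params)
        best_mean = (sum(best_results) / len(best_results)
                     if best_results else float('-inf'))
        if value > best_mean:
            best_params, best_results = params, [value]
    return best_params, sum(best_results) / len(best_results), len(best_results)


def sample_params():
    # Toy sampler: one weight on a 0..200 grid with step 4.
    return random.randrange(0, 201, 4)


def play_match(weight):
    # Toy noisy objective: the true score peaks at weight 60.
    true_score = 0.55 - abs(weight - 60) / 400.0
    return min(1.0, max(0.0, random.gauss(true_score, 0.04)))


random.seed(0)
print(tune_with_reevaluation(sample_params, play_match, trials=50, reeval_every=5))
```

A new candidate still replaces the incumbent on the strength of one match, so this only damps the noise on the incumbent's side. Comparing re-evaluated means on both sides would be a natural refinement.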

fsmosca / Optuna-Game-Parameter-Tuner