A game search and evaluation parameter tuner using the Optuna framework. The game can be chess or another game variant. Engine evaluation parameters that can be optimized include piece values such as the pawn or knight value; search parameters that can be optimized include the futility pruning margin, null move reduction factors, and others.
From the command line:
PS F:\Tmp> git clone https://github.com/fsmosca/Optuna-Game-Parameter-Tuner.git
See the relevant page in the wiki.
Instead of installing each module (optuna, plotly and others) individually, just install everything from requirements.txt:
pip install -r requirements.txt
See the help page in the wiki, or type:
python tuner.py -h
python tuner.py --sampler name=tpe --engine ./engines/deuterium/deuterium --concurrency 6 --opening-file ./start_opening/ogpt_chess_startpos.epd --opening-format epd --input-param "{'PawnValueEn': {'default':92, 'min':90, 'max':120, 'step':2}, 'BishopValueOp': {'default':350, 'min':290, 'max':350, 'step':3}}" --games-per-trial 24 --plot --base-time-sec 15 --inc-time-sec 0.1 --study-name study1 --pgn-output study1.pgn --trials 100 --common-param "{'Hash': 128}"
Use the --elo-objective flag to report the objective value in Elo.
Our objective function result is the result of an engine vs engine match. Engines at a fixed-depth move control are deterministic: if you play the same opening at a fixed depth of 2 for 100 games and then repeat the run, the result of the match is the same. Samplers such as TPE, CMA-ES, and SKOPT may suggest parameter values that were already suggested before. By default the tuner will not replay the match; it will just return the previous result.
There is a flag, --noisy-result, that plays a match even for repeated parameter suggestions. This is mainly applied when matches with the same parameters produce different results; such results are called non-deterministic or stochastic. An example situation is when you play a match with a time control instead of a fixed depth: conduct match #1 at a time control of 5s+100ms for 100 games with opening set #1, then do match #2 with opening set #2; most likely the results will not be the same. Note that during matches each opening is played twice. In this case it is better to add the --noisy-result flag to the command line.
An example log when the --noisy-result flag is enabled and the sampler repeats suggesting param values. The objective value type is Elo because of the --elo-objective flag.
starting trial: 149 ...
deterministic function: False
Duplicate suggestion from sampler, {'Pp2': 10, 'Pp6': 3}
Execute engine match as --noisy-result flag is enabled.
suggested param for test engine: {'Pp2': 10, 'Pp6': 3}
param for base engine : {'Pp2': 7, 'Pp6': 2}
common param: {'Hash': 128, 'EvalHash': 4}
init param: {'Pp2': 7, 'Pp6': 2}
init objective value: 0.0
study best param: {'Pp2': 10, 'Pp6': 1}
study best objective value: Elo 124.0
study best trial number: 1
Actual match result: Elo 22.0, CI: [-75.9, +119.4], CL: 95%, G/W/D/L: 32/11/12/9, POV: optimizer
Elo Diff: +21.7, ErrMargin: +/- 97.6, CI: [-75.9, +119.4], LOS: 67.3%, DrawRatio: 37.50%
test param format for match manager: option.Pp2=10 option.Pp6=3
result sent to optimizer: 22.0
elapse: 0h:0m:19s
Trial 149 finished with value: 22.0 and parameters: {'Pp2': 10, 'Pp6': 3}. Best is trial 1 with value: 124.0.
To tune float parameters, add a key-value pair of 'type': 'float' to each parameter spec, for example:
--input-param "{'CPuct': {'default':2.147, 'min':1.0, 'max':3.0, 'step':0.05, 'type': 'float'}, 'CPuctBase': {'default':18368.0, 'min':15000.0, 'max':20000.0, 'step':2.0, 'type': 'float'}, 'CPuctFactor': {'default':2.82, 'min':0.5, 'max':3.5, 'step':0.05, 'type': 'float'}, 'FpuValue': {'default':0.443, 'min':-0.1, 'max':1.2, 'step':0.05, 'type': 'float'}, 'PolicyTemperature': {'default':1.61, 'min':0.5, 'max':3.0, 'step':0.05, 'type': 'float'}}"
Trials can be viewed from the dashboard.
pip install optuna-dashboard
Sample command line. Take note of the study name; we will use it in the dashboard.
python tuner.py --study-name cdrill2000_razor_testpos --sampler name=skopt acquisition_function=LCB --engine "F:/Project/my_cdrill/cdrill2000.exe" ^
--concurrency 4 ^
--opening-file ./start_opening/ogpt_chess_startpos.epd ^
--opening-format epd ^
--input-param "{'RazorMargin': {'default':200, 'min':30, 'max':300, 'step':5}, 'PassedPawnWeight': {'default':100, 'min':30, 'max':500, 'step':5}}" ^
--base-time-sec 10 ^
--inc-time-sec 0.1 ^
--draw-movenumber 30 --draw-movecount 6 --draw-score 0 ^
--resign-movecount 3 --resign-score 500 ^
--games-per-trial 100 --trials 100 --plot ^
--pgn-output cdrill200_razor_test_games.pgn ^
--elo-objective ^
--noisy-result
In the command line above the study name is cdrill2000_razor_testpos. A trial database will be created with the name cdrill2000_razor_testpos.db.
The command line to run dashboard is:
optuna-dashboard sqlite:///cdrill2000_razor_testpos.db
(venv) PS F:\Github\Optuna-Game-Parameter-Tuner> optuna-dashboard sqlite:///cdrill2000_razor_testpos.db
Listening on http://127.0.0.1:8080/
Hit Ctrl-C to quit.
Visit http://127.0.0.1:8080/ in your browser.