Getting started -- .binpack files missing?

TonyGuil commented 3 months ago

I am trying to run easy_train.py using easy_train_example.bat, but it expects three .binpack files to be available somewhere:

    --training-dataset=c:/dev/nnue-pytorch/noob_master_leaf_static_d12_85M_0.binpack ^
    --training-dataset=c:/dev/nnue-pytorch/d8_100000.binpack ^
    --training-dataset=c:/dev/nnue-pytorch/10m_d3_2.binpack ^

Where can I get these files? And how were they generated?

For background: My ultimate aim is to implement NNUE in my Onitama program. This is perhaps overkill -- as far as I know, my program might already be the best Onitama player in the world -- but I want to find out how NNUE works.

dubslow commented 3 months ago

I dunno much about NNUE, but I know that linrock describes his training process, data collection and filtering in his Stockfish commits, along with a link to his data, e.g. the most recent one: https://github.com/official-stockfish/Stockfish/commit/1b7dea3f851cd5c5411ba6f07a2f935bfb7da8a9

And for older such commits: https://github.com/official-stockfish/Stockfish/commits/master/?author=linrock (search for "update default" nets, main nets/small nets etc)

Sopel97 commented 3 months ago

The datasets from the example are not normally used; they are just some old, small ones that used to be common. Some datasets are linked from the wiki: https://github.com/official-stockfish/nnue-pytorch/wiki/Training-datasets#good-datasets. For others, you need to look at linrock's commits that introduce new networks, as the wiki is incomplete at this point.

Edit: this .sh script is, I think, very close to the first training stage for master. I used it some time ago when trying to replicate it.

Note that these will not run out of the box: you need to understand every setting here and whether it needs to be modified for your local environment.

    python3.11 easy_train.py \
    --training-dataset=/data/sopel/nnue/nnue-pytorch-training/data/nodes5000pv2_UHO.binpack \
    --training-dataset=/data/sopel/nnue/nnue-pytorch-training/data/dfrc_n5000.binpack \
    --num-workers=8 \
    --threads=2 \
    --gpus="0,1" \
    --runs-per-gpu=1 \
    --batch-size=16384 \
    --max_epoch=600 \
    --do-network-training=True \
    --do-network-testing=True \
    --tui=True \
    --network-save-period=20 \
    --random-fen-skipping=3 \
    --start-lambda=1.0 \
    --end-lambda=1.0 \
    --fail-on-experiment-exists=True \
    --build-engine-arch=x86-64-bmi2 \
    --build-threads=32 \
    --epoch-size=100000000 \
    --validation-size=1000000 \
    --network-testing-threads=24 \
    --network-testing-explore-factor=1.5 \
    --network-testing-book="https://github.com/official-stockfish/books/blob/master/UHO_XXL_%2B0.90_%2B1.19.epd.zip" \
    --network-testing-nodes-per-move=20000 \
    --network-testing-hash-mb=8 \
    --network-testing-games-per-round=200 \
    --engine-base-branch=Sopel97/Stockfish/experiment_502 \
    --engine-test-branch=Sopel97/Stockfish/experiment_502 \
    --nnue-pytorch-branch=Sopel97/nnue-pytorch/experiment_502 \
    --workspace-path=./easy_train_data \
    --experiment-name=502_s1 \
    --features="HalfKAv2_hm^"
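Since the original question was about datasets that the script expects but that are not on disk, here is a minimal sketch of a pre-flight check you could run before launching. It is not part of nnue-pytorch: the helper names `dataset_paths` and `missing_datasets` are made up for illustration, and it assumes every dataset is passed in the `--training-dataset=<path>` form used above.

```python
from pathlib import Path

PREFIX = "--training-dataset="

def dataset_paths(argv):
    """Collect the path given by every --training-dataset=... argument."""
    return [Path(arg[len(PREFIX):]) for arg in argv if arg.startswith(PREFIX)]

def missing_datasets(argv):
    """Return the dataset paths that are not regular files on this machine."""
    return [path for path in dataset_paths(argv) if not path.is_file()]

# Arguments copied from the first-stage command above (hypothetical usage).
example = [
    "--training-dataset=/data/sopel/nnue/nnue-pytorch-training/data/nodes5000pv2_UHO.binpack",
    "--training-dataset=/data/sopel/nnue/nnue-pytorch-training/data/dfrc_n5000.binpack",
    "--num-workers=8",
]

for path in missing_datasets(example):
    print(f"missing dataset: {path}")
```

On any machine other than the one these paths were written for, this should list both files as missing, which is the cue to point the arguments at your own copies.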

And this is how you'd run a retraining session:

    python3.11 easy_train.py \
    --training-dataset=/data/sopel/nnue/nnue-pytorch-training/data/T60T70wIsRightFarseerT60T74T75T76.binpack \
    --num-workers=16 \
    --threads=2 \
    --gpus="0,1" \
    --runs-per-gpu=1 \
    --start-from-experiment=502_s1 \
    --batch-size=16384 \
    --max_epoch=600 \
    --do-network-training=True \
    --do-network-testing=True \
    --tui=True \
    --network-save-period=20 \
    --random-fen-skipping=10 \
    --start-lambda=1.0 \
    --end-lambda=0.75 \
    --fail-on-experiment-exists=True \
    --build-engine-arch=x86-64-bmi2 \
    --build-threads=32 \
    --epoch-size=100000000 \
    --validation-size=1000000 \
    --network-testing-threads=24 \
    --network-testing-explore-factor=1.5 \
    --network-testing-book="https://github.com/official-stockfish/books/blob/master/UHO_XXL_%2B0.90_%2B1.19.epd.zip" \
    --network-testing-nodes-per-move=20000 \
    --network-testing-hash-mb=8 \
    --network-testing-games-per-round=200 \
    --engine-base-branch=Sopel97/Stockfish/experiment_502 \
    --engine-test-branch=Sopel97/Stockfish/experiment_502 \
    --nnue-pytorch-branch=Sopel97/nnue-pytorch/experiment_502 \
    --workspace-path=./easy_train_data \
    --experiment-name=502_s2 \
    --features="HalfKAv2_hm^"
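To make the numbers in these two commands a bit more concrete, here is a rough sketch of what they imply. The arithmetic follows directly from the flags. The lambda part is an assumption: it supposes that lambda is interpolated linearly from `--start-lambda` to `--end-lambda` over the run and that it weights the evaluation-based target against the game-result target. The function names are hypothetical, so check the trainer source for the exact formula.

```python
def lambda_at(epoch, max_epoch, start_lambda, end_lambda):
    """Assumed schedule: linear interpolation from start_lambda to end_lambda."""
    return start_lambda + (end_lambda - start_lambda) * (epoch / max_epoch)

def blended_target(eval_target, result_target, lam):
    """Assumed mix: lam = 1.0 uses only the evaluation, lam = 0.0 only the game result."""
    return lam * eval_target + (1.0 - lam) * result_target

# Values taken from the commands above.
batch_size = 16384
epoch_size = 100_000_000
max_epoch = 600
save_period = 20

steps_per_epoch = epoch_size // batch_size  # full batches per epoch
checkpoints = max_epoch // save_period      # networks saved over the whole run
total_positions = epoch_size * max_epoch    # positions sampled over the whole run

print(f"about {steps_per_epoch} batches per epoch")
print(f"{checkpoints} saved networks, {total_positions:,} positions in total")

# First stage keeps lambda at 1.0; the retraining run moves from 1.0 to 0.75.
for epoch in (0, 300, 600):
    print(epoch, lambda_at(epoch, max_epoch, 1.0, 0.75))
```

Under these assumptions the first stage trains purely on evaluations throughout, while the retraining run gradually gives the game result up to a quarter of the weight by the final epoch.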

official-stockfish/nnue-pytorch, issue #281