official-stockfish / fishtest

The Stockfish testing framework
https://tests.stockfishchess.org/tests

repeated games from books #478

Closed vondele closed 4 years ago

vondele commented 4 years ago

As discussed in https://github.com/official-stockfish/Stockfish/issues/2283, openings can be repeated in fishtest even with a large book. This happens because cutechess is instructed to pick random openings, and independent cutechess runs with different seeds can end up picking the same ones.
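To get a rough feel for how often this can happen, here is a minimal sketch of the birthday-paradox style estimate. It assumes every opening is drawn uniformly and independently from the book, which is a simplification of what cutechess actually does; the function name and the example numbers are made up for illustration.

```python
def expected_repeats(book_size: int, draws: int) -> float:
    """Expected number of draws that land on an opening already used.

    Assumes each draw is uniform and independent of the others
    (a simplification, not a model of cutechess internals).
    """
    if book_size <= 0:
        raise ValueError("book_size must be positive")
    # Expected number of distinct openings seen after `draws` draws.
    expected_distinct = book_size * (1.0 - (1.0 - 1.0 / book_size) ** draws)
    return draws - expected_distinct


if __name__ == "__main__":
    # Hypothetical numbers: a 10000-opening book, 20000 openings drawn in total.
    print(expected_repeats(10_000, 20_000))
```

Even when the number of draws is well below the book size, the estimate is not zero, which matches the observation that a large book alone does not prevent repeats.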

One way to solve this would be to invoke cutechess with the order=sequential option and give each batch a different starting point with the start=N option, e.g. N = batchId * 250 / 2:

-openings file=short.pgn format=pgn order=sequential plies=16 start=N

However, this doesn't work out of the box:

Since we pick a random opening from the set anyway, it occurs to me we could get the needed functionality in cutechess itself. If order=random allowed a start=N to specify the location in the random stream, each worker could simply use the right starting point in the stream (using the same seed for all workers).

ppigazzini commented 4 years ago

@vondele the new cutechess-cli is only 2 months old. Have you tested that version? ps: I'm away from my keyboard

noobpwnftw commented 4 years ago

Another workaround is to generate opening set per task by picking and shuffling the lines in the chosen opening book from the worker side, then launch cutechess with each of those intermediate books.

vondele commented 4 years ago

@ppigazzini compiling qt5 right now :-) let's see if I manage to follow your instructions and build cutechess-cli from sources.

@noobpwnftw yes that would work, and be rather easy for the epd format. For pgn we would need some kind of a basic parser, or convert those to epd as well.
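For the EPD case, the per-task book idea could look roughly like the sketch below. It assumes one opening per line, a seed shared by all workers of a test, and a known number of openings per task; the function names `pick_task_openings` and `write_task_book` are hypothetical and not part of fishtest.

```python
import random
from pathlib import Path


def pick_task_openings(lines, seed, task_index, openings_per_task):
    """Return this task's slice of the seeded shuffle of `lines`."""
    if not lines:
        raise ValueError("opening book is empty")
    shuffled = list(lines)
    # Every worker uses the same seed, so every worker sees the same order.
    random.Random(seed).shuffle(shuffled)
    start = task_index * openings_per_task
    # Wrap around if the slice runs past the end of the book.
    return [shuffled[(start + i) % len(shuffled)] for i in range(openings_per_task)]


def write_task_book(book_path, out_path, seed, task_index, openings_per_task):
    """Write an intermediate EPD book holding only this task's openings."""
    lines = [ln for ln in Path(book_path).read_text().splitlines() if ln.strip()]
    chosen = pick_task_openings(lines, seed, task_index, openings_per_task)
    Path(out_path).write_text("\n".join(chosen) + "\n")
    return chosen
```

Because the shuffle is seeded identically everywhere, consecutive task indices get disjoint slices until the book is exhausted. A PGN book would need real parsing instead of the line split used here.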

ppigazzini commented 4 years ago

@vondele expect to wait some time with a desktop CPU :) I uploaded to the "books" repo the cutechess-cli built 2 months ago for variant SF (I preferred to use a well-tested build for this first switch). I only tested that cutechess-cli starts on recent versions of several Linux distros; see the list on the wiki page. Some distros (Debian and Gentoo, if I recall correctly) required installing a missing package.

vondele commented 4 years ago

3h for qt5 on my (old) desktop :) but I do have a working version of cutechess-cli now.

Thanks a lot for your detailed instructions on the wiki...

vondele commented 4 years ago

I've made a PR to cutechess that should result in the functionality we need. Basically we can ask for

-openings file=book.pgn format=pgn order=random plies=16 start=BatchID * BatchSize / 2 

and this will make sure all workers get a unique, random batch of openings. If start is larger than the number of openings in the file, things will just wrap.

-srand random(taskId)

should be used so the same random seed is used by all workers, for a given task, but different seeds should be used for each task.
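As a sketch of how a worker might assemble these options, assuming the seed has already been agreed on for the test: the helper name `opening_args` and its parameters are hypothetical, and only the `-openings` keys and `-srand` shown above are used. The integer division by 2 mirrors the formula above; whether start counts from 0 or from 1 is not stated in this thread and should be checked against the cutechess-cli documentation.

```python
def opening_args(book, fmt, seed, batch_id, batch_size, plies=16):
    """Build the opening-related cutechess-cli arguments for one batch."""
    # Mirrors start = BatchID * BatchSize / 2 from the comment above.
    # Assumption: the result is used as-is, with no 0-based/1-based adjustment.
    start = batch_id * batch_size // 2
    return [
        "-openings",
        f"file={book}",
        f"format={fmt}",
        "order=random",
        f"plies={plies}",
        f"start={start}",
        "-srand",
        str(seed),
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(" ".join(opening_args("book.pgn", "pgn", seed=12345, batch_id=3, batch_size=250)))
```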

vdbergh commented 4 years ago

Thinking more about it, I feel vaguely uncomfortable about this. For this to work I think it is safer that the books are randomized. Otherwise one creates some type of correlation (imagine a tidy book author ordering the positions by bias).

Maybe we should try to understand what the statistical effect is of an occasional repeated game on our testing procedure. It might be negligible. We can do this by simulation.

vdbergh commented 4 years ago

Maybe it is better for cutechess-cli to apply internally a pseudo-random permutation (depending on a seed) to the book indices. Then one can just give consecutive segments to every task. This is similar to the suggestion by noob, except it would be done inside cutechess-cli.

vdbergh commented 4 years ago

Ah maybe this is already in your PR since I see it now specifies order=random.... In that case sorry for the noise :)

vondele commented 4 years ago

yes, that's in the PR.

I now actually also understand that the random permutation in cutechess is not really fully random... it should explain the 'strange' birthday-paradox numbers we observed in the other test. I'll see if I can fix that as well.

vondele commented 4 years ago

now that we have a new release of cutechess, we could tackle this one. I looked in the code, but the server part is too tricky for me. Basically, the worker needs to get two pieces of information from the server:

@tomtor is that something you can help with (not urgent)?

tomtor commented 4 years ago

* A random seed, all tasks of the same test need to receive the same random seed

This can be derived worker side from the run_id? See: https://github.com/glinscott/fishtest/blob/772699a1acce2213a8de3b9b44988e910400ede4/worker/worker.py#L172

* An integer that specifies the index in the sequence of games, i.e. batchID * games_per_batch + games_already_played_in_this_batch.

batchid is in task_id in the worker (same line)

and games_already_played is also available in the worker? @vondele so I'm not sure what is needed from the server.
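A minimal sketch of deriving both values on the worker side, assuming run_id is a string and task_id is the integer batch index. The hashing scheme and the function names are assumptions for illustration, not what worker.py does.

```python
import hashlib


def seed_from_run_id(run_id: str) -> int:
    """Map a run id to a seed that is identical for all tasks of that run."""
    digest = hashlib.sha1(run_id.encode("utf-8")).hexdigest()
    # Keep the value in a range that fits a 32-bit signed integer (an assumption
    # about what the consumer of the seed accepts).
    return int(digest, 16) % (2**31)


def game_index(task_id: int, games_per_batch: int, games_already_played: int) -> int:
    """Index in the sequence of games: batchID * games_per_batch + games played so far."""
    return task_id * games_per_batch + games_already_played
```

Hashing the run id rather than using Python's built-in `hash` keeps the seed stable across processes and machines, which matters because every worker has to arrive at the same value independently.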

vondele commented 4 years ago

so let me have a look again... maybe that knowledge is enough indeed.

vondele commented 4 years ago

thanks @tomtor for pointing me to the right variables!