jacksonthall22 / SAN-strings

Generate all possible Standard Algebraic Notation (SAN) strings for chess moves
6 stars 0 forks source link

SAN Strings

This simple script generates all 29,274 possible Standard Algebraic Notation (SAN) strings for chess moves, with logic to avoid listing SAN strings that can never actually occur for geometric reasons.

If someone notices missing strings, or strings which are generated but can never occur, please open an issue!*

* But after the last update, I truly think I got them all!

Note

Check (+) and checkmate (#) symbols are omitted in san_strings.txt but included in san_strings_with_symbols.txt. It is fairly easy to convince yourself that no special logic is required to determine which subset of all SAN moves could deliver check/mate: all moves can deliver either check or checkmate at least via a discovery. Therefore, san_strings_with_symbols.txt (29,274 lines) is exactly three times the length of san_strings.txt (9,758 lines) as it simply makes two additional copies of each SAN move, one appending + and one appending #. SANs in both files are sorted by the key (len(san), san).

Run it yourself

git clone https://github.com/jacksonthall22/SAN-strings.git && cd SAN-strings
pip install -r requirements.txt
python gen_san_strings.py

How it works

What are discriminators?

Generating every possible SAN move in chess would be simple if not for discriminators, which occasionally must be inserted into the "standard" SAN move to disambiguate between multiple legal options.

If we have a piece, a from_square, and a to_square, how do we decide which (if any) discriminators the SAN move might require? I'm glad you asked—this repo has the answer.

SAN generation algorithms

In a previous version of this code, special logic was used to manually handle all the edge cases for the different piece types. Unfortunately, this failed miserably, generating some syntactically-legal SAN moves that could never actually occur and failing to generate some that could. When I dug into the issues more deeply, I realized this problem is quite difficult to reason through.

Now, the code uses three much more logical and generalized algorithms that were designed to resemble how a human might decide whether a particular from_square -> to_square move by a certain piece type could require a rank, file, and/or full-square discriminator respectively, based on which other squares a piece of the same type could come from when moving to to_square. Read on for the pseudocode, as well as some more intuitive explanations.

Note that only knight, bishop, rook, and queen moves ever use discriminators. Pawns have their own special move syntax and there is never more than one king per color. The details about how SAN strings are generated for these piece types is omitted below because it is straightforward and comparatively uninteresting. However, you can find the implementations in gen_san_strings.py at get_pawn_sans() and get_king_sans().

Pseudocode

File discriminators

Rank discriminators

Full-square discriminators

Intuitive explanations

To really understand the code in this repo, we need to understand the algorithm a human uses to determine whether a move from a from_square to a to_square might require a file, rank, and/or full-square discriminator.

First we should consider that if moving from from_square to to_square is a legal move, then even if there is another piece of the same type and color on the ray which extends from to_square to from_square and continues on to an edge of the board, moving that piece to to_square would be illegal. If this piece falls between from_square and to_square, then the original move would not be legal, so we have a contradiction. If it is past from_square (on the extension of the ray between the squares that continues to the edge of the board), then it is not legal because it cannot jump over the piece at from_square to reach to_square.

This is important when considering disriminators because we are only interested in squares from which another piece can legally move to to_square, and those which might create a situation where a rank, file, or full-square discriminator is necessary.

File discriminators

With this in mind, the algorithm for determining whether we need a file discriminator is as follows:

Rank discriminators

The algorithm for determining whether we need a rank discriminator is similar, but has some differences which account for the fact that a file discriminator is preferred over a rank discriminator when both can disambiguate the move:

Full-square discriminator

Determining whether we need a full-square discriminator is actually the simplest: