Contact.csv File Contains Impossible Probabilities that are Misaligned with GNU Backgammon Evaluations

OfirMarom commented 4 months ago

Hello and thank you for this excellent project.

I have been working with the files you provided for training and have found an issue. Consider wildbg-training/0012/contact.csv position id mxmAQ4PbAQAAAA.

position_id win win_g win_bg lose_g lose_bg
mxmAQ4PbAQAAAA 0.9614198 0.8726852 0.36959878 0 0

implying a win_bg probability of 0.36959878. This is impossible: both players have already borne off a checker, and a side that has borne off a checker can no longer lose a gammon or a backgammon. This can be verified by pasting the position ID into GNU Backgammon. Furthermore, GNU Backgammon has a very different analysis of the position:

Win W(g) W(bg) L(g) L(bg)
static: 1.000 0.000 0.000 0.000 0.000

which appear to be correct.
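The same check can be run over a whole CSV row. Below is a minimal sketch, not part of wildbg: the function name and the `player_off` / `opponent_off` arguments are hypothetical, and it assumes that `win_g` includes backgammons and `win` includes gammons, as in the GNU Backgammon output above.

```python
def find_impossible_probabilities(win, win_g, win_bg, lose_g, lose_bg,
                                  player_off, opponent_off, eps=1e-6):
    """Return a list of human-readable problems for one evaluation row.

    player_off / opponent_off are the numbers of checkers each side has
    already borne off (hypothetical inputs, taken from the decoded position).
    """
    problems = []
    # Assumed cumulative convention: win >= win_g >= win_bg.
    if win + eps < win_g or win_g + eps < win_bg:
        problems.append("win >= win_g >= win_bg is violated")
    if (1.0 - win) + eps < lose_g or lose_g + eps < lose_bg:
        problems.append("lose >= lose_g >= lose_bg is violated")
    # A side that has borne off a checker cannot lose a gammon or backgammon.
    if opponent_off > 0 and win_g > eps:
        problems.append("win_g > 0 although the opponent has a checker off")
    if opponent_off > 0 and win_bg > eps:
        problems.append("win_bg > 0 although the opponent has a checker off")
    if player_off > 0 and lose_g > eps:
        problems.append("lose_g > 0 although the player has a checker off")
    if player_off > 0 and lose_bg > eps:
        problems.append("lose_bg > 0 although the player has a checker off")
    return problems


# The row from contact.csv, with both sides having at least one checker off.
for problem in find_impossible_probabilities(
        0.9614198, 0.8726852, 0.36959878, 0.0, 0.0,
        player_off=1, opponent_off=1):
    print(problem)
```

The off counts passed here are only "at least one"; the check depends on whether each count is zero, not on its exact value.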

Now, when I take the off checkers of player O and put them on the bar instead of off the board, I get probabilities that are closer to wildbg's:

Win W(g) W(bg) L(g) L(bg)
static: 0.951 0.876 0.196 0.000 0.000

So I suspect that in these positions the checkers that are supposed to be on the bar are not being accounted for and are instead being placed in the default off section of the board. A further piece of evidence supporting this is that I have evaluated every position in the contact.csv file and not a single one has a bar checker for player O. I haven't looked at the code yet, but based on these empirical findings this is what I suspect is currently happening.
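For anyone who wants to repeat the bar-checker count without the GUI, here is a minimal sketch of a position ID decoder. It follows the layout described in the GNU Backgammon manual (for each side, 25 entries of "one 1-bit per checker, then a 0", covering points 1 to 24 and then the bar; 80 bits packed least-significant-bit first into 10 bytes; base64 without padding). The function names are hypothetical, and which of the two halves corresponds to player X or player O is an assumption you should check against your own data.

```python
import base64


def decode_position_id(position_id):
    """Return (first, second): two lists of 25 counts (points 1-24, then bar)."""
    if len(position_id) != 14:
        raise ValueError("expected a 14-character position ID")
    # 14 base64 characters hold 10 bytes; restore the padding the ID omits.
    key = base64.b64decode(position_id + "==")
    # Bits are stored least-significant-bit first within each byte.
    stream = iter((byte >> i) & 1 for byte in key for i in range(8))
    sides = []
    for _ in range(2):
        counts = []
        for _ in range(25):
            n = 0
            # Count 1-bits up to the terminating 0 (or the end of the key).
            while next(stream, 0):
                n += 1
            counts.append(n)
        sides.append(counts)
    return sides[0], sides[1]


def bar_and_off(counts):
    """Bar checkers are the 25th entry; off checkers are whatever is missing from 15."""
    return counts[24], 15 - sum(counts)


first, second = decode_position_id("mxmAQ4PbAQAAAA")
print("first side  (bar, off):", bar_and_off(first))
print("second side (bar, off):", bar_and_off(second))
```

Because off checkers are not stored explicitly and are only inferred as `15 - sum(counts)`, a bar checker that the encoder drops is indistinguishable from a checker that was borne off.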

Other example positions in wildbg-training/0012/contact.csv that exhibit the same behavior are: x8wGALDbbSAAAA, NncAANi9cwAAAA and mzlgIGC3BwAAAA.

Attached is a file that demonstrates what I have found more clearly, with images from GNU Backgammon.

example.xlsx

carsten-wenderdel commented 4 months ago

Hi Ofir,

thanks for your extensive analysis and documentation of this bug.

I did some tests of the encoding method https://github.com/carsten-wenderdel/wildbg/blob/0491186b64a6c0614b3bb8b4960a12229515f939/crates/engine/src/position/conversion.rs#L45 and it seems that you are very right: bar checkers for player O are encoded as if they were off.

More debugging shows that when decoding that position ID, we don't get back the original position, but a position with zero bar checkers. I have to double check, but it seems that the training process will be badly influenced by this. Basically the neural nets should not be able to give good estimates for positions with bar checkers for player O or checkers off for player O.
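To illustrate why the bar entry matters, here is a minimal Python sketch of an encoder that follows the position ID layout from the GNU Backgammon manual. It is not the wildbg Rust implementation; the function name and the list-based board representation are assumptions made for this example.

```python
import base64


def encode_position_id(first, second):
    """Encode two lists of 25 counts (points 1-24, then bar) as a position ID."""
    bits = []
    for side in (first, second):
        if len(side) != 25 or sum(side) > 15:
            raise ValueError("each side needs 25 entries and at most 15 checkers")
        for count in side:
            bits.extend([1] * count)  # one 1-bit per checker
            bits.append(0)            # 0 terminates the point (or the bar)
    bits.extend([0] * (80 - len(bits)))  # checkers off leave unused, zeroed bits
    # Pack 8 bits per byte, least-significant bit first.
    key = bytes(
        sum(bit << i for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, 80, 8)
    )
    return base64.b64encode(key).decode("ascii").rstrip("=")


# Same checkers on the board; one extra checker is either on the bar or off.
on_board = [3, 2, 3] + [0] * 22
with_bar = on_board[:24] + [1]
other = [2] * 7 + [0] * 18

print(encode_position_id(other, with_bar))  # checker on the bar
print(encode_position_id(other, on_board))  # checker borne off
```

The two printed IDs differ only because the 25th entry is written out. If an encoder skips that entry for one side, both positions collapse into the same ID, and decoding can only ever return the "off" variant.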

It's probably also difficult or maybe even impossible to fix the training data. I think I will create completely new positions and roll them out. Not sure when this will happen, probably late 2024.

Thanks again for your analysis, this is super helpful! I don't think I would have found that bug soon, instead I would have wondered why the nets don't reach world class!

By the way, if you want to train your own nets, maybe the gnu rollout data ( https://alpha.gnu.org/gnu/gnubg/nn-training/1.00/training_data/ ) is worth checking out. I want to keep wildbg apart from gnubg, so I've started the training process from scratch and the results are still far away from gnubg.

OfirMarom commented 4 months ago

Hi Carsten,

Thank you for the quick feedback. I didn't know about the gnu rollouts; that is very helpful!

In case anyone wants to use these files in future, please note that the positions in the gnu rollout data files are encoded in base16, not in base64 like the gnu position IDs in the GUI (and like the encoding that wildbg uses). You need to convert the string to bytes using this transformation:

def gnubg_base16_to_bytes(encoded_str):
    """Decode a base16 position string (alphabet 'A'..'P') into 10 bytes."""
    # Each character encodes 4 bits: 'A' -> 0, 'B' -> 1, ..., 'P' -> 15.
    base16_chars_to_int = {
        char: index for index, char in enumerate('ABCDEFGHIJKLMNOP')
    }
    # Accumulate the characters into one big-endian integer.
    value = 0
    for char in encoded_str:
        value = value * 16 + base16_chars_to_int[char]
    # The result must fit into 10 bytes (80 bits).
    num_bytes = (value.bit_length() + 7) // 8
    if num_bytes > 10:
        raise ValueError("Decoded bytes exceed expected length of 10.")
    # to_bytes left-pads with zero bytes up to the requested length.
    return value.to_bytes(10, 'big')
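If you then want to paste such a position into the GUI, the 10 bytes can be re-encoded as base64. This is a minimal sketch, assuming the base16 string and the GUI position ID wrap the same 10-byte key; the function name is hypothetical, and the sample string is simply the standard starting position written in the A-P alphabet, not a line copied from the rollout files.

```python
import base64

BASE16_ALPHABET = "ABCDEFGHIJKLMNOP"
# Map 'A'..'P' onto the usual hex digits so bytes.fromhex can do the decoding.
_TO_HEX = str.maketrans(BASE16_ALPHABET, "0123456789abcdef")


def base16_to_position_id(encoded_str):
    """Convert a 20-character A-P string into a 14-character base64 position ID."""
    if len(encoded_str) != 20 or set(encoded_str) - set(BASE16_ALPHABET):
        raise ValueError("expected 20 characters from 'A'..'P'")
    key = bytes.fromhex(encoded_str.translate(_TO_HEX))
    # Position IDs drop the trailing '=' padding of standard base64.
    return base64.b64encode(key).decode("ascii").rstrip("=")


print(base16_to_position_id("OAHDPAABDAOAHDPAABDA"))  # 4HPwATDgc/ABMA
```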

I am doing some research on Backgammon and have found this project to be a valuable resource. So thank you for the effort in building and maintaining this :)

Regards, Ofir

carsten-wenderdel commented 1 month ago

I've just fixed the position ID bug in the code base. The rollout data is still affected; I plan to create new rollout data later this year. Until then I'm going to leave this GitHub issue open.

carsten-wenderdel / wildbg

Contact.csv File Contains Impossible Probabilities that are Misaligned with GNU Backgammon Evaluations #27