Closed VasilGeorgiev39 closed 10 months ago
Thanks! That's a useful thing to add. In the process of testing this addition, I realized that my initial code writes moves as "1.e4 e5 2.Nf3 ...", but official PGN notation is "1. e4 e5 2. Nf3 ...". There's supposed to be a space between the "." and the move. For compatibility with the format that GPT-3.5 probably saw in its training dataset, this should get fixed.
It appears that making this update is causing some moves to fail to parse, so I'll have to dig into that issue before I merge this. Feel free to look into it as well. Not a problem with your code, just something that should get fixed before this is merged in.
This update may end up increasing GPT-3.5's playing ability. We'll see.
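For reference, the spaced format described above can be produced with a small helper like the one below. This is a minimal sketch, not the code in this repository: the function name `format_pgn_moves` is hypothetical, and it assumes the moves are already available as a list of SAN strings.

```python
def format_pgn_moves(san_moves):
    """Join SAN moves into PGN-style movetext such as '1. e4 e5 2. Nf3'.

    Hypothetical helper for illustration; assumes `san_moves` is a list of
    SAN strings in play order, starting with White's first move.
    """
    parts = []
    for ply, san in enumerate(san_moves):
        if ply % 2 == 0:
            # White's move: emit the move number followed by a period.
            parts.append(f"{ply // 2 + 1}.")
        parts.append(san)
    # A single space between tokens gives "1. e4", not "1.e4".
    return " ".join(parts)


print(format_pgn_moves(["e4", "e5", "Nf3"]))
```

Joining tokens with a single space also means the result never ends in whitespace.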
If you're up for it, I would be curious to see a comparison of win rate and legal move rate with and without randomly initialized games.
Nice catch.
I just tried it, and with the 'correct' notation I do see more illegal move attempts. In most cases the illegal move seems to be an attempt at an illegal long castle. It is interesting that changing the notation pushes it to castle so strongly.
I also see an increase in illegal moves with the 'wrong' notation, so I wonder if anything has changed since I last tried it. Again, it seems to have a favourite move that it tries in many situations (in this case Ke2). I don't have a good explanation for these and I am not sure where to look further, so I'll probably leave it at that.
I believe the issue was due to the prompts now including trailing whitespace, which is not kosher for GPT prompts. I modified it and GPT now plays much better.
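One way to keep the spaced notation without a trailing space is to end the prompt at the move number and let the model supply the leading space itself. The sketch below illustrates that idea only; `build_prompt` and `first_move_token` are hypothetical names, and the actual change made in the repository may differ.

```python
def build_prompt(san_moves):
    """Build PGN movetext for the model, ending without trailing whitespace.

    Hypothetical helper; assumes `san_moves` is a list of SAN strings.
    When White is to move the prompt ends in 'N.', otherwise it ends
    right after White's last move.
    """
    parts = []
    for ply, san in enumerate(san_moves):
        if ply % 2 == 0:
            parts.append(f"{ply // 2 + 1}.")
        parts.append(san)
    if len(san_moves) % 2 == 0:
        # White to move: append the next move number, but no space after it.
        parts.append(f"{len(san_moves) // 2 + 1}.")
    return " ".join(parts).rstrip()


def first_move_token(completion):
    """Return the first whitespace-separated token of a completion, or None."""
    tokens = completion.split()
    return tokens[0] if tokens else None


print(repr(build_prompt(["e4", "e5"])))
print(first_move_token(" Nf3 Nc6 3."))
```

Because the completion is expected to begin with a space, the parser splits on whitespace and keeps only the first token rather than assuming a fixed offset.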
Adds functionality to randomize the opening N moves.
Useful to prove that the model has not just memorized the openings.
I tried randomizing the first 20 moves and gpt-3.5-turbo-instruct did not make a single illegal move in 10/10 games.
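The randomization described here could look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: `random_opening` is a hypothetical name, "moves" is read as plies (half-moves), and the board argument is assumed to expose the python-chess `Board` interface (`legal_moves`, `san`, `push`, `is_game_over`).

```python
import random


def random_opening(board, n_plies, rng=None):
    """Play up to `n_plies` uniformly random legal moves on `board`.

    Hypothetical helper; `board` is assumed to behave like a python-chess
    `chess.Board`. Returns the SAN strings of the moves that were played.
    """
    rng = rng or random.Random()
    played = []
    for _ in range(n_plies):
        if board.is_game_over():
            # Stop early if the random moves happened to end the game.
            break
        move = rng.choice(list(board.legal_moves))
        # SAN must be computed before the move is pushed onto the board.
        played.append(board.san(move))
        board.push(move)
    return played
```

With python-chess installed, this would be called as `random_opening(chess.Board(), 20)`, after which the model takes over from the resulting position. Passing a seeded `random.Random` makes the randomized openings reproducible across runs.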