Open linux-leo opened 8 months ago
Hey @linux-leo,
Awesome to see your interest in the project! Just got back from a trip, so I'm catching up. I appreciate your suggestions on improving the training dataset; you've got some great points there.
To add a few possible improvements:
And your offer for compute power? Legendary! I managed to get access to an H100, so we should be golden for now. Still, thanks a bunch for having my back :)
Feel free to drop more thoughts whenever they pop into your head. 🚀
See: https://database.lichess.org/#standard_games
Maybe use every nth game from the year 2013, before Lichess grew in size, so the dataset covers a more or less equal number of games per month while still covering a large time span, and to reduce the number of games that need to be processed.
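A rough sketch of what the "every nth game" sampling could look like, in case it helps. It assumes the monthly dump has already been decompressed into a plain-text PGN stream and that every game begins with an `[Event "..."]` tag line. The names `every_nth_game` and `stride_for` are made up for illustration and are not part of the project.

```python
import io
from typing import Iterable, Iterator, List


def stride_for(games_in_month: int, target_per_month: int) -> int:
    """Pick n so that keeping every n-th game yields roughly the target count."""
    if target_per_month < 1:
        raise ValueError("target_per_month must be >= 1")
    # Months with fewer games than the target keep every game (n = 1).
    return max(1, games_in_month // target_per_month)


def every_nth_game(lines: Iterable[str], n: int) -> Iterator[str]:
    """Yield the PGN text of games 0, n, 2n, ... from a stream of PGN lines.

    Assumption: each game starts with a line beginning with '[Event '.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    index = -1                 # index of the game currently being read
    current: List[str] = []    # lines of the selected game being collected
    for line in lines:
        if line.startswith("[Event "):
            if current:        # a selected game just ended, so emit it
                yield "".join(current)
                current = []
            index += 1
        if index >= 0 and index % n == 0:
            current.append(line)
    if current:                # emit the last selected game, if any
        yield "".join(current)


if __name__ == "__main__":
    # Tiny made-up stream of four games, just to show the call shape.
    demo = "".join(
        f'[Event "Demo {i}"]\n\n1. e4 e5 1-0\n\n' for i in range(4)
    )
    n = stride_for(games_in_month=4, target_per_month=2)
    for game in every_nth_game(io.StringIO(demo), n):
        print(game.splitlines()[0])
```

Choosing a separate `n` per month, larger for the later and bigger months, is what would keep the per-month counts roughly equal. A fixed stride is deterministic, which makes the subset easy to reproduce; random sampling would be an alternative if ordering within a month turns out to matter.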
PS: I'm happy to provide some compute for this project with my Google Colab Pro+ subscription :)