barakugav opened 2 years ago
What do you think about the noisy-beginning idea? I prefer not to use external knowledge about the game (e.g. opening book)
The noisy beginning will not necessarily result in an equalized position for the rest of the game, I'm not sure. Why not an opening book? Only to choose the initial position
I think it could be okay that the position is not equalized, because (a) the moves are still chosen by the players (just with some added 'luck'), and (b) that's why we do many comparisons - there is some 'luck' involved. I think it's much more elegant if the whole training flow has no human knowledge, or at least no more human knowledge than in AlphaZero. Or at least, we should have such a workflow as the default (and possibly other ones too)
BTW you probably saw it, but I like this formulation in their paper
Alright, I agree it's more elegant without an opening book, but I still think we should look for a better solution. We will not run hundreds of comparison games... First of all, we can ensure both players get the same 'luck' by running twice from the same noisy position with the players switched. But I still think it will cause us to misevaluate the models
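The paired-game idea could look something like this (a minimal sketch; all function names and signatures here are made up for illustration, not from our codebase):

```python
import random

def noisy_opening(start, legal_moves, apply_move, plies, rng):
    """Reach a 'noisy' start position by playing a few uniformly random plies."""
    state = start
    for _ in range(plies):
        state = apply_move(state, rng.choice(legal_moves(state)))
    return state

def paired_match(play_game, opening, n_pairs, rng_seed=0):
    """Evaluate player A vs player B on the same noisy openings with colors
    swapped, so both sides get the same opening 'luck'.
    `play_game(state, a_is_white)` returns +1 if A wins, -1 if B wins, 0 on draw."""
    rng = random.Random(rng_seed)
    score_a = 0
    for _ in range(n_pairs):
        state = opening(rng)            # same position for both games of the pair
        score_a += play_game(state, True)   # A plays white
        score_a += play_game(state, False)  # A plays black
    return score_a
```

With this pairing, any advantage baked into a particular noisy opening cancels out within the pair, e.g. in a toy game where white always wins, `paired_match` scores the match as exactly 0.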
Do you think we can rely here on the noise from floating-point errors in the network activations? And multithreading
FP no, multithreading yes, but we don't have multithreading in a single search, we have multithreading of multiple searches, so currently it doesn't have any effect
In the paper it says "t -> 0", so maybe they just use a very small temperature, I think that's reasonable
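A small sketch of what "t -> 0" means in practice: the paper samples moves in proportion to N(a)^(1/t) over the MCTS visit counts, so a tiny temperature is effectively argmax (the helper name below is made up):

```python
import random

def sample_move(visit_counts, temperature, rng):
    """Sample a move index with probability proportional to N(a)^(1/t).
    As t -> 0 this degenerates to argmax (deterministic best move);
    t = 1 samples proportionally to the raw visit counts."""
    if temperature <= 1e-6:
        # limit case: pick the most-visited move deterministically
        return max(range(len(visit_counts)), key=lambda i: visit_counts[i])
    weights = [n ** (1.0 / temperature) for n in visit_counts]
    r = rng.random() * sum(weights)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1  # guard against floating-point round-off
```

So a very small but nonzero t keeps a sliver of randomness between games without needing any external opening data.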
Maybe sample random openings from an opening book, which are still considered equal