tobiasemrich / SchafkopfRL

AI agents for the Bavarian card game Schafkopf trained with reinforcement learning
GNU General Public License v3.0

Some ideas regarding the networks #5

Open phimuemue opened 3 years ago

phimuemue commented 3 years ago

Hi, neural network noob here, so all of this is to be taken with a grain of salt.

From https://github.com/tobiasemrich/SchafkopfRL#state-and-action-space-lstm-variant I read that this software passes a "game_type", which I guess determines whether we are dealing with e.g. "Eichel-Rufspiel" or "Gras-Wenz". That is, the software gives the network a single combined input describing the whole game, and the network has to cope with essentially all supported game types. (If this interpretation is wrong, then the following probably does not apply.)

But isn't "Eichel-Rufspiel" de facto a completely different game from "Gras-Wenz"? And wouldn't it therefore make sense to reflect this by having one network per game type?

Surely this would mean a bit more code to deal with the games on a case-by-case basis. But it would probably give the following advantages:

In addition, I wonder if it would then make sense to have one single Rufspiel-network and to "normalize" each Rufspiel to e.g. an Eichel-Rufspiel (and likewise normalize each Farbwenz to Herz-Wenz). Similarly to the above, this would need some code to normalize games, but it would yield the benefit that it requires fewer neural networks, and - possibly more important - the software behaves consistently for Rufspiel, regardless of the suit.
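The normalization idea can be sketched as a suit swap. This is only an illustration under assumptions: the `Card` type, the suit and rank labels, and the function name `normalize_rufspiel_card` are hypothetical and not taken from the SchafkopfRL code. It swaps the called suit with Eichel for plain suit cards and leaves Ober and Unter alone, because their suit determines their order within the trumps.

```python
from typing import NamedTuple


class Card(NamedTuple):
    """Hypothetical card representation (not the repository's own type)."""
    suit: str  # "eichel", "gras", "herz" or "schellen"
    rank: str  # "7", "8", "9", "U", "O", "K", "10" or "A"


def normalize_rufspiel_card(card: Card, called_suit: str,
                            target_suit: str = "eichel") -> Card:
    """Map a card from a Rufspiel on `called_suit` to one on `target_suit`."""
    # Ober and Unter are always trumps and are ranked by suit, so keep them.
    if card.rank in ("O", "U") or called_suit == target_suit:
        return card
    # Swap the two suits for all remaining cards; other suits stay as they are.
    if card.suit == called_suit:
        return card._replace(suit=target_suit)
    if card.suit == target_suit:
        return card._replace(suit=called_suit)
    return card


# In a Gras-Rufspiel the called ace becomes the Eichel ace.
normalized_ace = normalize_rufspiel_card(Card("gras", "A"), called_suit="gras")
```

Since the mapping is its own inverse, the same function could translate the network's chosen card back into the real game. A Farbwenz normalization would need a different exception list, since there only the Unter are fixed trumps.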

tobiasemrich commented 3 years ago

Hi, thank you for your suggestions. You are completely right and these are great ideas to try out.

So far I have only considered one network for all game types, out of pure laziness. I was hoping that some knowledge is common to different game types (e.g., point values of cards, trying to play trumps first as the game player, ...). But I agree, an action that is good in one game type might sometimes be disastrous in another.

One network per game type would mean one additional network for the bidding phase I guess, but that should not be a big problem.
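A rough sketch of how that split could be organized, assuming each network is wrapped as a callable from an encoded state to action scores. The `PolicyRegistry` class, the game-type labels, and the stand-in lambdas are all hypothetical; the real networks and state encoding would replace them.

```python
from typing import Callable, Dict, List

# Assumption: a policy maps an encoded state to one score per action.
Policy = Callable[[List[float]], List[float]]


class PolicyRegistry:
    """Holds one bidding policy plus one playing policy per game type."""

    def __init__(self, bidding_policy: Policy) -> None:
        self.bidding_policy = bidding_policy
        self._play_policies: Dict[str, Policy] = {}

    def register(self, game_type: str, policy: Policy) -> None:
        self._play_policies[game_type] = policy

    def play_policy(self, game_type: str) -> Policy:
        # Fail loudly if a game type has no network assigned yet.
        if game_type not in self._play_policies:
            raise KeyError(f"no policy registered for game type {game_type!r}")
        return self._play_policies[game_type]


# Stand-in policies with constant outputs, used only to show the dispatch.
registry = PolicyRegistry(bidding_policy=lambda state: [0.5, 0.5])
registry.register("rufspiel", lambda state: [1.0])
registry.register("wenz", lambda state: [2.0])

scores = registry.play_policy("wenz")([0.0, 0.0])
```

The bidding policy is used before the game type is known; once it is fixed, every later decision goes through `play_policy(game_type)`.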

I am currently using a real-world dataset to evaluate which network architecture is needed to predict the next action of players. So far this achieves ~84% accuracy (random is ~50%). When I have some time I will also try one network per game type, just to see if I can improve this number. I would hope that this gives at least an indication of whether it is worth trying for the RL approach.
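To compare the two setups, the accuracy could also be broken down by game type rather than reported as one number. The helper below is a minimal sketch; the record layout `(game_type, predicted_action, actual_action)` and the example labels are assumptions made for illustration and do not come from the dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_game_type(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[str, float]:
    """Return the share of correctly predicted actions per game type."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for game_type, predicted, actual in records:
        total[game_type] += 1
        correct[game_type] += int(predicted == actual)
    return {game_type: correct[game_type] / total[game_type] for game_type in total}


# Made-up records, only to show the call.
example_records = [
    ("rufspiel", "eichel-A", "eichel-A"),
    ("rufspiel", "eichel-7", "eichel-A"),
    ("wenz", "gras-U", "gras-U"),
]
per_type = accuracy_by_game_type(example_records)
```

Running the same breakdown for the shared network and for the per-game-type networks would show which game types, if any, benefit from the split.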

BTW, if you are interested in contributing, just let me know.