henrycharlesworth / big2_PPOalgorithm

Application of proximal policy optimization algorithm to the card game Big 2 using Tensorflow

How to build actionIndices.pkl? #10

Open t10250119 opened 1 year ago

t10250119 commented 1 year ago

Excuse me, I would like to ask how you built the actionIndices.pkl file. I'm trying to develop a variation of Big Two with different regional rules, but I'm stuck: under my rules, four-card combinations cannot be played, and four cards of the same rank may be played together with any one other card. I'm unsure whether I need to modify the contents of actionIndices.pkl. Can you provide any guidance on this?

henrycharlesworth91 commented 1 year ago

Hi, I'll preface this by saying that I don't remember many of the details because I did this a long time ago, but basically actionIndices.pkl is just a mapping of indices to a fixed-size action space (one that we can define a neural network policy over).

There's more information about the details in Appendix B here: https://arxiv.org/pdf/1808.10442.pdf

I have to admit that, looking back on it, the code is not written very well, so making modifications is likely to be a bit tricky. If you want to modify the available four-card combinations, you'd have to regenerate actionIndices.pkl with different four-card action lookup tables, and then make sure these changes are picked up in enumerateOptions.py and in the actual game simulator.
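
As a rough illustration of what "regenerating the lookup tables" could look like, here is a minimal Python sketch. It is not the repository's actual generation code, and the real layout of actionIndices.pkl may well differ. It assumes a sorted 13-card hand, that an action is identified by the positions of the played cards within that hand, and that every combination of positions gets its own index (with no filtering for legal hands). The function names and the output filename are hypothetical.

```python
import pickle
from itertools import combinations

HAND_SIZE = 13  # assumption: each player holds at most 13 cards, kept sorted


def build_lookup(n_cards, start_index=0, hand_size=HAND_SIZE):
    """Assign a consecutive action index to every n_cards-subset of hand positions.

    Returns the lookup dict and the next unused index.
    """
    lookup = {}
    index = start_index
    for positions in combinations(range(hand_size), n_cards):
        lookup[positions] = index
        index += 1
    return lookup, index


def build_action_tables(sizes=(1, 2, 3, 4, 5)):
    """Build one lookup table per play size, sharing a single index range."""
    tables = {}
    next_index = 0
    for n in sizes:
        tables[n], next_index = build_lookup(n, next_index)
    return tables, next_index  # next_index is the total number of actions


def save_tables(tables, path="actionIndices_custom.pkl"):
    """Pickle the tables (hypothetical filename, not the repo's file)."""
    with open(path, "wb") as f:
        pickle.dump(tables, f)


if __name__ == "__main__":
    # Variant without four-card plays: skip size 4 entirely.
    tables, n_actions = build_action_tables(sizes=(1, 2, 3, 5))
    print(n_actions)
    save_tables(tables)
```

In this sketch, dropping four-card plays just means leaving size 4 out of `sizes`; "four of a kind plus one other card" is a five-card play, so it would already be covered by the size-5 table. Whatever shape the tables end up taking, the size of the policy's output layer and the code that enumerates legal moves would have to agree with the new total number of actions.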