Open Ndiedrick21 opened 2 years ago
If this is agreed upon, more planning can be done
Nathan, I think your concern is twofold:
Does anyone else have any other ideas or comments? Definitely looking for input on this; these are just my initial thoughts.
I think you could make the extensions on the game and policy input side now, and just do some feature engineering to derive the simple state values for the simple RL stuff to begin with. This would make the agents more backwards compatible when we do want to use more advanced RL methods; both will still work. Does that make sense? Overall, this may include another function that the game calls instead of policy, so that policy can be specific to the features you want to use.
Nathan, that is a good point. Here is my initial thought: an agent's policy (an agent is an RL player; a player is not RL but is also not the dealer) will take in the dealer's value and also args, which will represent the value of any other player in the game. This will allow for the backwards compatibility you suggested. As far as actually feeding that into RL, we can tackle that later as we progress on the RL side; we just have to make sure our RL algorithm is set up to be flexible enough to handle args. What do you think of this? Did you have something else or something more in mind?
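To make the args idea concrete, here is a minimal sketch of what such a policy signature could look like. The `Player` class shape, the `hand_value` attribute, the `"hit"`/`"stand"` return values, and the threshold of 17 are all assumptions for illustration, not the project's actual code.

```python
class Player:
    """Hypothetical player; only the policy signature matters here."""

    def __init__(self, hand_value=0):
        # Assumed attribute: the current total of this player's own hand.
        self.hand_value = hand_value

    def policy(self, dealer_value, *other_values):
        """Return "hit" or "stand".

        dealer_value: value of the dealer's visible card.
        other_values: values of any other players at the table (assumed
        meaning of the extra args). This simple policy accepts them but
        ignores them, so older call sites keep working.
        """
        return "hit" if self.hand_value < 17 else "stand"


# Both the old call (dealer value only) and the extended call work.
old_style = Player(12).policy(10)
new_style = Player(12).policy(10, 18, 20)
```

A more advanced agent could override `policy` and actually use `other_values`, while the game keeps calling it the same way.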
To address the other stuff directly: I do not think feature engineering is required (we can definitely do this, but I think this starts to get into deep RL). Again, though, if others are interested we can definitely start going in this direction. I am not following you on adding another method to the game class, though. What is the purpose of this exactly, what situation makes you think this will be necessary, and why should we do this?
I think the idea for another function would be something like play_turn, and the inputs would be all available visible information from the game, including all of the cards that each player and the dealer have. From there you could call policy within the class, using the dealer value or whatever you want your policy to be based on.
play_turn would be called by the game instead of policy. Since eventually all of this information may be used by the agent, I don't think it would hurt to model the game mechanics around that from the beginning, unless you don't think it would be worth the time, based on what you want to get out of this project.
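As a rough sketch of the play_turn idea: the game hands the agent everything visible, and the agent reduces that to whatever its policy needs. The `Observation` structure, the integer card encoding (ace as 1), and the `Agent` class are hypothetical and only meant to show the flow.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Everything visible at the table (hypothetical structure)."""
    own_cards: list                     # this agent's card values, e.g. [10, 6]
    dealer_cards: list                  # the dealer's visible card values
    other_players_cards: list = field(default_factory=list)  # one list per player


def hand_value(cards):
    """Blackjack total; one ace (encoded as 1) counts as 11 if that does not bust."""
    total = sum(cards)
    if 1 in cards and total + 10 <= 21:
        total += 10
    return total


class Agent:
    def play_turn(self, obs):
        """Entry point the game would call instead of policy."""
        # Feature engineering step: collapse the full observation into the
        # simple state (own total, dealer value) that a simple policy uses.
        return self.policy(hand_value(obs.own_cards), hand_value(obs.dealer_cards))

    def policy(self, own_value, dealer_value):
        # Placeholder rule; a learned policy would replace this.
        return "hit" if own_value < 17 else "stand"


obs = Observation(own_cards=[10, 6], dealer_cards=[9], other_players_cards=[[5, 5]])
action = Agent().play_turn(obs)
```

With this split, a later agent can read `other_players_cards` inside `play_turn` without any change to how the game calls it.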
I am not sure exactly why the game would need to be structured as you are suggesting ... let's get on a call to discuss this. Do you have any preference on time?
@Ndiedrick21 I added functionality to take in the other players' hands in play_round, single_player_hand (I spelled this function wrong), as well as in the policy of a Player. As we discussed, the implementation is not the hard part.
@hall4jm @ruetten @Ndiedrick21 guys the difficult parts of this adjustment are:
Let's take some time to think about this, since I think this is an important part of the game mechanics, but if you guys have any ideas, definitely let the group know.
Hey I'll take a look at this sometime this weekend, sorry I'm a little late to the party hahaha
Currently, when an action is requested of a player, the only information they have is their own hand and the value of the dealer's visible card. In a full game of blackjack, more information can be known with more players. I understand that this was done initially to start with finding a simple policy, but if this is to be extended, it would be nice for the player to have as much information as possible to make a decision. If this were to be expanded, a simple policy could still be developed, but it would allow for more complex RL methods to be done in the future.
These changes would mainly include giving the players all of the visible cards in play, as well as the specific card that the dealer has. I think a mechanic could be added either to the card/hand class or to the game class to control which cards are visible to which players.
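One possible shape for that visibility mechanic, sketched at the game level. The `Card` and `Table` classes, the `face_up` flag, and the `visible_to` method are hypothetical names for illustration, and the rule assumed here is that an owner always sees their own cards while everyone else sees only face-up cards.

```python
from dataclasses import dataclass


@dataclass
class Card:
    rank: str              # specific card, e.g. "K" or "7", not just its value
    face_up: bool = True   # assumed flag controlling visibility to others


class Table:
    """Hypothetical game-level container for everyone's cards."""

    def __init__(self):
        self.hands = {}  # owner name -> list of Card

    def deal(self, owner, rank, face_up=True):
        self.hands.setdefault(owner, []).append(Card(rank, face_up))

    def visible_to(self, viewer):
        """Return {owner: [ranks]} for the cards this viewer is allowed to see."""
        return {
            owner: [c.rank for c in cards if c.face_up or owner == viewer]
            for owner, cards in self.hands.items()
        }


table = Table()
table.deal("dealer", "K")
table.deal("dealer", "7", face_up=False)  # dealer's hole card
table.deal("alice", "9")
table.deal("bob", "5")
alice_view = table.visible_to("alice")
```

Putting the rule on the table rather than on each hand keeps a single place that decides who sees what, but the same check could live on a hand class instead.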
I think it would be worth putting time into developing this out further before even looking at simple RL methods. This would help create a more complete blackjack game before exploring RL, rather than having to make these changes later.
Share your thoughts in the comments.