datamllab / rlcard

Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
http://www.rlcard.org
MIT License

Test your Poker Agents on pokerwars.io #135

Open systats opened 4 years ago

systats commented 4 years ago

Hi Guys,

just wanted to let you know that there is a free poker bot platform where you can test your agents in a more heterogeneous environment. I think the current bots are already pretty competitive, although it would be nice to compete against more ML experts. There are around 20-40 bots online almost 24/7.

Jump to the pokerwars leaderboard, or check out the API clients in several languages on the pokerwars GitHub.

Hope to see some of you there and exchange some insights.

Cheers, Simon

daochenzha commented 4 years ago

@systats Thanks for the information. The platform looks really cool.

systats commented 4 years ago

Would you mind promoting the platform in your README? As you are very popular for card agents, I imagine this could greatly improve the outreach and boost competition on the platform. Many could benefit from this info at the right time. I am not at all related to pokerwars other than using it heavily.

daochenzha commented 4 years ago

@systats I would very much like to promote the platform, since it looks really cool to me. I feel like RLCard can be an offline testing environment that helps people create a bot, while the platform can be a real-world testing environment. Thus, I believe it would be better to have instructions on how to wrap an RLCard agent and run it on the platform.

I am not familiar with the platform, so I am not sure whether an RLCard agent can be used on it. Do you think it can be done? It would be nice if you or their team could provide some running examples of how an RLCard agent can be connected. I believe an example is much better than simply mentioning it in the README.

As a side note, I find that the platform uses different rules, such as the number of rounds, the chips, and the way points are calculated. Do you think the differences between RLCard and the platform would be a problem?

Again, I like the platform very much. I believe it would be nice to make rlcard and the platform more easily connected so that users from both sides can better do offline and online testing.

systats commented 4 years ago

There is a Python bot skeleton on GitHub. It also has very nice docs with a lot of detail. We implemented everything in R, so our code is probably not as useful here. The only real requirement is to return the number of chips you intend to bet. This means you need your own validate-action function to translate your pot multiplier into stakes.
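To make the validate-action idea concrete, here is a minimal sketch in Python. It is not taken from the pokerwars skeleton: the function name `to_chip_amount`, its parameters, and the sizing rule (raise by a fraction of the pot after calling, clamped to a minimum raise and to the stack) are all assumptions for illustration. The real platform rules may differ.

```python
def to_chip_amount(pot_multiplier, pot, to_call, min_raise, stack):
    """Translate an abstract bet size into a chip count.

    Hypothetical helper: every name and rule here is an assumption,
    not part of the pokerwars or RLCard API.
    """
    if pot_multiplier <= 0:
        # No raise intended: just call (or go all-in if the stack is short).
        return min(to_call, stack)
    # Size the raise as a multiple of the pot as it would be after calling.
    raise_size = int(round(pot_multiplier * (pot + to_call)))
    # Respect the assumed minimum raise.
    raise_size = max(raise_size, min_raise)
    # Never bet more chips than we actually have.
    return min(to_call + raise_size, stack)


if __name__ == "__main__":
    # Half-pot raise facing a bet of 20 into a pot of 100.
    print(to_chip_amount(0.5, pot=100, to_call=20, min_raise=20, stack=1000))
```

The point of keeping this as a separate function is that the model can keep working with pot multipliers, while only this one place has to know the table's chip rules.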

Although the ultimate validation for a poker bot would be to grind cash tables, it might not be the most ethical thing to do. But if you evaluate your algo only against a simple always-fold or always-call bot, that will almost certainly bias your results. Thus, before you present your model to the real world, you might want to play against bots that intend to do the same.

I think there is no real limit on the number of rounds; rather, bots inflate the pot rapidly, so we see no more than 4 rounds pre-flop. But your concerns are justified. What if either RLCard or pokerwars returns rewards that deviate from the ideal world? Turning this thought around: if your bot performs excellently in RLCard, you can make sure it also performs out of environment, against much more competitive opponents than are usually set up for simulation.

I think there is a need for a platform like pokerwars, maybe for other games too, not only for checking reward misspecification or hacking but also for tuning a model in real time. So I am looking forward to seeing RLCard implementations on pokerwars, and maybe a link to their platform.

PS: If you know of similar projects to pokerwars or other approaches to test your bot let me know.

daochenzha commented 4 years ago

@systats I have glanced at the interfaces of pokerwars. I feel like some effort would be needed on state representation if we want to wrap RLCard using the bot skeleton. Specifically, the states in RLCard are obtained from the environment, while the states of pokerwars have to be processed in the bot. Also, the environment in RLCard has different fields in the state.

Thus, I would propose creating a separate environment specifically for pokerwars, with the same rules and the same states/actions (an offline environment of pokerwars). We also need a wrapper for the RLCard agent to process the raw state data. This requires some engineering effort. I will not have time to do this in the near future, so I may have to hold this issue for a while and ask for help from the community.
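A rough sketch of what such a wrapper might look like, in Python. Everything here is hypothetical: the raw payload fields (`pot`, `to_call`, `stack`), the three action ids, and the rule that betting 0 chips means fold are made up for illustration and are not the pokerwars format. The only assumption about the agent is that it exposes `eval_step(state)` returning an `(action, info)` pair and reads a dict with `obs` and `legal_actions`.

```python
CALL, RAISE, FOLD = 0, 1, 2  # hypothetical action ids for this sketch


class AlwaysCallAgent:
    """Stand-in for a trained agent; assumed interface: eval_step(state)."""

    def eval_step(self, state):
        return CALL, {}


def encode_state(raw):
    """Turn a hypothetical platform payload into an RLCard-style state dict."""
    obs = [raw["pot"], raw["to_call"], raw["stack"]]
    # Raising is only possible if chips remain after calling.
    can_raise = raw["stack"] > raw["to_call"]
    legal = [CALL, RAISE, FOLD] if can_raise else [CALL, FOLD]
    return {"obs": obs, "legal_actions": legal, "raw_obs": raw}


def decode_action(action, raw):
    """Map an abstract action id back to a chip count."""
    if action == CALL:
        return min(raw["to_call"], raw["stack"])
    if action == RAISE:
        # Pot-sized raise on top of the call, capped by the stack.
        return min(raw["to_call"] + raw["pot"], raw["stack"])
    return 0  # fold, assumed here to be signalled by betting 0 chips


class PlatformBot:
    """Adapter: raw platform state in, chip count out."""

    def __init__(self, agent):
        self.agent = agent

    def act(self, raw):
        state = encode_state(raw)
        action, _ = self.agent.eval_step(state)
        if action not in state["legal_actions"]:
            action = CALL  # fall back to a safe default
        return decode_action(action, raw)


if __name__ == "__main__":
    bot = PlatformBot(AlwaysCallAgent())
    print(bot.act({"pot": 100, "to_call": 20, "stack": 500}))
```

The same `encode_state` and `decode_action` pair could also back an offline environment, so the agent sees identical states in training and on the platform.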

I will put the link to pokerwars into the Evaluation section of the README (hopefully, it will help) and come back to the implementation later when I have time. Also, it would be appreciated if anyone could contribute some code or suggestions.

systats commented 4 years ago

@daochenzha thank you and sounds good. We will see whether we can contribute in that direction.