datamllab / rlcard

Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
http://www.rlcard.org
MIT License

How to access the DeepCFR agent? Not seeing it in the Agents anymore + Other questions about scoping new game addition #306

Open DeepTitan opened 9 months ago

DeepTitan commented 9 months ago

Hey guys! I am attempting to repurpose the library for my favorite TCG. I'd be happy to contribute the result once I'm done, since this game has two different decks for each player. I want to train using Deep CFR, since that's what I understand to be the state of the art as far as infrastructure goes. I have made a pretty good first draft of modeling the state, actions, and rewards of the game.

I have a few questions:

1) Why is DeepCFR not in the agents anymore? I noticed it in this commit from back in 2021: https://github.com/datamllab/rlcard/blob/34830b6b41fed0ab9119e67c02eab2898efe3c5d/tests/agents/test_deepcfr.py

After that commit, I see that it was removed. Is there a particular reason for that, or is it somewhere I am just not seeing?

2) Any advice for actually training this thing? I am proficient in AWS. Any tips or notes on training something that approaches superhuman ability? I read that the Bridges supercomputer, with 196 nodes at 128 GB of memory each, has a total of 25,088 GB (approximately 25.09 TB) of RAM, and that it was used to train a superhuman no-limit poker bot.

What instance types, instance count, length of training, third-party libraries, etc. would you recommend to make distributed training effective? What do you think the estimated training cost will be, and what should I expect when it comes to experimentation and failures?
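For reference, the memory arithmetic quoted above can be checked in a few lines. This is only a sketch that takes the node count and per-node memory from the post as given; it does not verify those figures against any published hardware description. It also shows the binary-unit reading, in case the per-node figure is really GiB.

```python
# Figures as quoted in the post above; not independently verified here.
NODES = 196
GB_PER_NODE = 128

total_gb = NODES * GB_PER_NODE
total_tb = total_gb / 1000    # decimal terabytes (1 TB = 1000 GB)
total_tib = total_gb / 1024   # binary reading (1 TiB = 1024 GiB)

print(f"{total_gb:,} GB = {total_tb:.2f} TB (decimal) or {total_tib:.2f} TiB (binary)")
# prints "25,088 GB = 25.09 TB (decimal) or 24.50 TiB (binary)"
```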

ZJsheep commented 4 months ago

Same question. BTW, thank you for pointing out where it used to exist.

ZJsheep commented 4 months ago

I think I found the reason. The authors say they could not get DeepCFR to converge in this old issue: https://github.com/datamllab/rlcard/issues/38
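To see which agents an installed version actually exposes, a quick check along these lines may help. It is a minimal sketch using only the standard library: `find_exported_name` is a hypothetical helper, and the candidate names `DeepCFR` and `DeepCFRAgent` are guesses at what the removed class might have been called, not confirmed identifiers. `CFRAgent` is the tabular CFR agent I would expect to find in current releases.

```python
import importlib


def find_exported_name(module_name, candidates):
    """Return the first candidate name the module exposes, or None.

    Also returns None when the module itself cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    for name in candidates:
        if hasattr(module, name):
            return name
    return None


# Candidate class names are assumptions; adjust them to whatever you are looking for.
found = find_exported_name("rlcard.agents", ["DeepCFR", "DeepCFRAgent", "CFRAgent"])
print("Available CFR-style agent:", found)
```

If this prints `None`, either `rlcard` is not installed in the active environment or none of the guessed names is exported by that version.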