Closed by big-c-note 4 years ago
(Citations on Algos in the Original Paper and Supplementary Material)
Let me know if and how you'd want this in the code base, and I can open a PR. I've also included a bit of backdrop for each paper listed.
On Abstraction
uses abstraction to reduce possibilities
eliminates some decision points
for example, bet sizes: at most 14 are considered at any decision point
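A rough sketch of what limiting bet sizes could look like. The pot fractions and the nearest-size rounding below are assumptions for illustration only; they are not the sizes or the mapping Pluribus uses.

```python
# Hypothetical pot-fraction bet sizes; illustrative only,
# not the abstraction from the paper.
ALLOWED_POT_FRACTIONS = [0.5, 1.0, 2.0]


def abstract_bet(bet, pot, fractions=ALLOWED_POT_FRACTIONS):
    """Map an arbitrary bet to the closest allowed bet size for this pot."""
    if pot <= 0:
        raise ValueError("pot must be positive")
    # Candidate bet sizes in chips for the current pot.
    candidates = [fraction * pot for fraction in fractions]
    # Pick the candidate with the smallest absolute distance to the real bet.
    return min(candidates, key=lambda size: abs(size - bet))
```

With a pot of 100, a bet of 70 would be treated as a half-pot bet and a bet of 180 as a two-times-pot bet, so the tree only ever branches on three sizes here.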
MCCFR
uses Monte Carlo CFR
(The original CFR paper; background for MCCFR, not the particular variant that is used in Pluribus)
https://papers.nips.cc/paper/3306-regret-minimization-in-games-with-incomplete-information.pdf
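For context, the core per-decision-point step in CFR is regret matching: play each action in proportion to its positive accumulated regret. A minimal sketch, assuming regrets are kept as a plain list per decision point (the function name is just for illustration):

```python
def regret_matching(regrets):
    """Turn accumulated regrets into a strategy (a probability per action)."""
    # Only actions with positive regret get any probability mass.
    positive = [max(regret, 0.0) for regret in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to a uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)
```

MCCFR keeps this update and changes how the regrets are estimated, by sampling parts of the game tree instead of traversing all of it on every iteration.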
uses Linear CFR for the first iterations (400?) of self-play:
(Discounted Regret Minimization; Also by Original Authors)
https://arxiv.org/pdf/1809.04040.pdf
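As I understand the paper linked above, Linear CFR gives iteration t a weight of t, so early (noisy) iterations count for less. A small sketch of that weighting for a single action's regret, with made-up function names; the second function shows the equivalent "discount what you have so far" form, which gives the same result up to a constant factor:

```python
def linear_weighted_sum(instant_regrets):
    """Accumulate regrets, weighting iteration t (1-indexed) by t."""
    total = 0.0
    for t, regret in enumerate(instant_regrets, start=1):
        total += t * regret
    return total


def discounted_sum(instant_regrets):
    """Same idea as a running discount: scale the old total by (t - 1) / t."""
    total = 0.0
    for t, regret in enumerate(instant_regrets, start=1):
        total = total * (t - 1) / t + regret
    return total
```

After T iterations, `discounted_sum` equals `linear_weighted_sum` divided by T. Regret matching only looks at ratios between regrets, so both forms lead to the same strategy.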
during search, if the subgame is large or it is early in the game, then uses Monte Carlo Linear CFR
otherwise, switches to a vector-based form of Linear CFR that samples only chance events (such as board cards)
(Talks About Sampled Form of Regret-Based Pruning in Libratus)
https://www.cs.cmu.edu/~noamb/papers/17-IJCAI-Libratus.pdf
(Goes On to Explain for Extensive Games)
https://papers.nips.cc/paper/3713-monte-carlo-sampling-for-regret-minimization-in-extensive-games.pdf
more about the above algo:
the new search algo (above) lets the searcher also choose among the k=4 different continuation strategies (previous versions did not)
uses a new form of nested unsafe search: solves from the beginning of the betting round as opposed to the current decision point
additionally, the opponents may be assigned different strategies (from the k=4) than the ones they actually used earlier in that same betting round
however, the bot's own strategy stays fixed for the actions it has already taken (just for this betting round)
MISC
Some original author papers: https://www.cs.cmu.edu/~noamb/research.html
not really related :) https://arxiv.org/abs/2002.05820
This is awesome. Feel free to open a PR from a branch based on the develop branch and add this information to README.md or some other .md file.
Thanks!
Great, no problem: PR #19