zsunberg closed this issue 3 years ago
Thanks Zach! I'll check in with Kyle about this. We have two weeks until we submit the final manuscript to MIT Press, so we'll have to figure out whether we can get something polished within that time.
Sorry, I know I should have said something way earlier!
After discussing this a bit among the coauthors, we have decided to add the topic to our list for consideration for the next edition.
Hi guys,
First of all, I want to pass on that my students loved the book. Some of my teaching evaluations could be summarized as "the lectures were erratic, but the book was really helpful and I usually understood after reading it" :rofl:
Anyway, I wanted to file this issue because I think it would be really beneficial to add a section to the POMG chapter of the book. Currently, the algorithms that you list in this chapter are pretty limited. However, there has recently been quite impressive success playing no-limit poker (e.g. https://science.sciencemag.org/content/359/6374/418/tab-pdf). They use an approach called counterfactual regret minimization. I have to say I don't really understand it, and it seems quite different from the belief- or policy-graph-based approaches that we typically use. I think that, at the very least, it deserves a mention. A section explaining it would be a very good addition to the book, given that it seems to be much more powerful than the algorithms that are currently there.