beta coefficient in discounted CFR

bupticybee / TexasSolver

🚀 A very efficient Texas Holdem GTO solver :spades::hearts::clubs::diamonds:

https://bupticybee.github.io/texassolver_page

GNU Affero General Public License v3.0


beta coefficient in discounted CFR #166

Closed mvuthegoat closed 1 year ago

mvuthegoat commented 1 year ago

Should we change beta to beta_coef here? I read the paper on Discounted CFR (https://cdn.aaai.org/ojs/4007/4007-13-7066-1-10-20190704.pdf?_gl=1*le8l1f*_ga*Mzc2NzU4NDM0LjE2ODE3NjcxMDM.*_ga_CKNBPFEYPG*MTY4MTc2NzEwMi4xLjAuMTY4MTc2NzExMS4wLjAuMA..), and it says multiply negative regrets by the beta coefficient instead of just beta as shown in the code. I might be missing something, though.

bupticybee commented 1 year ago

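I reviewed the paper and the code. In the paper they refer to "α = 3/2, β = 0, and γ = 2" as an optimal parameter setting. In their formula, negative regrets are multiplied by t^β / (t^β + 1), where t is the iteration number.

With β = 0, t^β is 1 at every iteration, so beta_coef equals 1 / (1 + 1), which is always 1/2. So I replaced beta_coef with a constant 1/2 in my implementation.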
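
For reference, the arithmetic can be checked with a small sketch. This is an illustration only, not the TexasSolver code: the function names `dcfr_weights` and `discount_regret` are hypothetical, and the formulas assumed are the discount factors from the Discounted CFR paper (t^α / (t^α + 1) for positive regrets, t^β / (t^β + 1) for negative regrets, (t / (t + 1))^γ for the average strategy).

```python
def dcfr_weights(t, alpha=1.5, beta=0.0, gamma=2.0):
    """Return the three DCFR discount factors at iteration t (t >= 1).

    Hypothetical helper; defaults are the alpha = 3/2, beta = 0, gamma = 2
    setting quoted from the paper in this thread.
    """
    t_alpha = t ** alpha
    t_beta = t ** beta
    alpha_coef = t_alpha / (t_alpha + 1.0)   # applied to positive regrets
    beta_coef = t_beta / (t_beta + 1.0)      # applied to negative regrets
    gamma_coef = (t / (t + 1.0)) ** gamma    # applied to strategy contributions
    return alpha_coef, beta_coef, gamma_coef


def discount_regret(regret, alpha_coef, beta_coef):
    """Scale one accumulated regret by the factor matching its sign."""
    return regret * (alpha_coef if regret > 0 else beta_coef)


if __name__ == "__main__":
    # With beta = 0, t ** beta is 1 for every t, so beta_coef never changes.
    for t in (1, 10, 1000):
        _, beta_coef, _ = dcfr_weights(t)
        print(t, beta_coef)
```

With any other β the factor depends on the iteration number, so a hard-coded constant is only equivalent to the paper's formula for β = 0.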

bupticybee commented 1 year ago

I ran your code and compared it with the baseline, and its performance is worse than the baseline. I will delay the merge until the performance problem is solved.

PS: always create pull requests against the gui branch, where most changes happen.

mvuthegoat commented 1 year ago

Oh, I see, you use beta as 1/2 directly, which is indeed 1 / (1 + 1). You can keep the code as it is, though :)