aikupoker / deeper-stacker

DeeperStacker: DeepHoldem Evil Brother

training loss #8

Closed light3317 closed 6 years ago

light3317 commented 6 years ago

Hi guys, any idea how the penalty for training losses is calculated? For instance, on one particular hand it loses 200, and on another hand it loses 20000. Does the program take the actual win or loss amount into account and weight the hands proportionally, or does it treat every win or loss with equal weight regardless of the amount?

herrefirh commented 6 years ago

where do you see these numbers?

light3317 commented 6 years ago

Just examples for particular training hands.

herrefirh commented 6 years ago

oh makes sense. i read it in a funny way in the morning. i don't know the answer, i'm trying to figure it out.

i'm not even sure the game is actually played out hand by hand the way you describe, or whether it just uses the expected value (probability of winning * pot size)
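To make the two possibilities in the question concrete, here is a minimal Python sketch. It is not taken from the repository: the function names, the squared-error loss, and the numbers are all assumptions chosen for illustration. It compares an error measured in raw chips with the same error measured as a fraction of the pot.

```python
# Hypothetical illustration only; none of these names come from deeper-stacker.

def expected_value(p_win, pot):
    """Expected chips won at showdown: probability of winning * pot size."""
    return p_win * pot

def squared_error(predicted, target):
    """Plain squared error, used here as a stand-in loss."""
    return (predicted - target) ** 2

pots = [200, 20000]          # the two pot sizes from the question
p_true, p_pred = 0.5, 0.25   # assumed true and predicted win probabilities

# Error measured in chips: the big pot dominates the total loss.
raw_errors = [
    squared_error(expected_value(p_pred, pot), expected_value(p_true, pot))
    for pot in pots
]

# Error measured as a fraction of the pot: both hands weigh the same.
normalized_errors = [
    squared_error(expected_value(p_pred, pot) / pot,
                  expected_value(p_true, pot) / pot)
    for pot in pots
]

print(raw_errors)         # [2500.0, 25000000.0]
print(normalized_errors)  # [0.0625, 0.0625]
```

With chip-valued targets the 20000 pot contributes 10000 times more to the loss than the 200 pot; with pot-relative targets the two hands count equally. Which of the two the training code actually does is exactly the open question here.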

in terminal_equity.lua you can see two functions that build the matrices for the showdown equity
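For anyone unfamiliar with the idea, a showdown equity matrix can be sketched roughly as follows. This is a toy Python version under loud assumptions: it is not a port of terminal_equity.lua, and the three hands, the `strength` ranking, and the function names are made up for illustration. Entry [i][j] is +1 if hand i beats hand j, -1 if it loses, and 0 for ties or for matchups that cannot happen because the hands share a card.

```python
# Toy sketch; hands, strengths, and names are hypothetical, not from the repo.

def showdown_matrix(hands, strength):
    """Build a matrix of showdown results between every pair of hands."""
    n = len(hands)
    matrix = [[0.0] * n for _ in range(n)]
    for i, mine in enumerate(hands):
        for j, theirs in enumerate(hands):
            if set(mine) & set(theirs):
                continue  # shared card: this matchup is impossible, leave 0
            if strength[mine] > strength[theirs]:
                matrix[i][j] = 1.0
            elif strength[mine] < strength[theirs]:
                matrix[i][j] = -1.0
    return matrix

def hand_values(matrix, opponent_range):
    """Expected result of each hand against the opponent's range,
    in units of the pot (matrix-vector product)."""
    return [sum(m * p for m, p in zip(row, opponent_range)) for row in matrix]

hands = [("As", "Ah"), ("Ks", "Kh"), ("As", "Kd")]
strength = {hands[0]: 3, hands[1]: 2, hands[2]: 1}  # assumed ranking
opponent_range = [0.5, 0.25, 0.25]                   # assumed probabilities

matrix = showdown_matrix(hands, strength)
values = hand_values(matrix, opponent_range)
print(values)  # [0.25, -0.25, -0.25]
```

In this sketch the values come out as expectations over the opponent's range rather than results of individual played-out hands, which is the "expected value" reading mentioned above.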