coreylowman / hanabi-mcts

Discount reward in rollout based on how probable the rollout is #10

Open coreylowman opened 3 years ago

coreylowman commented 3 years ago

The reward from a rollout on a lower-probability determinization should be weighted less than the reward from a rollout on a higher-probability determinization.

This will also encourage using hints more to reduce uncertainty.

But is this already handled, given that a low-probability determinization will be sampled less often in the first place?
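
One way to check the question above is a toy Monte Carlo experiment. The sketch below is not taken from this repo: `estimate_value`, the three determinizations, and their probabilities and rewards are all made up for illustration, and it assumes determinizations are drawn in proportion to their probability. Under that assumption, a plain average of rollout rewards already converges to the probability-weighted expectation, while additionally multiplying each reward by its probability weights each determinization by roughly the square of its probability.

```python
import random


def estimate_value(probs, rewards, n_samples, discount, seed=0):
    """Average rollout reward over sampled determinizations (hypothetical helper).

    probs[i]   -- assumed probability of determinization i
    rewards[i] -- assumed (fixed) rollout reward under determinization i
    discount   -- if True, also multiply each reward by probs[i]
    """
    rng = random.Random(seed)
    indices = range(len(probs))
    total = 0.0
    for _ in range(n_samples):
        # Sample a determinization in proportion to its probability.
        i = rng.choices(indices, weights=probs, k=1)[0]
        weight = probs[i] if discount else 1.0
        total += weight * rewards[i]
    return total / n_samples


# Made-up numbers: the unlikely worlds happen to have high rewards.
probs = [0.7, 0.2, 0.1]
rewards = [1.0, 5.0, 10.0]

# Exact probability-weighted expectation: sum of p * r.
exact = sum(p * r for p, r in zip(probs, rewards))
# What sampling plus discounting converges to: sum of p^2 * r.
double_weighted = sum(p * p * r for p, r in zip(probs, rewards))

plain = estimate_value(probs, rewards, 100_000, discount=False)
discounted = estimate_value(probs, rewards, 100_000, discount=True)

print(f"exact expectation:    {exact:.3f}")
print(f"plain average:        {plain:.3f}")
print(f"sum of p^2 * r:       {double_weighted:.3f}")
print(f"discounted average:   {discounted:.3f}")
```

If the sampler here is uniform rather than proportional to probability, the conclusion flips: the discount (or an equivalent importance weight) would then be needed to recover the expectation.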