zapper-95 / Coup-RL

Models trained to play the card game Coup

Bug - Action mask allowed challenge as first action for an agent #3

Closed zapper-95 closed 8 months ago

zapper-95 commented 8 months ago

image

zapper-95 commented 8 months ago

I think the reason performance falls is the difference between None and "none".

"none" was being passed into the observe() function, and everything in there expects None. This meant that every move was legal except coup, assassinate and counteract. That leaves 6.

self.can_challenge() does not work either, because the dictionary check inside it tests for None, not "none". This allows player 1, with probability 1/6, to lose their first card.
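A minimal sketch of the mismatch, not the repo's actual code. The names `normalize_last_action` and the standalone `can_challenge` below are hypothetical stand-ins; the only assumption is that the real check compares against None, as described above.

```python
from typing import Optional


def normalize_last_action(action: Optional[str]) -> Optional[str]:
    """Map the string sentinel "none" onto the real None (hypothetical helper)."""
    if action is None or action == "none":
        return None
    return action


def can_challenge(last_action: Optional[str]) -> bool:
    """Hypothetical stand-in for self.can_challenge().

    Assumes a challenge is only legal when there is a previous action,
    and that "no previous action" is represented by None.
    """
    return last_action is not None


# The string "none" is not None, so the check passes and challenge stays unmasked.
buggy = can_challenge("none")

# Normalizing first restores the intended behaviour on the first turn.
fixed = can_challenge(normalize_last_action("none"))

print(buggy, fixed)
```

Normalizing once at the boundary (before observe() is called) is one option; the other is to use a single sentinel everywhere so the comparison never has to guess.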