datamllab / rlcard

Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
http://www.rlcard.org
MIT License
2.87k stars, 619 forks

`last_landlord_action` and `last_teammate_action` in `DoudizhuEnv` are wrong #292

Closed: kingyiusuen closed this issue 1 year ago

kingyiusuen commented 1 year ago

It seems that a `break` statement is missing inside the for-loops that compute `last_landlord_action` and `last_teammate_action`. Without it, each loop keeps overwriting the variable on every match, so the player's first action is retrieved instead of the intended last action.

https://github.com/datamllab/rlcard/blob/63ab6a42f6b7eba058d983e89d72c451a78284f6/rlcard/envs/doudizhu.py#L60-L63

https://github.com/datamllab/rlcard/blob/63ab6a42f6b7eba058d983e89d72c451a78284f6/rlcard/envs/doudizhu.py#L69-L72
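To illustrate the pattern, here is a minimal standalone sketch, not the actual rlcard code. It assumes the trace is a list of `(player_id, action)` tuples that is scanned from the most recent entry backwards; the function names `last_action_without_break` and `last_action_with_break` and the sample trace are hypothetical.

```python
def last_action_without_break(trace, player_id, default='pass'):
    """Scan the trace backwards but never stop: every match overwrites the result."""
    result = default
    for pid, action in reversed(trace):
        if pid == player_id:
            result = action  # overwritten again by each earlier match
    return result


def last_action_with_break(trace, player_id, default='pass'):
    """Scan the trace backwards and stop at the first match (the most recent action)."""
    result = default
    for pid, action in reversed(trace):
        if pid == player_id:
            result = action
            break  # the first match in reverse order is the latest action
    return result


# Hypothetical trace: player 0 plays '33', later plays '55'.
trace = [(0, '33'), (1, '44'), (2, 'pass'), (0, '55'), (1, 'pass')]

print(last_action_without_break(trace, 0))  # 33 (the earliest action)
print(last_action_with_break(trace, 0))     # 55 (the most recent action)
```

With the `break` in place, the loop returns as soon as it finds the player's most recent entry; if the player has not acted yet, the default value is kept.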

daochenzha commented 1 year ago

@kingyiusuen Thanks for pointing this out. I agree that this could be a bug. Do you want to send a pull request to fix it?