in the Nash-equilibrium mode the bot makes very questionable decisions. is this intentional?

timothychen04 commented 3 years ago

Here are some examples.

a. After Rillaboom gets Choice Band-locked into Drain Punch and the opponent switches to a Ghost type, it keeps using Drain Punch even though the opponent is immune.
b. If a Pokemon's only attack is a move the opposing type is immune to (e.g. Ground against Flying), it keeps using that move instead of switching or using a utility move. For example, a Hippowdon with Slack Off, Whirlwind, Stealth Rock, and Earthquake facing a Pelipper would spam Earthquake even though Pelipper is immune.
c. When I had my Rillaboom up against a Heatran, it sacked Melmetal and brought Rillaboom back in, resulting in both of them getting OHKO'd.
d. When my opponent had a Tapu Koko out, the bot brought out Slowbro and Dragonite before Rillaboom, which would have easily KO'd it with Grassy Glide.
e. The bot spams Future Sight on consecutive turns, as if it thinks it's an instant-damage Psychic move.
f. When a Pokemon has Regenerator and a switch-out move (e.g. U-turn) and is faster than the opposing Pokemon, the bot opts to hard switch instead of getting free chip damage with, for example, U-turn.
g. It tried to Toxic a mon that already had a status condition.
h. When out of damaging moves, it spams a status move instead of switching out, allowing the opponent to set up with Swords Dance, Quiver Dance, etc.
i. It doesn't recognize Magic Bounce and spams status moves.
j. It went for a Dragon move against a Fairy type when an Ice move was available.

If you got here, thanks for taking the time to read the whole list :)

pmariglia commented 3 years ago

Obviously it's not intentional - it's just a really bad decision-making method. It's labelled "experimental" because I know how bad it is. Ladder performance is much better with the safest battle-bot, even though that one sucks as well.

For completeness: What the bot does is calculate a payoff matrix for what it thinks the game state is for a search depth of 2 (no alpha-beta pruning). This gives an NxM matrix of payoffs where the rows represent the bot's choices and the columns represent what the bot thinks the opponent's choices are. Something like this:

Then a Nash-Equilibrium is calculated from that payoff matrix.
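
To make the idea concrete, here is a minimal sketch of solving a tiny payoff matrix for a mixed-strategy Nash equilibrium. It is not the bot's actual code: the function name `solve_2x2_zero_sum`, the 2x2 size, the payoff numbers, and the assumption that the game is zero-sum (one matrix, the opponent's payoff is the negative of the bot's) are all illustrative choices made here.

```python
from fractions import Fraction


def solve_2x2_zero_sum(payoffs):
    """Solve a 2x2 zero-sum game (hypothetical helper, not from the bot).

    payoffs[i][j] is the payoff to the row player when the row player
    picks option i and the column player picks option j.
    Returns (row_strategy, col_strategy, value).
    """
    (a, b), (c, d) = [[Fraction(x) for x in row] for row in payoffs]

    # Check for a pure-strategy equilibrium (saddle point) first.
    row_mins = [min(a, b), min(c, d)]    # worst case for each row choice
    col_maxes = [max(a, c), max(b, d)]   # worst case for each column choice
    maximin, minimax = max(row_mins), min(col_maxes)
    if maximin == minimax:
        i = row_mins.index(maximin)
        j = col_maxes.index(minimax)
        row = [Fraction(int(k == i)) for k in range(2)]
        col = [Fraction(int(k == j)) for k in range(2)]
        return row, col, maximin

    # No saddle point: each player mixes so the other is indifferent.
    denom = a - b - c + d
    p = (d - c) / denom                  # probability of row option 0
    q = (d - b) / denom                  # probability of column option 0
    value = (a * d - b * c) / denom
    return [p, 1 - p], [q, 1 - q], value


# Made-up payoffs: rows are the bot's options, columns the opponent's.
#                  opp stays in   opp switches to Ghost
# Drain Punch            3                 -1
# switch out            -2                  1
row_strategy, col_strategy, value = solve_2x2_zero_sum([[3, -1], [-2, 1]])
print("bot:", row_strategy, "opponent:", col_strategy, "value:", value)
```

With these made-up numbers the equilibrium is mixed: the row player picks the first option with probability 3/7 and the second with probability 4/7. A choice that looks bad in hindsight can therefore still be sampled some of the time, which is one way a method like this ends up non-deterministic.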

There are so many things wrong with this. A couple are:

This is not the right way to model the payoff matrix for a game with incomplete information like competitive Pokemon.
Only searching to a depth of 2 doesn't give the bot an understanding of the long-term strategies and win-conditions. This is a problem with using a search-based method as well: Pokemon is more complicated than just looking a few turns ahead.

... and probably other game theory concepts that I am not even aware of.

This was really just an experiment I did to get a battle-bot that used a non-deterministic decision method. It was never meant to be the very best like no one ever was.

timothychen04 commented 3 years ago

Thanks for the response. I am very new to this, so this helped a lot.

pmariglia / showdown

in the Nash-equilibrium mode the bot makes very questionable decisions. is this intentional? #65