pmariglia / showdown

A Pokemon Showdown Battle Bot written in Python
GNU General Public License v3.0
261 stars, 178 forks

Timer #61

Closed mancho1987 closed 3 years ago

mancho1987 commented 3 years ago

Hi, I would like the bot to keep track of the timer. I made many modifications to the state-evaluation algorithm, and my bot sometimes runs out of time, so I would like it to skip some steps when the timer is running low to avoid losing. Could you please guide me on how to do this?

pmariglia commented 3 years ago

Oh weird, I remember having this feature but I must've removed it for some reason.

Commit 9c4ebaa736a8c1f044a36279578578ea3dd1f9d0 re-adds this. You can see the latest timer value the server has sent with battle.time_remaining. A value of None means the timer is off.

Let me know if this solves your problem!
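A minimal sketch of how the timer value could drive the "skip some steps" behaviour asked about above. Only battle.time_remaining and its None-means-off convention come from this thread; the function name, the threshold, the depth values, and the assumption that the value is in seconds are all hypothetical.

```python
from typing import Optional


def choose_search_depth(
    time_remaining: Optional[int],
    full_depth: int = 2,           # hypothetical normal search depth
    low_time_threshold: int = 60,  # hypothetical cutoff, assumed to be in seconds
) -> int:
    """Pick a shallower search depth when the battle timer is running low."""
    if time_remaining is None:
        # Per the comment above, None means the timer is off.
        return full_depth
    if time_remaining < low_time_threshold:
        # Low on time: fall back to the cheapest search.
        return 1
    return full_depth


# Example with plain values standing in for battle.time_remaining.
print(choose_search_depth(None))
print(choose_search_depth(30))
print(choose_search_depth(120))
```

In the bot, the argument would be battle.time_remaining, read just before the search starts.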

mancho1987 commented 3 years ago

Thanks! Another thing: I would also like to weight each state according to the chance that the opponent has the move considered in that state. For example, if the state was the result of the opponent using flamethrower, and the chance of the opponent having that move is 100%, then the score for that state would not be modified. But if the chance of having that move is 25%, then I would like to modify the score by some amount, so that unlikely states are not given as much importance as likely or certain ones. Any pointers on how to do this would be greatly appreciated!

pmariglia commented 3 years ago

> I would like to weight each state also according to the chance that the opponent has the move considered in that state.

I've thought of doing something similar, but there's a bit more to it than you may realize.

Say 1 move is revealed, but there are 5 other possible moves the Pokemon may have. These moves have a 50%, 60%, 70%, 80%, and 90% chance of being present, respectively. What is the score modification ratio you would apply? Are you only generating states where 4 moves are assumed per Pokemon, or are you generating states where they can have any number of moves? If it is the former, then there is also information that the Smogon stats do not provide, like "how likely is this set of 4 moves?". It gets complicated rather quickly.

Anyway, if you know what you want to do, here's where you'd do it.

This is where the states are generated (in the safest battle-bot anyways). What I'd recommend is adding a state_multiplier attribute to each state object that is generated. Then, when evaluating the state, you can simply multiply by your multiplier.
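A rough sketch of the state_multiplier idea, using a hypothetical stand-in class rather than the bot's real state objects. Only the attribute name state_multiplier comes from the comment above; CandidateState, raw_score, and weighted_score are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class CandidateState:
    """Hypothetical stand-in for a generated battle state."""
    description: str
    raw_score: float               # score from the evaluation function
    state_multiplier: float = 1.0  # 1.0 leaves the score unchanged


def weighted_score(state: CandidateState) -> float:
    """Scale the evaluated score by the state's multiplier."""
    return state.raw_score * state.state_multiplier


states = [
    # Move assumed certain (100%): score is not modified.
    CandidateState("opponent has flamethrower", raw_score=-40.0, state_multiplier=1.0),
    # Move assumed to be present 25% of the time: score is scaled down.
    CandidateState("opponent has hidden power", raw_score=-80.0, state_multiplier=0.25),
]
scores = [weighted_score(s) for s in states]
print(scores)
```

Note that a plain multiplication pulls scores toward zero, so it only discounts unlikely states sensibly if zero is a neutral score in your evaluation. Otherwise a different blend may be needed.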

Hope that helps.

mancho1987 commented 3 years ago

Thanks, it did help! I am testing it now. So far I have noticed a slight Elo increase, but it's too soon to tell.

By the way, what do you think of this idea I just had: calculating the 2 best moves instead of just the best one, and then, among just those two moves, searching another turn or two ahead (using the 2 depth worst case scenario for each as their starting point) to decide which one to choose. So for example, if the two best moves are psychic and recover, and their respective worst case scenarios are the opponent playing surf + surf and surf + toxic, then it would search one or two more turns ahead only for those specific states. Do you see this helping at all? Would it make it much slower?

pmariglia commented 3 years ago

> so far have noticed a slight elo increase but it's too soon to tell.

Careful: with the random nature of competitive Pokemon, you'll need to track Elo over hundreds, maybe even thousands, of battles to properly determine how good a particular bot is. Even a new bot consistently beating the old bot doesn't necessarily mean it will be better on the ladder.
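To give a feel for why so many battles are needed, here is a small sketch using a normal-approximation confidence interval on win rate. This is a stand-in for Elo (which the ladder computes differently) and assumes battles are independent; the function name and the 55% example are illustrative only.

```python
import math


def win_rate_interval(wins: int, battles: int, z: float = 1.96):
    """Approximate 95% interval for the true win rate (normal approximation)."""
    p = wins / battles
    margin = z * math.sqrt(p * (1 - p) / battles)
    return p - margin, p + margin


# The same observed 55% win rate at two sample sizes.
small_low, small_high = win_rate_interval(55, 100)
large_low, large_high = win_rate_interval(1100, 2000)
print(small_low, small_high)
print(large_low, large_high)
```

With 100 battles the interval still includes 50%, so a 55% win rate could be noise. With 2000 battles the lower bound clears 50%.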

> By the way, what do you think of this idea I just had: Calculating for the 2 best moves instead of just the best one, and then among just those two moves, search another turn or two ahead (using the 2 depth worst case scenario for each as their starting point) to decide which one to choose. So for example, if the two best moves are psychic and recover, and their worst case scenarios for each respectively are if opponent plays surf + surf, surf + toxic, then it would search one or two more turns ahead only for those specific states. Do you see this helping in something? Would it make it much slower?

It would make it much, much slower, and it isn't exactly clear how the algorithm would work from your description alone. By using the 2 depth worst case scenario for each as their starting point, you'd get two pairs of moves: one for the turn you are currently on and one for... the state that those first two moves transpose the current state into?

But wait... what if there is randomness associated with the first pair of moves? Now what? Should you look at all transpositions and apply expectiminimax to that once again based on probability? That's slow. Assuming each move has 1 piece of randomness (2 outcomes), that's 4 total states from a pair of moves. And that is per turn; we are talking about an initial 2 turns here. So you'd end up needing to search an additional turn for probably an average of at least 10+ states (I'm just guessing here).
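The arithmetic can be sketched as follows, under the same simplifying assumption that every move has exactly 2 random outcomes and that one move from each side resolves per turn. This is an upper bound under that assumption, not a measurement of the bot, and the function name is made up for illustration.

```python
def resulting_states(outcomes_per_move: int, moves_per_turn: int, turns: int) -> int:
    """Count the distinct states reachable from one fixed line of play."""
    # Each move's random outcomes multiply within a turn...
    per_turn = outcomes_per_move ** moves_per_turn
    # ...and each resulting state branches again on the next turn.
    return per_turn ** turns


one_turn = resulting_states(outcomes_per_move=2, moves_per_turn=2, turns=1)
two_turns = resulting_states(outcomes_per_move=2, moves_per_turn=2, turns=2)
print(one_turn, two_turns)
```

Many real moves have no randomness at all, which is why the average is likely lower than this bound.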

Now, alpha-beta pruning would definitely help eliminate searching through some of these states, so it's probably not as bad as I'm making it out to be. I'm guessing this bot would end up taking ~30 seconds per move or so (at least for a machine like mine...).

By all means, I'd love for someone to try it and report back, but I don't see it improving the bot too much. I think the bot could search 10+ turns and still be crap. Minimax is the wrong way to play Pokemon; there's just too much randomness. Competitive Pokemon is about determining win-conditions, identifying how to put yourself in a position where you can sweep, and balancing tradeoffs of moves (usually due to potential hax).

mancho1987 commented 3 years ago

I understand that. So far the bot has been playing non-stop for 5 days, and it's constantly in the 1300-1400 range, with the occasional dip below that, which is an improvement over my previous bot. Yes, if I wanted to make it beat another bot more easily, then I would just make it set up more often 🤣 I have not been able to make this bot less of a set-up fodder.

I have tried using depth 3, though not from the start, as it loses to the timer. I do it once there are 6 mons alive, so there are far fewer possible states. Honestly, I have not seen a significant improvement from this, and I thought it was (apart from randomness) because there are items, abilities and other stuff that the bot doesn't understand. I have been adding some of them, and if you could provide a list of things you know are missing, it would be a great help.
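The "deeper search only late in the game" rule described above might look something like this. It is a sketch only: the function and parameter names are hypothetical, and whether "6 mons alive" counts both sides is an assumption here.

```python
def depth_for_position(
    alive_mons: int,       # assumed: total Pokemon still alive across both sides
    threshold: int = 6,    # switch to the deeper search at or below this count
    base_depth: int = 2,
    deep_depth: int = 3,
) -> int:
    """Search deeper only once few enough Pokemon remain to keep it affordable."""
    return deep_depth if alive_mons <= threshold else base_depth


print(depth_for_position(12))
print(depth_for_position(6))
```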

Regarding your last point, I have tried to add some logic to help it do exactly that (identify sweepers and getting into a position where it can sweep). Still need to work on it though, it's far from good 😅

pmariglia commented 3 years ago

Yeah the difference between seeing 2 turns ahead versus 3 is likely negligible when it comes to using expectiminimax in Pokemon.

If you have changes for better understanding items/abilities/etc. then I'd love for those changes to be contributed back to the main branch.

> I have tried to add some logic to help it do exactly that (identify sweepers and getting into a position where it can sweep). Still need to work on it though, it's far from good 😅

I've tried doing this - it's not easy 😢

If you'd like to talk more directly, send me a message on Discord: pmariglia#5568. I'm going to close this issue as the timer has been re-added to the bot.