hsahovic / poke-env

A python interface for training Reinforcement Learning bots to battle on pokemon showdown
https://poke-env.readthedocs.io/
MIT License
297 stars 103 forks

Agent does not choose move #601

Closed alexrsanderson closed 1 month ago

alexrsanderson commented 2 months ago

Hi,

With this code my agents connect to my local server, but they never choose a move and stay idle.

```python
class EliteTrainer(EnvPlayer):
    _ACTION_SPACE = list(range(22))

    def calc_reward(self, last_battle: Battle, current_battle: Battle) -> float:
        return self.reward_computing_helper(
            current_battle, fainted_value=0.1, hp_value=0, victory_value=1
        )

    def action_to_move(self, action: int, battle: Battle) -> BattleOrder:
        return self.create_order

    def embed_battle(self, battle: Battle) -> ObsType:
        # -1 indicates that the move does not have a base power
        # or is not available
        moves_base_power = -np.ones(4)
        moves_dmg_multiplier = np.ones(4)
        for i, move in enumerate(battle.available_moves):
            moves_base_power[i] = (
                move.base_power / 100
            )  # Simple rescaling to facilitate learning
            if move.type:
                moves_dmg_multiplier[i] = move.type.damage_multiplier(
                    battle.opponent_active_pokemon.type_1,
                    battle.opponent_active_pokemon.type_2,
                    GenData(8).load_type_chart
                )

        # We count how many pokemons have fainted in each team
        fainted_mon_team = len([mon for mon in battle.team.values() if mon.fainted]) / 6
        fainted_mon_opponent = (
            len([mon for mon in battle.opponent_team.values() if mon.fainted]) / 6
        )

        # Final vector with 10 components
        final_vector = np.concatenate(
            [
                moves_base_power,
                moves_dmg_multiplier,
                [fainted_mon_team, fainted_mon_opponent],
            ]
        )
        return np.float32(final_vector)

    def describe_embedding(self) -> Space:
        low = [-1, -1, -1, -1, 0, 0, 0, 0, 0, 0]
        high = [3, 3, 3, 3, 4, 4, 4, 4, 1, 1]
        return Box(
            np.array(low, dtype=np.float32),
            np.array(high, dtype=np.float32),
            dtype=np.float32,
        )


async def main():
    opponent = RandomPlayer(battle_format='gen8randombattle')
    test_env = EliteTrainer(
        battle_format="gen8randombattle",
        server_configuration=LocalhostServerConfiguration,
        start_challenging=True,
        opponent=opponent
    )

    for ep in range(10):
        state, info = test_env.reset()
        done = False
        return_ = 0.0
        timesteps = 0
        while not done:
            state, reward, terminated, truncated, info = test_env.step(
                test_env.action_space.sample()
            )
            test_env.render()
            return_ += reward
            done = terminated or truncated
            timesteps += 1
        test_env.close()
        print(f"Episode {ep}:: Timesteps: {timesteps}, Total Return: {return_ : .2f}")


if __name__ == "__main__":
    asyncio.run(main())
```

This is the error that I receive:

```
Traceback (most recent call last):
  File "/Users/punkboy/miniconda3/envs/pokeml/lib/python3.11/site-packages/poke_env/ps_client/ps_client.py", line 135, in _handle_message
    await self._handle_battle_message(split_messages)  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/punkboy/miniconda3/envs/pokeml/lib/python3.11/site-packages/poke_env/player/player.py", line 351, in _handle_battle_message
    await self._handle_battle_request(battle)
  File "/Users/punkboy/miniconda3/envs/pokeml/lib/python3.11/site-packages/poke_env/player/player.py", line 378, in _handle_battle_request
    message = message.message
              ^^^^^^^^^^^^^^^
AttributeError: 'function' object has no attribute 'message'
Task exception was never retrieved
future: <Task finished name='Task-45' coro=<PSClient._handle_message() done, defined at /Users/punkboy/miniconda3/envs/pokeml/lib/python3.11/site-packages/poke_env/ps_client/ps_client.py:121> exception=AttributeError("'function' object has no attribute 'message'")>
Traceback (most recent call last):
  File "/Users/punkboy/miniconda3/envs/pokeml/lib/python3.11/site-packages/poke_env/ps_client/ps_client.py", line 190, in _handle_message
    raise exception
  File "/Users/punkboy/miniconda3/envs/pokeml/lib/python3.11/site-packages/poke_env/ps_client/ps_client.py", line 135, in _handle_message
    await self._handle_battle_message(split_messages)  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/punkboy/miniconda3/envs/pokeml/lib/python3.11/site-packages/poke_env/player/player.py", line 351, in _handle_battle_message
    await self._handle_battle_request(battle)
  File "/Users/punkboy/miniconda3/envs/pokeml/lib/python3.11/site-packages/poke_env/player/player.py", line 378, in _handle_battle_request
    message = message.message
AttributeError: 'function' object has no attribute 'message'
```

I see that some people in the issues subclass Player or OpenAIGymEnv rather than the Gymnasium-compatible classes. I want to do RL with PyTorch, so I would prefer to subclass EnvPlayer, but in none of my attempts at building my own agent have I managed to get it to choose a move.

caymansimpson commented 1 month ago

I think your problem is in `action_to_move`, where you're returning the function rather than calling it. It should be `self.create_order()`.

This would explain the error message you're getting
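To make the difference concrete, here is a minimal sketch that runs without poke-env. `Order` and `Agent` are hypothetical stand-ins, not the library's classes, and the `"/choose default"` string is just a placeholder. Returning `self.create_order` hands the caller a callable, so any later access to `.message` on it fails in the same way as in the traceback.

```python
class Order:
    """Hypothetical stand-in for a battle order; not poke-env's class."""

    def __init__(self, message: str) -> None:
        self.message = message


class Agent:
    """Hypothetical agent showing the buggy and the fixed return value."""

    def create_order(self) -> Order:
        return Order("/choose default")

    def action_to_move_buggy(self):
        # Bug: returns the method object itself, not an Order.
        return self.create_order

    def action_to_move_fixed(self) -> Order:
        # Fix: call the method so an Order instance is returned.
        return self.create_order()


agent = Agent()

fixed = agent.action_to_move_fixed()
print(fixed.message)

buggy = agent.action_to_move_buggy()
try:
    buggy.message  # the caller expects an Order with a .message attribute
except AttributeError as err:
    print(f"AttributeError: {err}")
```

The code that consumes the order only discovers the problem when it reads `.message`, which is why the traceback points into `player.py` rather than at `action_to_move`.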

alexrsanderson commented 1 month ago

> I think your problem is in action_to_move where you're returning the function, and not calling it. Eg self.create_order()
>
> This would explain the error message you're getting

Thanks for the reply. I tried calling the method and then got an error about my type chart, saying that GenData is already initialized. I saw that it is already initialized within the Pokemon class, but I can't tell whether the second initialization happens in my code or is a bug in poke-env. I got around it by simplifying the embed_battle and describe_embedding functions, and the agents now choose a move.
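One plausible reading of the "already initialized" error is that the data object is meant to exist once per generation, so constructing it a second time with `GenData(8)` is rejected. The sketch below illustrates that pattern with a hypothetical `GenRegistry` class and a placeholder type chart. It is an assumption about the mechanism, not poke-env's actual code. poke-env appears to expose a cached accessor along the lines of `GenData.from_gen(8)` with a `type_chart` attribute, but that is worth checking against the library's documentation. Separately, `load_type_chart` in the original snippet looks like a method reference rather than the chart data itself.

```python
from typing import Dict


class GenRegistry:
    """Hypothetical one-instance-per-generation holder (not poke-env code)."""

    _instances: Dict[int, "GenRegistry"] = {}

    def __init__(self, gen: int) -> None:
        # Direct construction is refused once a generation is registered.
        if gen in GenRegistry._instances:
            raise ValueError(f"GenRegistry for gen {gen} already initialized.")
        self.gen = gen
        # Placeholder data; a real chart would be loaded from disk.
        self.type_chart = {"FIRE": {"GRASS": 2.0}}

    @classmethod
    def from_gen(cls, gen: int) -> "GenRegistry":
        # Create on first use, then always hand back the cached instance.
        if gen not in cls._instances:
            cls._instances[gen] = cls(gen)
        return cls._instances[gen]


first = GenRegistry.from_gen(8)
second = GenRegistry.from_gen(8)
print(first is second)

try:
    GenRegistry(8)  # second direct construction for the same generation
    raised = False
except ValueError as err:
    raised = True
    print(err)
```

Under this assumption, calling the constructor inside `embed_battle` would fail on every step after something else has registered the generation, whereas going through the cached accessor would not.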