crowdAI / marLo

Multi Agent Reinforcement Learning using MalmÖ
MIT License

Repeatedly train agent #75

Open martinv opened 5 years ago

martinv commented 5 years ago

I would like to train the agent over multiple episodes and force it back to its initial position at the beginning of each episode. However, I am not able to reset the environment. The following 'pseudocode'

```python
import marlo

num_episodes = 10
episode_len = 30

# join_token is obtained beforehand (e.g. from marlo.make for the chosen mission)
env = marlo.init(join_token)

for ep in range(num_episodes):
    print("Running episode {}".format(ep))

    observation = env.reset()
    done = False

    t_iter = 0
    while (not done) and (t_iter <= episode_len):
        print("  t iter = {}".format(t_iter))
        _action = env.action_space.sample()
        obs, reward, done, info = env.step(_action)

        if done:
            break

        t_iter += 1
```

seems to execute once and then hangs with the error message:

    Running episode 1
    WARNING:marlo.base_env_builder:Error on attempting to start mission : A mission is already running.
    WARNING:marlo.base_env_builder:Will attempt again after 3 seconds.

This is repeated multiple times until the whole simulation is killed.

How do I force the agent to make a 'clean start' at the beginning of each training episode?

AndKram commented 5 years ago

Which mission are you running? It's certainly safer to loop until done and not attempt a reset while the mission is in progress. A quit message is sent on reset, but possibly the mission is not quitting. Does it work with a very large `episode_len`?
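To illustrate the "loop until done" suggestion, here is a minimal sketch that never calls `reset()` while a mission is still in progress. It only caps the number of steps used for training at `episode_len` and keeps stepping until the environment reports `done`. `StubEnv`, `StubActionSpace`, and `run_episodes` are hypothetical names made up for this example; `StubEnv` is a stand-in that mimics the "A mission is already running" failure so the loop can be run without Minecraft. Whether this resolves the behaviour seen with a real MarLo mission is not confirmed in this thread.

```python
import random


class StubActionSpace:
    """Hypothetical stand-in for a Gym-style action space."""

    def sample(self):
        return random.randrange(4)


class StubEnv:
    """Hypothetical stand-in for a MarLo env (not part of marlo).

    reset() fails if the previous mission has not finished, which
    imitates the warning reported above.
    """

    def __init__(self, mission_len=5):
        self.mission_len = mission_len
        self.action_space = StubActionSpace()
        self.running = False
        self.t = 0

    def reset(self):
        if self.running:
            raise RuntimeError("A mission is already running.")
        self.running = True
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        done = self.t >= self.mission_len
        if done:
            self.running = False  # mission ended, reset is safe again
        return self.t, 0.0, done, {}


def run_episodes(env, num_episodes, episode_len):
    """Run episodes to completion; train on at most episode_len steps each.

    Returns a list of (total_steps, training_steps) per episode.
    """
    stats = []
    for ep in range(num_episodes):
        env.reset()
        done = False
        total_steps = 0
        training_steps = 0
        while not done:  # loop until done, never reset mid-mission
            action = env.action_space.sample()
            obs, reward, done, info = env.step(action)
            if total_steps < episode_len:
                training_steps += 1  # the agent update would go here
            total_steps += 1
        stats.append((total_steps, training_steps))
    return stats


if __name__ == "__main__":
    print(run_episodes(StubEnv(mission_len=5), num_episodes=3, episode_len=2))
```

With a real environment, the same loop shape would apply to the `env` returned by `marlo.init(join_token)`, provided the mission itself has an end condition (for example a time limit) so that `done` eventually becomes true.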