Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.

Calling env.close() once prevents instantiation of new environments #1167

Closed. renatolfc closed this issue 5 years ago

renatolfc commented 6 years ago

Although this is a problem that happens with any environment, let's consider the Banana Collector environment, part of the Udacity Deep Reinforcement Learning Nanodegree.

Assume we instantiate such an environment with

from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Banana_Linux/Banana.x86_64")

Now, after we're done with an experiment, we might want to close the environment to free up some resources. So, we call env.close().

At this point, we should be able to instantiate a new environment. Instead, we get an exception, probably because the gRPC server (?) went down.

Minimum Working Example:

from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Banana_Linux/Banana.x86_64", worker_id=1)
env.close()

env = UnityEnvironment(file_name="Banana_Linux/Banana.x86_64", worker_id=2)

$ python issue.py
Found path: /home/renatoc/repos/deep-reinforcement-learning-navigation/Banana_Linux/Banana.x86_64
Mono path[0] = '/home/renatoc/repos/deep-reinforcement-learning-navigation/Banana_Linux/Banana_Data/Managed'
Mono config path = '/home/renatoc/repos/deep-reinforcement-learning-navigation/Banana_Linux/Banana_Data/MonoBleedingEdge/etc'
Preloaded 'ScreenSelector.so'
Preloaded 'libgrpc_csharp_ext.x64.so'
Unable to preload the following plugins:
Logging to /home/renatoc/.config/unity3d/Unity Technologies/Unity Environment/Player.log
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :

Unity brain name: BananaBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 37
        Number of stacked Vector Observation: 1
        Vector Action space type: discrete
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , ,
Traceback (most recent call last):
  File "issue.py", line 6, in <module>
    env = UnityEnvironment(file_name="Banana_Linux/Banana.x86_64", worker_id=2)
  File "/home/renatoc/venvs/drlnd/lib/python3.7/site-packages/unityagents/environment.py", line 64, in __init__
    aca_params = self.send_academy_parameters(rl_init_parameters_in)
  File "/home/renatoc/venvs/drlnd/lib/python3.7/site-packages/unityagents/environment.py", line 505, in send_academy_parameters
    return self.communicator.initialize(inputs).rl_initialization_output
  File "/home/renatoc/venvs/drlnd/lib/python3.7/site-packages/unityagents/rpc_communicator.py", line 58, in initialize
    if not self.unity_to_external.parent_conn.poll(30):
  File "/usr/lib64/python3.7/multiprocessing/connection.py", line 255, in poll
  File "/usr/lib64/python3.7/multiprocessing/connection.py", line 136, in _check_closed
    raise OSError("handle is closed")
OSError: handle is closed
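
The final frames of this traceback can be reproduced without Unity. The sketch below is a hypothesis, not something confirmed in this thread: it assumes the pipe handles are shared between communicator instances, so closing one environment leaves the next one with a closed handle. `SharedPipeServicer` and `second_instance_error` are hypothetical names for illustration and are not part of unityagents.

```python
from multiprocessing import Pipe


class SharedPipeServicer:
    """Hypothetical servicer; NOT the unityagents implementation."""

    # Assumption for illustration: the pipe is created once, when the class
    # is defined, so every instance shares the same two handles.
    parent_conn, child_conn = Pipe()

    def close(self):
        # Closing through one instance closes the handles for all instances.
        self.parent_conn.close()
        self.child_conn.close()


def second_instance_error():
    """Close one instance, then poll through a fresh one."""
    first = SharedPipeServicer()
    first.close()
    second = SharedPipeServicer()
    try:
        second.parent_conn.poll(0)
    except OSError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(second_instance_error())  # handle is closed
```

If the real cause is along these lines, picking a different worker_id would not help, because the second instance never receives a fresh pipe.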

ivallesp commented 5 years ago

Same happens to me

vincentpierre commented 5 years ago

Does this bug still exist in v0.5?

ivallesp commented 5 years ago

I have the latest version and it is still happening...

vincentpierre commented 5 years ago

Could you post minimal code to reproduce this bug? unityagents has been replaced by mlagents.envs, so the code and error above come from an earlier version of ml-agents.

cipher813 commented 5 years ago

Hi Vincent, I am having the same issue: I cannot loop over multiple environments. I am importing from unityagents because the environment needs to be compatible with ml-agents v0.4. My objective is to loop through different environments (such as Unity and Gym, or a varying number of agents) and policies (e.g. DDPG, D4PG).

Minimal code to reproduce bug is as follows:

import time
import numpy as np
from unityagents import UnityEnvironment

from agents.DDPG import DDPG

from util import *

PATH = "/Volumes/BC_Clutch/Dropbox/Programming/Classes/Udacity/DeepRLND/rl_continuous_control/"

env_dict = {
    "Reacher1": "Reacher1.app",
    "Reacher20": "Reacher20.app",
}

result_dict = {}
for ke, ve in env_dict.items():
    start = time.time()
    total_scores = []
    env_name = ke
    print(f"Environment: {env_name}")
    fp = PATH + f"data/{ve}"
    env = UnityEnvironment(file_name=fp)
    brain_name = env.brain_names[0]
    brain = env.brains[brain_name]
    env_info = env.reset(train_mode=True)[brain_name]
    num_agents = len(env_info.agents)
    action_size = brain.vector_action_space_size
    state_size = env_info.vector_observations.shape[1]
    agent = DDPG(state_size, action_size, num_agents)
    for i in range(1, 10000):
        env_info = env.reset(train_mode=True)[brain_name]
        states = env_info.vector_observations
        scores = np.zeros(num_agents)
        agent.reset()
        for t in range(1000):
            actions = agent.act(states)
            env_info = env.step(actions)[brain_name]
            next_states = env_info.vector_observations
            rewards = env_info.rewards
            dones = env_info.local_done
            agent.step(states, actions, rewards, next_states, dones, t)
            states = next_states
            scores += env_info.rewards
            if np.any(dones):
                break
        length = min(100, len(scores))
        mean_score = np.mean(scores)
        total_scores.append(mean_score)
        total_mean_score = np.mean(total_scores[-length:])
        print(f"\rEpisode: {i}\tScore: {mean_score:.2f}")
        if total_mean_score > 0.05:
            print(f"Solved in {i} episodes.")
            break
    end = time.time()
    result_dict[env_name] = {
        "Scores": total_scores,
        "Runtime": calc_runtime(end - start),
    }
    env.close()
result_dict

Error message is as follows:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)

in
     21     print(f"Environment: {env_name}")
     22     fp = PATH + f"data/{ve}"
---> 23     env = UnityEnvironment(file_name=fp)
     24     brain_name = env.brain_names[0]
     25     brain = env.brains[brain_name]

~/anaconda3/envs/drlnd2/lib/python3.6/site-packages/unityagents/environment.py in __init__(self, file_name, worker_id, base_port, curriculum, seed, docker_training, no_graphics)
     62         )
     63         try:
---> 64             aca_params = self.send_academy_parameters(rl_init_parameters_in)
     65         except UnityTimeOutException:
     66             self._close()

~/anaconda3/envs/drlnd2/lib/python3.6/site-packages/unityagents/environment.py in send_academy_parameters(self, init_parameters)
    503         inputs = UnityInput()
    504         inputs.rl_initialization_input.CopyFrom(init_parameters)
--> 505         return self.communicator.initialize(inputs).rl_initialization_output
    506
    507     def wrap_unity_input(self, rl_input: UnityRLInput) -> UnityOutput:

~/anaconda3/envs/drlnd2/lib/python3.6/site-packages/unityagents/rpc_communicator.py in initialize(self, inputs)
     56                 "You may need to manually close a previously opened environment "
     57                 "or use a different worker number.".format(str(self.worker_id)))
---> 58         if not self.unity_to_external.parent_conn.poll(30):
     59             raise UnityTimeOutException(
     60                 "The Unity environment took too long to respond. Make sure that :\n"

~/anaconda3/envs/drlnd2/lib/python3.6/multiprocessing/connection.py in poll(self, timeout)
    253     def poll(self, timeout=0.0):
    254         """Whether there is any input available to be read"""
--> 255         self._check_closed()
    256         self._check_readable()
    257         return self._poll(timeout)

~/anaconda3/envs/drlnd2/lib/python3.6/multiprocessing/connection.py in _check_closed(self)
    134     def _check_closed(self):
    135         if self._handle is None:
--> 136             raise OSError("handle is closed")
    137
    138     def _check_readable(self):

OSError: handle is closed

ERROR:root:Exception calling application: [Errno 32] Broken pipe
Traceback (most recent call last):
  File "/Users/brianmcmahon/anaconda3/envs/drlnd2/lib/python3.6/site-packages/grpc/_server.py", line 385, in _call_behavior
    return behavior(argument, context), True
  File "/Users/brianmcmahon/anaconda3/envs/drlnd2/lib/python3.6/site-packages/unityagents/rpc_communicator.py", line 25, in Exchange
    self.child_conn.send(request)
  File "/Users/brianmcmahon/anaconda3/envs/drlnd2/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/Users/brianmcmahon/anaconda3/envs/drlnd2/lib/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header + buf)
  File "/Users/brianmcmahon/anaconda3/envs/drlnd2/lib/python3.6/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

If I remove env.close() from the code above, I get the following error message when looping to the second environment:

ERROR:root:Exception calling application: Ran out of input
Traceback (most recent call last):
  File "/Users/brianmcmahon/anaconda3/envs/drlnd2/lib/python3.6/site-packages/grpc/_server.py", line 385, in _call_behavior
    return behavior(argument, context), True
  File "/Users/brianmcmahon/anaconda3/envs/drlnd2/lib/python3.6/site-packages/unityagents/rpc_communicator.py", line 26, in Exchange
    return self.child_conn.recv()
  File "/Users/brianmcmahon/anaconda3/envs/drlnd2/lib/python3.6/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
EOFError: Ran out of input

Appreciate your guidance!

aadimator commented 5 years ago

getting the same error. Any workaround?

vincentpierre commented 5 years ago

There is an OS-specific limitation on Linux: you cannot reuse a port right after you close it. The delay before it becomes available again is something like 60 seconds. If you want to open multiple environments one after the other on Linux, consider changing ports every time and reusing a port only after a certain delay. There might also be bugs related to closing environments, independent of this Linux limitation, in previous versions of ML-Agents. Since ML-Agents is still in Beta, we only support the most recent version. If you found a bug, please use the bug report template, which must include steps to reproduce the bug. Thank you!
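
The suggestion to change ports every time and reuse one only after a delay could be sketched as a small id pool. This is a minimal illustration, not part of ML-Agents: `WorkerIdPool`, its methods, and the 60-second default are hypothetical, and it assumes that a distinct worker_id selects a distinct port.

```python
import time


class WorkerIdPool:
    """Hypothetical helper: hands out worker ids and rests each one after release."""

    def __init__(self, size, cooldown_seconds=60.0, clock=time.monotonic):
        self._cooldown = cooldown_seconds
        self._clock = clock  # injectable so the cooldown can be tested
        # worker id -> earliest time at which it may be handed out again
        self._ready_at = {worker_id: float("-inf") for worker_id in range(size)}
        self._in_use = set()

    def acquire(self):
        """Return an unused, rested worker id, or raise if none is available."""
        now = self._clock()
        for worker_id, ready_at in self._ready_at.items():
            if worker_id not in self._in_use and ready_at <= now:
                self._in_use.add(worker_id)
                return worker_id
        raise RuntimeError("no worker id available; wait or enlarge the pool")

    def release(self, worker_id):
        """Mark an id as closed; it becomes reusable once the cooldown elapses."""
        self._in_use.discard(worker_id)
        self._ready_at[worker_id] = self._clock() + self._cooldown
```

In use, one would call `worker_id = pool.acquire()` before `UnityEnvironment(file_name=..., worker_id=worker_id)` and `pool.release(worker_id)` right after `env.close()`. This only addresses the port-reuse delay described above; it would not help if the failure comes from a separate bug in closing environments.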

AntonioSerrano commented 4 years ago

@vincentpierre, I tried to change the port on my second call to the environment like this:

env = UnityEnvironment(file_name="./Banana.app", worker_id=2)

But it did not work. Any idea? I'm using macOS, not Linux, by the way. Thx

vincentpierre commented 4 years ago

Can you open a new issue using the new template? If this is a bug, we will need to be able to reproduce the bug to help you.

elexira commented 3 years ago

Can somebody address this issue? Why was this closed?

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.