renatolfc closed this issue 5 years ago.
The same happens to me.
Does this bug still exist in v0.5?
I have the latest version and it is still happening...
Could you post minimal code to reproduce this bug? `unityagents` has been replaced by `mlagents.envs`, so the previous code and error were from an earlier version of ML-Agents.
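For anyone updating an older script, the change is, as far as I can tell, just the module path:

```python
# ML-Agents v0.4 and earlier:
from unityagents import UnityEnvironment

# ML-Agents v0.5 and later:
from mlagents.envs import UnityEnvironment
```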
Hi Vincent, I am having the same issue: I cannot loop over multiple environments. I am importing from `unityagents` because the environment needs to be compatible with ML-Agents v0.4. My objective is to loop through different environments (such as Unity and Gym, or a varying number of agents) and policies (e.g., DDPG, D4PG).
Minimal code to reproduce the bug is as follows:
```python
import time

import numpy as np
from unityagents import UnityEnvironment

from agents.DDPG import DDPG
from util import *

PATH = "/Volumes/BC_Clutch/Dropbox/Programming/Classes/Udacity/DeepRLND/rl_continuous_control/"

env_dict = {
    "Reacher1": "Reacher1.app",
    "Reacher20": "Reacher20.app"
}

result_dict = {}
for ke, ve in env_dict.items():
    start = time.time()
    total_scores = []
    env_name = ke
    print(f"Environment: {env_name}")
    fp = PATH + f"data/{ve}"
    env = UnityEnvironment(file_name=fp)
    brain_name = env.brain_names[0]
    brain = env.brains[brain_name]
    env_info = env.reset(train_mode=True)[brain_name]
    num_agents = len(env_info.agents)
    action_size = brain.vector_action_space_size
    state_size = env_info.vector_observations.shape[1]
    agent = DDPG(state_size, action_size, num_agents)
    for i in range(1, 10000):
        env_info = env.reset(train_mode=True)[brain_name]
        states = env_info.vector_observations
        scores = np.zeros(num_agents)
        agent.reset()
        for t in range(1000):
            actions = agent.act(states)
            env_info = env.step(actions)[brain_name]
            next_states = env_info.vector_observations
            rewards = env_info.rewards
            dones = env_info.local_done
            agent.step(states, actions, rewards, next_states, dones, t)
            states = next_states
            scores += env_info.rewards
            if np.any(dones):
                break
        length = min(100, len(scores))
        mean_score = np.mean(scores)
        total_scores.append(mean_score)
        total_mean_score = np.mean(total_scores[-length:])
        print(f"\rEpisode: {i}\tScore: {mean_score:.2f}")
        if total_mean_score > 0.05:
            print(f"Solved in {i} episodes.")
            break
    end = time.time()
    result_dict[env_name] = {
        "Scores": total_scores,
        "Runtime": calc_runtime(end - start)
    }
    env.close()

result_dict
```
The error message is as follows:
```
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
```
Getting the same error. Any workaround?
There is an OS-specific limitation on Linux: you cannot reuse a port right after you close it. The delay before it becomes available again is on the order of 60 seconds. If you want to open multiple environments one after the other on Linux, consider changing ports every time, and reuse a port only after a certain delay.

There might also be bugs related to closing environments, independent of this Linux limitation, in previous versions of ML-Agents. Since ML-Agents is still in beta, we only support the most recent version. If you have found a bug, please use the bug report template, which must include steps to reproduce the bug. Thank you!
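A minimal sketch of that port-rotation workaround, assuming the v0.4-era `unityagents` API in which `worker_id` offsets the base communication port (the file names are placeholders):

```python
from unityagents import UnityEnvironment

env_files = ["Reacher1.app", "Reacher20.app"]

for worker_id, file_name in enumerate(env_files):
    # A distinct worker_id makes each instance bind a different port, so we
    # never try to reuse a port the OS has not released yet.
    env = UnityEnvironment(file_name=file_name, worker_id=worker_id)
    # ... run the experiment ...
    env.close()
```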
@vincentpierre, I tried to change the port on my second call to the environment, like this:

```python
env = UnityEnvironment(file_name="./Banana.app", worker_id=2)
```

But it did not work. Any idea? I'm using macOS, not Linux, by the way. Thanks!
Can you open a new issue using the new template? If this is a bug, we will need to be able to reproduce it in order to help you.
Can somebody address this issue? Why was this closed?
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Although this problem happens with any environment, let's consider the Banana Collector environment, part of the Udacity Deep Reinforcement Learning Nanodegree.
Assume we instantiate such an environment with the following call.
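(Presumably something along these lines; the file name is assumed from the Banana build mentioned earlier in this thread.)

```python
from unityagents import UnityEnvironment

# File name assumed; any compiled Unity environment reproduces the issue.
env = UnityEnvironment(file_name="Banana.app")
```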
Now, after we're done with an experiment, we might want to close the environment to free up resources, so we call `env.close()`. At this point, we should be able to instantiate a new environment. Instead, we get an exception, probably because the gRPC server (?) went down.
Minimum Working Example:
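A sketch of what such an example presumably looks like, based on the description above (the file name is an assumption; any build should do):

```python
from unityagents import UnityEnvironment

# Open the environment and immediately close it.
env = UnityEnvironment(file_name="Banana.app")
env.close()

# Instantiating a second environment on the same worker/port raises an
# OSError, apparently because the socket from the first instance has not
# been released yet.
env = UnityEnvironment(file_name="Banana.app")
```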