MIT-TESSE / goseek-challenge

Instructions for competing in the GOSEEK challenge at ICRA 2020

Broken Pipe Error #9

Closed by achuwilson 4 years ago

achuwilson commented 4 years ago

Calling env.step(action) works most of the time, but once in a while it raises the following error:

~/miniconda3/envs/goseek/lib/python3.7/site-packages/stable_baselines/common/vec_env/base_vec_env.py in step(self, actions)
    147         :return: ([int] or [float], [float], [bool], dict) observation, reward, done, information
    148         """
--> 149         self.step_async(actions)
    150         return self.step_wait()
    151 

~/miniconda3/envs/goseek/lib/python3.7/site-packages/stable_baselines/common/vec_env/subproc_vec_env.py in step_async(self, actions)
    101     def step_async(self, actions):
    102         for remote, action in zip(self.remotes, actions):
--> 103             remote.send(('step', action))
    104         self.waiting = True
    105 

~/miniconda3/envs/goseek/lib/python3.7/multiprocessing/connection.py in send(self, obj)
    204         self._check_closed()
    205         self._check_writable()
--> 206         self._send_bytes(_ForkingPickler.dumps(obj))
    207 
    208     def recv_bytes(self, maxlength=None):

~/miniconda3/envs/goseek/lib/python3.7/multiprocessing/connection.py in _send_bytes(self, buf)
    402             # Also note we want to avoid sending a 0-length buffer separately,
    403             # to avoid "broken pipe" errors if the other end closed the pipe.
--> 404             self._send(header + buf)
    405 
    406     def _recv_bytes(self, maxsize=None):

~/miniconda3/envs/goseek/lib/python3.7/multiprocessing/connection.py in _send(self, buf, write)
    366         remaining = len(buf)
    367         while True:
--> 368             n = write(self._handle, buf)
    369             remaining -= n
    370             if remaining == 0:

BrokenPipeError: [Errno 32] Broken pipe

ZacRavichandran commented 4 years ago

Unfortunately, that's somewhat of a catch-all error for Stable Baselines multiprocessing environments, but I'm guessing there was a network timeout issue. How many parallel environments were running?
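To illustrate why this error is a catch-all, here is a minimal standard-library sketch, assuming a POSIX system such as Linux. It does not use Stable Baselines at all: it closes one end of a multiprocessing pipe to stand in for a worker process that has exited, then sends a message from the other end. The function name send_to_closed_worker is hypothetical. The point is that the sender only learns that the other end is gone, not why it went away.

```python
import multiprocessing


def send_to_closed_worker():
    """Send a message through a pipe whose far end is already closed.

    Hypothetical helper for illustration only. Closing worker_end stands in
    for a worker process that has exited for whatever reason.
    """
    parent_end, worker_end = multiprocessing.Pipe()
    worker_end.close()  # the "worker" is gone; its real failure is not recorded anywhere
    try:
        # Same kind of message that the traceback shows being sent to the worker.
        parent_end.send(("step", 0))
    except BrokenPipeError as exc:
        # The sender sees only that the pipe is broken, not the original cause.
        return exc
    finally:
        parent_end.close()
    return None


error = send_to_closed_worker()
print(type(error).__name__, error)
```

Whatever actually stopped the worker, the parent process reports the same BrokenPipeError, which is why the traceback alone does not identify the root cause.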

If you want to confirm the underlying error, you can replace SubprocVecEnv with DummyVecEnv in the baseline's make_unity_env function. DummyVecEnv runs all environments in a single process rather than across multiple processes, so it will be a little slower. However, it will propagate the original exception.
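The swap could look roughly like the sketch below. This is not the baseline's actual make_unity_env code: build_vec_env and its debug flag are hypothetical names, and env_fns is assumed to be the list of environment-constructing callables that both vectorized wrappers accept.

```python
def build_vec_env(env_fns, debug=False):
    """Wrap environment factories in a vectorized environment.

    Hypothetical helper, not part of the goseek-challenge baseline.
    env_fns is a list of zero-argument callables that each build one env.
    """
    # Imported inside the function so this module loads even where
    # stable_baselines is not installed.
    from stable_baselines.common.vec_env import DummyVecEnv, SubprocVecEnv

    # DummyVecEnv steps every environment in the calling process, so an
    # exception raised inside an environment reaches the caller directly.
    # SubprocVecEnv runs each environment in its own worker process.
    vec_env_cls = DummyVecEnv if debug else SubprocVecEnv
    return vec_env_cls(env_fns)
```

With such a helper, calling build_vec_env(env_fns, debug=True) while reproducing the failure should surface the original exception instead of the BrokenPipeError; switching the flag back restores the multi-process setup.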

achuwilson commented 4 years ago

I was running a single environment with SubprocVecEnv. I will now try DummyVecEnv as well.

ZacRavichandran commented 4 years ago

I wanted to follow up on this thread - did DummyVecEnv help debug the issue?