automl / HpBandSter

a distributed Hyperband implementation on Steroids
BSD 3-Clause "New" or "Revised" License

Local worker failing before the end of hyperband process #77

Open Rayn2402 opened 4 years ago

Rayn2402 commented 4 years ago

I'm currently trying to run a job, and I get the following error just before it finishes:

Exception in thread oneway-call:

  File "//anaconda3/envs/AutoML_env/lib/python3.7/site-packages/Pyro4/core.py", line 515, in connect_and_handshake
    sslContext=sslContext)
  File "//anaconda3/envs/AutoML_env/lib/python3.7/site-packages/Pyro4/socketutil.py", line 307, in createSocket
    sock.connect(connect)
OSError: [Errno 22] Invalid argument

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "//anaconda3/envs/AutoML_env/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "//anaconda3/envs/AutoML_env/lib/python3.7/site-packages/Pyro4/core.py", line 1891, in run
    super(_OnewayCallThread, self).run()
  File "//anaconda3/envs/AutoML_env/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "//anaconda3/envs/AutoML_env/lib/python3.7/site-packages/hpbandster/core/worker.py", line 215, in start_computation
    callback.register_result(id, result)
  File "//anaconda3/envs/AutoML_env/lib/python3.7/site-packages/Pyro4/core.py", line 185, in __call__
    return self.__send(self.__name, args, kwargs)
  File "//anaconda3/envs/AutoML_env/lib/python3.7/site-packages/Pyro4/core.py", line 428, in _pyroInvoke
    self.__pyroCreateConnection()
  File "//anaconda3/envs/AutoML_env/lib/python3.7/site-packages/Pyro4/core.py", line 596, in __pyroCreateConnection
    connect_and_handshake(conn)
  File "//anaconda3/envs/AutoML_env/lib/python3.7/site-packages/Pyro4/core.py", line 549, in connect_and_handshake
    raise ce
Pyro4.errors.CommunicationError: cannot connect to ('localhost', 58540): [Errno 22] Invalid argument 

Could someone help me? Thank you!

maxhuettenrauch commented 4 years ago

Hey, I'm running into the same issue. Have you already figured out what causes it?

Rayn2402 commented 4 years ago

No, sorry, I haven't worked with hpbandster for a few months now. The main thing I remember about this problem is that I only encountered it when I was working with a very high epoch budget.
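
For anyone debugging this: the traceback above shows the worker failing while it opens a connection to ('localhost', 58540) in order to call callback.register_result. One thing worth checking is whether anything is still accepting connections on that host and port at the moment the worker tries to report. The snippet below is a minimal sketch using only the Python standard library. It is not part of hpbandster or Pyro4, can_connect is a hypothetical helper name, and it only tests reachability; it does not explain the Errno 22 itself.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds, else False.

    Hypothetical diagnostic helper; not an hpbandster or Pyro4 function.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, resolution failures,
        # and errors such as [Errno 22] Invalid argument.
        return False
```

Calling can_connect("localhost", 58540) from the machine running the worker would show whether that port is reachable. The port number here is taken from the traceback and will likely differ on each run.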