Open beta1scat opened 1 year ago
I see that both the Full test - HPO and Local - linux tests on the home page are in a failed state. Is this related to this issue?
We are fixing this issue in v3.0 release.
Thanks for your reply. Is there an expected release date?
We are likely to release an alpha build that includes the fix this week. The stable release should follow in about 2 weeks.
Thanks for your reply!
@liuzhe-lz Hi, could you tell me when the stable version will be released? The alpha version seems to have bugs.
We are doing a bug bash now, and the stable version will be released once all known bugs are fixed. What bugs have you encountered? Please give me a short description, thanks!
Describe the issue: Unable to operate stably.
Environment:
Configuration:
experiment = Experiment('local')
experiment.id = '*'
experiment.config.trial_command = 'python model.py'
experiment.config.trial_code_directory = '.'
experiment.config.search_space = search_space
experiment.config.tuner.name = 'TPE'
experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.max_trial_number = 5000
experiment.config.trial_concurrency = 4
experiment.run(58000)
experiment.stop()
Log message:
[2023-03-21 10:08:52] INFO (nni.tuner.tpe/MainThread) Using random seed 1596889983
[2023-03-21 10:08:52] INFO (nni.runtime.msg_dispatcher_base/MainThread) Dispatcher started
[2023-03-21 10:21:17] WARNING (nni.runtime.tuner_command_channel.channel/MainThread) Exception on receiving: ConnectionClosedError(None, None, None)
[2023-03-21 10:21:17] WARNING (nni.runtime.tuner_command_channel.channel/MainThread) Connection lost. Trying to reconnect...
[2023-03-21 10:21:17] INFO (nni.runtime.tuner_command_channel.channel/MainThread) Attempt #0, wait 0 seconds...
[2023-03-21 10:21:17] INFO (nni.runtime.msg_dispatcher_base/MainThread) Report error to NNI manager: Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 138, in read_http_response
status_code, reason, headers = await read_response(self.reader)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/http.py", line 120, in read_response
status_line = await read_line(stream)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/http.py", line 194, in read_line
line = await stream.readline()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 524, in readline
line = await self.readuntil(sep)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 616, in readuntil
await self._wait_for_data('readuntil')
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data
await self._waiter
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/selector_events.py", line 862, in _read_ready__data_received
data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/main.py", line 61, in main
dispatcher.run()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/msg_dispatcher_base.py", line 69, in run
command, data = self._channel._receive()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/channel.py", line 94, in _receive
command = self._retry_receive()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/channel.py", line 104, in _retry_receive
self._channel.connect()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 62, in connect
self._ws = _wait(_connect_async(self._url))
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 111, in _wait
return future.result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.get_result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/concurrent/futures/_base.py", line 403, in get_result
raise self._exception
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 125, in _connect_async
return await websockets.connect(url, max_size=None) # type: ignore
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 659, in await_impl_timeout
return await asyncio.wait_for(self.await_impl(), self.open_timeout)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 666, in await_impl
await protocol.handshake(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 326, in handshake
status_code, response_headers = await self.read_http_response()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 144, in read_http_response
raise InvalidMessage("did not receive a valid HTTP response") from exc
websockets.exceptions.InvalidMessage: did not receive a valid HTTP response
[2023-03-21 10:21:17] WARNING (nni.runtime.tuner_command_channel.channel/MainThread) Exception on sending: AttributeError("'NoneType' object has no attribute 'send'")
[2023-03-21 10:21:17] ERROR (nni.runtime.tuner_command_channel.channel/MainThread) 'NoneType' object has no attribute 'send'
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 138, in read_http_response
status_code, reason, headers = await read_response(self.reader)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/http.py", line 120, in read_response
status_line = await read_line(stream)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/http.py", line 194, in read_line
line = await stream.readline()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 524, in readline
line = await self.readuntil(sep)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 616, in readuntil
await self._wait_for_data('readuntil')
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data
await self._waiter
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/selector_events.py", line 862, in _read_ready__data_received
data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/main.py", line 61, in main
dispatcher.run()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/msg_dispatcher_base.py", line 69, in run
command, data = self._channel._receive()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/channel.py", line 94, in _receive
command = self._retry_receive()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/channel.py", line 104, in _retry_receive
self._channel.connect()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 62, in connect
self._ws = _wait(_connect_async(self._url))
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 111, in _wait
return future.result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.get_result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/concurrent/futures/_base.py", line 403, in get_result
raise self._exception
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 125, in _connect_async
return await websockets.connect(url, max_size=None) # type: ignore
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 659, in await_impl_timeout
return await asyncio.wait_for(self.await_impl(), self.open_timeout)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 666, in await_impl
await protocol.handshake(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 326, in handshake
status_code, response_headers = await self.read_http_response()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 144, in read_http_response
raise InvalidMessage("did not receive a valid HTTP response") from exc
websockets.exceptions.InvalidMessage: did not receive a valid HTTP response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/channel.py", line 62, in _send
self._channel.send(command)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 81, in send
_wait(self._ws.send(message))
AttributeError: 'NoneType' object has no attribute 'send'
[2023-03-21 10:21:17] WARNING (nni.runtime.tuner_command_channel.channel/MainThread) Connection lost. Trying to reconnect...
[2023-03-21 10:21:17] INFO (nni.runtime.tuner_command_channel.channel/MainThread) Attempt #0, wait 0 seconds...
[2023-03-21 10:21:17] ERROR (nni.runtime.msg_dispatcher_base/MainThread) Connection to NNI manager is broken. Failed to report error.
[2023-03-21 10:21:17] ERROR (nni.main/MainThread) did not receive a valid HTTP response
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 138, in read_http_response
status_code, reason, headers = await read_response(self.reader)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/http.py", line 120, in read_response
status_line = await read_line(stream)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/http.py", line 194, in read_line
line = await stream.readline()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 524, in readline
line = await self.readuntil(sep)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 616, in readuntil
await self._wait_for_data('readuntil')
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data
await self._waiter
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/selector_events.py", line 862, in _read_ready__data_received
data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/main.py", line 85, in <module>
main()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/main.py", line 61, in main
dispatcher.run()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/msg_dispatcher_base.py", line 69, in run
command, data = self._channel._receive()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/channel.py", line 94, in _receive
command = self._retry_receive()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/channel.py", line 104, in _retry_receive
self._channel.connect()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 62, in connect
self._ws = _wait(_connect_async(self._url))
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 111, in _wait
return future.result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.get_result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/concurrent/futures/_base.py", line 403, in get_result
raise self._exception
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 125, in _connect_async
return await websockets.connect(url, max_size=None) # type: ignore
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 659, in await_impl_timeout
return await asyncio.wait_for(self.await_impl(), self.open_timeout)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 666, in await_impl
await protocol.handshake(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 326, in handshake
status_code, response_headers = await self.read_http_response()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 144, in read_http_response
raise InvalidMessage("did not receive a valid HTTP response") from exc
websockets.exceptions.InvalidMessage: did not receive a valid HTTP response
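The dispatcher log above mixes timestamped entries with multi-line tracebacks, which makes the timeline hard to follow. As a reading aid, here is a minimal sketch that pulls out only the timestamped entries. It assumes every entry follows the `[timestamp] LEVEL (logger/thread) message` shape seen in this log; the names `LOG_ENTRY`, `LogEntry`, `parse_entry`, and `warnings_and_errors` are made up for illustration and are not part of NNI.

```python
import re
from datetime import datetime
from typing import Iterable, List, NamedTuple, Optional

# Assumed entry shape, taken from the log above:
# [2023-03-21 10:21:17] WARNING (some.logger/MainThread) message text
LOG_ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<level>[A-Z]+) "
    r"\((?P<logger>[^/)]+)/(?P<thread>[^)]+)\) "
    r"(?P<message>.*)"
)


class LogEntry(NamedTuple):
    timestamp: datetime
    level: str
    logger: str
    thread: str
    message: str


def parse_entry(line: str) -> Optional[LogEntry]:
    """Parse one timestamped entry; return None for traceback lines."""
    match = LOG_ENTRY.match(line.strip())
    if match is None:
        return None
    return LogEntry(
        timestamp=datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S"),
        level=match["level"],
        logger=match["logger"],
        thread=match["thread"],
        message=match["message"],
    )


def warnings_and_errors(lines: Iterable[str]) -> List[LogEntry]:
    """Keep only WARNING and ERROR entries, in their original order."""
    entries = (parse_entry(line) for line in lines)
    return [e for e in entries if e is not None and e.level in ("WARNING", "ERROR")]
```

Passing the lines of the dispatcher log to `warnings_and_errors` leaves only the connection-loss and reconnect entries, which all share the 10:21:17 timestamp.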
[2023-03-21 09:47:28] Creating experiment, Experiment ID: adapter_plate_square_TPE_quniform
[2023-03-21 09:47:28] Starting web server...
[2023-03-21 09:47:29] WARNING: Timeout, retry...
[2023-03-21 09:47:30] Setting up...
[2023-03-21 09:47:30] Web portal URLs: http://127.0.0.1:58000 http://10.62.137.83:58000 http://198.18.0.1:58000
node:events:504
throw er; // Unhandled 'error' event
^
Error: tuner_command_channel: Tuner loses responsive
at WebSocketChannelImpl.heartbeat (/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni_node/core/tuner_command_channel/websocket_channel.js:119:30)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)
Emitted 'error' event at:
at WebSocketChannelImpl.handleError (/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni_node/core/tuner_command_channel/websocket_channel.js:135:22)
at WebSocketChannelImpl.heartbeat (/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni_node/core/tuner_command_channel/websocket_channel.js:119:18)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)
Thrown at:
at heartbeat (/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni_node/core/tuner_command_channel/websocket_channel.js:119:30)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 138, in read_http_response
status_code, reason, headers = await read_response(self.reader)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/http.py", line 120, in read_response
status_line = await read_line(stream)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/http.py", line 194, in read_line
line = await stream.readline()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 524, in readline
line = await self.readuntil(sep)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 616, in readuntil
await self._wait_for_data('readuntil')
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data
await self._waiter
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/selector_events.py", line 862, in _read_ready__data_received
data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/main.py", line 85, in <module>
main()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/main.py", line 61, in main
dispatcher.run()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/msg_dispatcher_base.py", line 69, in run
command, data = self._channel._receive()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/channel.py", line 94, in _receive
command = self._retry_receive()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/channel.py", line 104, in _retry_receive
self._channel.connect()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 62, in connect
self._ws = _wait(_connect_async(self._url))
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 111, in _wait
return future.result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.get_result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/concurrent/futures/_base.py", line 403, in get_result
raise self._exception
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/runtime/tuner_command_channel/websocket.py", line 125, in _connect_async
return await websockets.connect(url, max_size=None) # type: ignore
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 659, in await_impl_timeout
return await asyncio.wait_for(self.await_impl(), self.open_timeout)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 666, in await_impl
await protocol.handshake(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 326, in handshake
status_code, response_headers = await self.read_http_response()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/websockets/legacy/client.py", line 144, in read_http_response
raise InvalidMessage("did not receive a valid HTTP response") from exc
websockets.exceptions.InvalidMessage: did not receive a valid HTTP response
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connectionpool.py", line 398, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connection.py", line 239, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 1037, in _send_output
self.send(msg)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 975, in send
self.connect()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connection.py", line 205, in connect
conn = self._new_conn()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f947595e2c0>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=58000): Max retries exceeded with url: /api/v1/nni/check-status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f947595e2c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/niu/code/halcon/paramsearchhalcon/python/NNI/star_TPE/main.py", line 64, in <module>
experiment.run(58000)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/experiment/experiment.py", line 183, in run
self._wait_completion()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/experiment/experiment.py", line 163, in _wait_completion
status = self.get_status()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/experiment/experiment.py", line 283, in get_status
resp = rest.get(self.port, '/check-status', self.url_prefix)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/experiment/rest.py", line 43, in get
return request('get', port, api, prefix=prefix)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/experiment/rest.py", line 31, in request
resp = requests.request(method, url, timeout=timeout)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/adapters.py", line 565, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=58000): Max retries exceeded with url: /api/v1/nni/check-status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f947595e2c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
[2023-03-21 09:53:09] Stopping experiment, please wait...
[2023-03-21 09:53:09] ERROR: HTTPConnectionPool(host='localhost', port=58000): Max retries exceeded with url: /api/v1/nni/experiment (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9519c34460>: Failed to establish a new connection: [Errno 111] Connection refused'))
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connectionpool.py", line 398, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connection.py", line 239, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 1037, in _send_output
self.send(msg)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/http/client.py", line 975, in send
self.connect()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connection.py", line 205, in connect
conn = self._new_conn()
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f9519c34460>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=58000): Max retries exceeded with url: /api/v1/nni/experiment (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9519c34460>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/experiment/experiment.py", line 143, in _stop_impl
rest.delete(self.port, '/experiment', self.url_prefix)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/experiment/rest.py", line 52, in delete
request('delete', port, api, prefix=prefix)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/nni/experiment/rest.py", line 31, in request
resp = requests.request(method, url, timeout=timeout)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "/home/niu/miniconda3/envs/halcon/lib/python3.10/site-packages/requests/adapters.py", line 565, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=58000): Max retries exceeded with url: /api/v1/nni/experiment (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9519c34460>: Failed to establish a new connection: [Errno 111] Connection refused'))
[2023-03-21 09:53:09] WARNING: Cannot gracefully stop experiment, killing NNI process...
[2023-03-21 09:53:09] Experiment stopped
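The final tracebacks show the REST endpoint `/api/v1/nni/check-status` on port 58000 refusing connections, which suggests the NNI manager process had already exited. For anyone trying to reproduce this, here is a minimal standard-library sketch that probes that endpoint with a few retries to tell a transient hiccup from a dead manager. The URL and port come from the log above; the helper names (`fetch_status`, `wait_for_manager`), the retry counts, and the assumption that the endpoint returns a JSON body are illustrative guesses and not part of NNI's API.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_status(port: int, timeout: float = 5.0) -> dict:
    """GET the check-status endpoint seen in the log (assumed to return JSON)."""
    url = f"http://localhost:{port}/api/v1/nni/check-status"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_manager(
    port: int,
    attempts: int = 5,
    delay: float = 1.0,
    fetch: Callable[[int], dict] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Return the status payload, or None if every attempt fails.

    Connection-level failures (refused, reset, timeout) are OSError
    subclasses, and urllib wraps them in URLError, itself an OSError.
    """
    for attempt in range(attempts):
        try:
            return fetch(port)
        except OSError as exc:
            print(f"attempt {attempt}: manager unreachable ({exc})")
            if attempt + 1 < attempts:
                # Exponential backoff: delay, 2*delay, 4*delay, ...
                sleep(delay * 2 ** attempt)
    return None


if __name__ == "__main__":
    status = wait_for_manager(58000)
    print("manager is down" if status is None else status)
```

If this returns `None` while the experiment script is still running, the manager has most likely crashed, as the `Tuner loses responsive` error in the log above suggests, and is not just temporarily busy.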