Describe the issue:
Hello, I encountered the following error while using NNI 2.10.1:
[2023-08-01 14:07:41] Creating experiment, Experiment ID: aj7wd2ey
[2023-08-01 14:07:41] Starting web server...
node:internal/modules/cjs/loader:1187
return process.dlopen(module, path.toNamespacedPath(filename));
^
Error: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /home/zzdx/.local/lib/python3.10/site-packages/nni_node/node_modules/sqlite3/lib/binding/napi-v6-linux-glibc-x64/node_sqlite3.node)
at Object.Module._extensions..node (node:internal/modules/cjs/loader:1187:18)
at Module.load (node:internal/modules/cjs/loader:981:32)
at Function.Module._load (node:internal/modules/cjs/loader:822:12)
at Module.require (node:internal/modules/cjs/loader:1005:19)
at require (node:internal/modules/cjs/helpers:102:18)
at Object.<anonymous> (/home/zzdx/.local/lib/python3.10/site-packages/nni_node/node_modules/sqlite3/lib/sqlite3-binding.js:4:17)
at Module._compile (node:internal/modules/cjs/loader:1103:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1157:10)
at Module.load (node:internal/modules/cjs/loader:981:32)
at Function.Module._load (node:internal/modules/cjs/loader:822:12) {
code: 'ERR_DLOPEN_FAILED'
}
Thrown at:
at Module._extensions..node (node:internal/modules/cjs/loader:1187:18)
at Module.load (node:internal/modules/cjs/loader:981:32)
at Module._load (node:internal/modules/cjs/loader:822:12)
at Module.require (node:internal/modules/cjs/loader:1005:19)
at require (node:internal/modules/cjs/helpers:102:18)
at /home/zzdx/.local/lib/python3.10/site-packages/nni_node/node_modules/sqlite3/lib/sqlite3-binding.js:4:17
at Module._compile (node:internal/modules/cjs/loader:1103:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1157:10)
at Module.load (node:internal/modules/cjs/loader:981:32)
at Module._load (node:internal/modules/cjs/loader:822:12)
[2023-08-01 14:07:42] WARNING: Timeout, retry...
[2023-08-01 14:07:43] WARNING: Timeout, retry...
[2023-08-01 14:07:44] ERROR: Create experiment failed
Traceback (most recent call last):
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connection.py", line 203, in _new_conn
sock = connection.create_connection(
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 790, in urlopen
response = self._make_request(
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 496, in _make_request
conn.request(
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connection.py", line 395, in request
self.endheaders()
File "/usr/local/python3/lib/python3.10/http/client.py", line 1278, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/local/python3/lib/python3.10/http/client.py", line 1038, in _send_output
self.send(msg)
File "/usr/local/python3/lib/python3.10/http/client.py", line 976, in send
self.connect()
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connection.py", line 243, in connect
self.sock = self._new_conn()
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connection.py", line 218, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f2e65bc04c0>: Failed to establish a new connection: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/zzdx/.local/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 844, in urlopen
retries = retries.increment(
File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /api/v1/nni/check-status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2e65bc04c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/zzdx/search_api/area_api/3_5_1126_1623_711_715-30min_20230801140738/main.py", line 51, in <module>
experiment.run(8080)
File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/experiment.py", line 180, in run
self.start(port, debug)
File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/experiment.py", line 135, in start
self._start_impl(port, debug, run_mode, None, [])
File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/experiment.py", line 103, in _start_impl
self._proc = launcher.start_experiment(self._action, self.id, config, port, debug, run_mode,
File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/launcher.py", line 148, in start_experiment
raise e
File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/launcher.py", line 126, in start_experiment
_check_rest_server(port, url_prefix=url_prefix)
File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/launcher.py", line 196, in _check_rest_server
rest.get(port, '/check-status', url_prefix)
File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/rest.py", line 43, in get
return request('get', port, api, prefix=prefix)
File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/rest.py", line 31, in request
resp = requests.request(method, url, timeout=timeout)
File "/home/zzdx/.local/lib/python3.10/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/home/zzdx/.local/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/home/zzdx/.local/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/home/zzdx/.local/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /api/v1/nni/check-status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2e65bc04c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
[2023-08-01 14:07:44] Stopping experiment, please wait...
[2023-08-01 14:07:44] Experiment stopped
Environment:
NNI version: 2.10.1
Training service (local|remote|pai|aml|etc): remote
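Reading the log above, the first failure seems to be the Node.js web server crashing while loading `node_sqlite3.node`, because `/lib64/libstdc++.so.6` does not provide `CXXABI_1.3.8`. The later `Connection refused` errors on port 8080 would then be a consequence of that crash, not a separate problem. To check which CXXABI versions the system library exports, something like the following could be used. It is only a minimal sketch: the helper names `cxxabi_versions` and `has_cxxabi` are made up for illustration, and the library path is simply the one from the error message.

```python
import re
from pathlib import Path


def cxxabi_versions(data: bytes) -> list[str]:
    """Return the numeric CXXABI version tags found in raw library bytes, sorted numerically."""
    # Matches tags such as CXXABI_1.3 or CXXABI_1.3.8; non-numeric tags (e.g. CXXABI_TM_1) are skipped.
    tags = {m.decode() for m in re.findall(rb"CXXABI_(\d+(?:\.\d+)*)", data)}
    return sorted(tags, key=lambda v: tuple(int(part) for part in v.split(".")))


def has_cxxabi(data: bytes, required: str) -> bool:
    """Report whether the given version (e.g. "1.3.8") appears among the exported tags."""
    return required in cxxabi_versions(data)


if __name__ == "__main__":
    # Assumption: this is the library the dynamic loader picked, as named in the error message.
    lib = Path("/lib64/libstdc++.so.6")
    if lib.exists():
        data = lib.read_bytes()
        print("CXXABI versions:", cxxabi_versions(data))
        print("CXXABI_1.3.8 present:", has_cxxabi(data, "1.3.8"))
    else:
        print(f"{lib} not found on this machine")
```

If `CXXABI_1.3.8` is missing from the output, the system libstdc++ is probably older than what the prebuilt sqlite3 binding shipped with NNI expects.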