microsoft / nni

An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
https://nni.readthedocs.io
MIT License
14k stars 1.81k forks source link

Error: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found #5653

Closed yifan-dadada closed 1 year ago

yifan-dadada commented 1 year ago

Describe the issue: Hello, I encountered the following error while using NNI 2.10.1:

[2023-08-01 14:07:41] Creating experiment, Experiment ID: aj7wd2ey
[2023-08-01 14:07:41] Starting web server...
node:internal/modules/cjs/loader:1187
  return process.dlopen(module, path.toNamespacedPath(filename));
                 ^

Error: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /home/zzdx/.local/lib/python3.10/site-packages/nni_node/node_modules/sqlite3/lib/binding/napi-v6-linux-glibc-x64/node_sqlite3.node)
    at Object.Module._extensions..node (node:internal/modules/cjs/loader:1187:18)
    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Function.Module._load (node:internal/modules/cjs/loader:822:12)
    at Module.require (node:internal/modules/cjs/loader:1005:19)
    at require (node:internal/modules/cjs/helpers:102:18)
    at Object.<anonymous> (/home/zzdx/.local/lib/python3.10/site-packages/nni_node/node_modules/sqlite3/lib/sqlite3-binding.js:4:17)
    at Module._compile (node:internal/modules/cjs/loader:1103:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1157:10)
    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Function.Module._load (node:internal/modules/cjs/loader:822:12) {
  code: 'ERR_DLOPEN_FAILED'
}
Thrown at:
    at Module._extensions..node (node:internal/modules/cjs/loader:1187:18)
    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Module._load (node:internal/modules/cjs/loader:822:12)
    at Module.require (node:internal/modules/cjs/loader:1005:19)
    at require (node:internal/modules/cjs/helpers:102:18)
    at /home/zzdx/.local/lib/python3.10/site-packages/nni_node/node_modules/sqlite3/lib/sqlite3-binding.js:4:17
    at Module._compile (node:internal/modules/cjs/loader:1103:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1157:10)
    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Module._load (node:internal/modules/cjs/loader:822:12)
[2023-08-01 14:07:42] WARNING: Timeout, retry...
[2023-08-01 14:07:43] WARNING: Timeout, retry...
[2023-08-01 14:07:44] ERROR: Create experiment failed
Traceback (most recent call last):
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connection.py", line 395, in request
    self.endheaders()
  File "/usr/local/python3/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/python3/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/usr/local/python3/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connection.py", line 243, in connect
    self.sock = self._new_conn()
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connection.py", line 218, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f2e65bc04c0>: Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/zzdx/.local/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 844, in urlopen
    retries = retries.increment(
  File "/home/zzdx/.local/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /api/v1/nni/check-status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2e65bc04c0>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/zzdx/search_api/area_api/3_5_1126_1623_711_715-30min_20230801140738/main.py", line 51, in <module>
    experiment.run(8080)
  File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/experiment.py", line 180, in run
    self.start(port, debug)
  File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/experiment.py", line 135, in start
    self._start_impl(port, debug, run_mode, None, [])
  File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/experiment.py", line 103, in _start_impl
    self._proc = launcher.start_experiment(self._action, self.id, config, port, debug, run_mode,
  File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/launcher.py", line 148, in start_experiment
    raise e
  File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/launcher.py", line 126, in start_experiment
    _check_rest_server(port, url_prefix=url_prefix)
  File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/launcher.py", line 196, in _check_rest_server
    rest.get(port, '/check-status', url_prefix)
  File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/rest.py", line 43, in get
    return request('get', port, api, prefix=prefix)
  File "/home/zzdx/.local/lib/python3.10/site-packages/nni/experiment/rest.py", line 31, in request
    resp = requests.request(method, url, timeout=timeout)
  File "/home/zzdx/.local/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/zzdx/.local/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/zzdx/.local/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/zzdx/.local/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /api/v1/nni/check-status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2e65bc04c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
[2023-08-01 14:07:44] Stopping experiment, please wait...
[2023-08-01 14:07:44] Experiment stopped

Environment:

Log message:

[2023-08-01 14:38:48] INFO (nni.experiment) Creating experiment, Experiment ID: s1dz03t2
[2023-08-01 14:38:48] INFO (nni.experiment) Starting web server...
[2023-08-01 14:38:49] WARNING (nni.experiment) Timeout, retry...
[2023-08-01 14:38:50] WARNING (nni.experiment) Timeout, retry...
[2023-08-01 14:38:51] ERROR (nni.experiment) Create experiment failed
[2023-08-01 14:38:51] INFO (nni.experiment) Stopping experiment, please wait...