Open TayyabaZainab0807 opened 1 year ago
Please pin typeguard to v2.x with pip install 'typeguard<3', or upgrade NNI to the v3.0 test version with pip install --extra-index-url https://test.pypi.org/simple/ nni==3.0b1
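Before choosing between the two options, it may help to confirm what is actually installed. The following is only a sketch: the helper names (`major_version`, `typeguard_needs_pin`, `installed_version`) are made up for illustration, and it assumes the `typeguard<3` pin applies to NNI 2.x as stated above.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_text):
    """Return the leading integer of a version string such as '2.13.3'."""
    return int(version_text.split(".")[0])


def typeguard_needs_pin(version_text):
    """True when typeguard is 3.x or newer, i.e. outside the 'typeguard<3' pin."""
    return major_version(version_text) >= 3


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions relevant to this thread (None means not installed).
for package in ("nni", "typeguard", "nvidia-ml-py"):
    print(package, installed_version(package))
```

If `typeguard_needs_pin(installed_version("typeguard"))` is true on an NNI 2.x install, the pin suggested above is the thing to try first.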
Please pin typeguard to v2.x with pip install 'typeguard<3',
NNI 2.10 with typeguard<3 works, but NNI 2.x has this issue: https://github.com/microsoft/nni/issues/5531. If I move to NNI 3.0b1 with the latest typeguard, it still gives me failures: [ 'cuda_cores: Function Not Found', 'process: Function Not Found' ]
Please provide the versions of nvidia-ml-py (from pip list), the NVIDIA driver, and CUDA.
The error should be reproducible with the following script; please check its output.
from pynvml import *
nvmlInit()
device = nvmlDeviceGetHandleByIndex(0)
cuda_cores = nvmlDeviceGetNumGpuCores(device)
print(cuda_cores)
nvmlShutdown()
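A slightly more defensive variant of the script above can report each query separately instead of stopping at the first exception. This is a sketch only: `probe` and `main` are illustrative names, and mapping the 'process' failure to `nvmlDeviceGetComputeRunningProcesses` is an assumption, not something confirmed in this thread.

```python
def probe(label, query):
    """Run one query; return (label, value) or (label, 'ErrorType: message')."""
    try:
        return label, query()
    except Exception as exc:  # pynvml raises NVMLError subclasses here
        return label, f"{type(exc).__name__}: {exc}"


def main():
    # Imported lazily so the helper above can be used without a GPU.
    import pynvml  # provided by the nvidia-ml-py package

    pynvml.nvmlInit()
    try:
        device = pynvml.nvmlDeviceGetHandleByIndex(0)
        checks = [
            ("driver", pynvml.nvmlSystemGetDriverVersion),
            ("cuda_cores", lambda: pynvml.nvmlDeviceGetNumGpuCores(device)),
            # Assumption: this is the call behind the 'process' failure.
            ("process", lambda: pynvml.nvmlDeviceGetComputeRunningProcesses(device)),
        ]
        for label, query in checks:
            print(probe(label, query))
    finally:
        pynvml.nvmlShutdown()
```

Calling `main()` on the affected machine should show which queries the installed driver does not support, alongside the driver version.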
This seems related to this issue: https://github.com/NVIDIA/k8s-device-plugin/issues/331. They suggest upgrading the NVIDIA driver.
After some investigation I found the real error is another one. I will push a fix later today.
cuda_cores = nvmlDeviceGetNumGpuCores(device)
My nvidia-ml-py version is 11.525.112.
When I run this script, I get this error: pynvml.NVMLError_FunctionNotFound: Function Not Found
Please try out 3.0b2. The NVML error is non-critical and can be ignored.
any updates for it? @TayyabaZainab0807
Describe the issue: The NNI process does not run with NNI 3.0b1. I also tried more stable NNI versions (2.10 and 2.8), and I get the following error:
Environment:
Configuration:
maxExperimentDuration: 156h
maxTrialNumber: 200
tuner:
  name: TPE
  classArgs:
    optimize_mode: maximize
trainingService:
  platform: local
  useActiveGpu: True
{
  "en_decoder": { "_type": "choice", "_value": [7,8,9] },
  "k1": { "_type": "choice", "_value": [3,5,7,9,11] },
  "k2": { "_type": "choice", "_value": [3,5,7,9,11] },
  "k3": { "_type": "choice", "_value": [3,5,7,9,11] },
  "k4": { "_type": "choice", "_value": [3,5,7,9,11] },
  "k5": { "_type": "choice", "_value": [3,5,7,9,11] },
  "k6": { "_type": "choice", "_value": [3,5,7,9,11] },
  "k7": { "_type": "choice", "_value": [3,5,7,9,11] },
  "k8": { "_type": "choice", "_value": [3,5,7,9,11] },
  "k9": { "_type": "choice", "_value": [3,5,7,9,11] },
  "f1": { "_type": "choice", "_value": [8,16,32] },
  "f2": { "_type": "choice", "_value": [8,16,32] },
  "f3": { "_type": "choice", "_value": [8,16,32] },
  "f4": { "_type": "choice", "_value": [8,16,32] },
  "f5": { "_type": "choice", "_value": [8,16,32] },
  "f6": { "_type": "choice", "_value": [8,16,32] },
  "f7": { "_type": "choice", "_value": [8,16,32] },
  "f8": { "_type": "choice", "_value": [8,16,32] },
  "f9": { "_type": "choice", "_value": [8,16,32] },
}
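The search space as posted is highly repetitive and ends with a trailing comma, which strict JSON parsers such as Python's json module reject. One way to avoid hand-editing it is to generate the file; the sketch below covers only the keys shown above (the log further down suggests the real file has more), and the helper names `choice` and `build_search_space` are hypothetical.

```python
import json


def choice(values):
    """Build one NNI-style 'choice' entry."""
    return {"_type": "choice", "_value": list(values)}


def build_search_space():
    """Recreate the posted entries: en_decoder, k1..k9, f1..f9."""
    space = {"en_decoder": choice([7, 8, 9])}
    for i in range(1, 10):
        space[f"k{i}"] = choice([3, 5, 7, 9, 11])
    for i in range(1, 10):
        space[f"f{i}"] = choice([8, 16, 32])
    return space


search_space = build_search_space()
# json.dumps never emits a trailing comma, so the output is valid JSON.
search_space_text = json.dumps(search_space, indent=2)
print(search_space_text)
```

Writing `search_space_text` to the search space file would give a version that loads cleanly with any JSON parser.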
[2023-05-04 12:04:13] INFO (main) Start NNI manager [2023-05-04 12:04:13] INFO (RestServer) Starting REST server at port 8080, URL prefix: "/" [2023-05-04 12:04:13] INFO (RestServer) REST server started. [2023-05-04 12:04:13] INFO (NNIDataStore) Datastore initialization done [2023-05-04 12:04:14] INFO (NNIManager) Starting experiment: yajeqwud [2023-05-04 12:04:14] INFO (NNIManager) Setup training service... [2023-05-04 12:04:14] INFO (NNIManager) Setup tuner... [2023-05-04 12:04:14] INFO (NNIManager) Change NNIManager status from: INITIALIZED to: RUNNING [2023-05-04 12:04:14] INFO (NNIManager) Add event listeners [2023-05-04 12:04:14] INFO (LocalV3.local) Start [2023-05-04 12:04:14] INFO (NNIManager) NNIManager received command from dispatcher: ID, [2023-05-04 12:04:14] INFO (NNIManager) NNIManager received command from dispatcher: TR, {"parameter_id": 0, "parameter_source": "algorithm", "parameters": {"en_decoder": 8, "k1": 9, "k2": 11, "k3": 5, "k4": 11, "k5": 3, "k6": 5, "k7": 9, "k8": 5, "k9": 7, "f1": 32, "f2": 16, "f3": 16, "f4": 32, "f5": 16, "f6": 16, "f7": 8, "f8": 16, "f9": 16, "res_cnn": 3, "res_f1": 32, "res_f2": 32, "res_f3": 16, "res_k1": 5, "res_k2": 5, "res_k3": 3, "res_drop1": 0.15125745390112305, "res_drop2": 0.21885863079171017, "res_drop3": 0.19313110293876518, "bilstm": 2, "u1": 16, "u2": 8, "drop": 0.2758735965780924, "pu": 8, "su": 16, "batch_size": 80, "epochs": 15}, "parameter_index": 0} [2023-05-04 12:04:15] INFO (NNIManager) submitTrialJob: form: { sequenceId: 0, hyperParameters: { value: '{"parameter_id": 0, "parameter_source": "algorithm", "parameters": {"en_decoder": 8, "k1": 9, "k2": 11, "k3": 5, "k4": 11, "k5": 3, "k6": 5, "k7": 9, "k8": 5, "k9": 7, "f1": 32, "f2": 16, "f3": 16, "f4": 32, "f5": 16, "f6": 16, "f7": 8, "f8": 16, "f9": 16, "res_cnn": 3, "res_f1": 32, "res_f2": 32, "res_f3": 16, "res_k1": 5, "res_k2": 5, "res_k3": 3, "res_drop1": 0.15125745390112305, "res_drop2": 0.21885863079171017, "res_drop3": 0.19313110293876518, 
"bilstm": 2, "u1": 16, "u2": 8, "drop": 0.2758735965780924, "pu": 8, "su": 16, "batch_size": 80, "epochs": 15}, "parameter_index": 0}', index: 0 }, placementConstraint: { type: 'None', gpus: [] } } [2023-05-04 12:04:15] INFO (GpuInfoCollector) Forced update: { gpuNumber: 1, driverVersion: '470.182.03', cudaVersion: 11060, gpus: [ { index: 0, model: 'NVIDIA A100-SXM4-80GB', gpuMemory: 85198045184, freeGpuMemory: 85197914112, gpuCoreUtilization: 0, gpuMemoryUtilization: 0 } ], processes: [], success: true, failures: [ 'cuda_cores: Function Not Found', 'process: Function Not Found' ] } [2023-05-04 12:04:17] INFO (LocalV3.local) Register directory trial_code = /app
[2023-05-04 13:04:14] INFO (nni.tuner.tpe/MainThread) Using random seed 2140802229 [2023-05-04 13:04:14] INFO (nni.runtime.msg_dispatcher_base/MainThread) Dispatcher started [2023-05-04 13:04:14] INFO (nni.runtime.msg_dispatcher/Thread-1 (command_queue_worker)) Initial search space: {'en_decoder': {'_type': 'choice', '_value': [7, 8, 9]}, 'k1': {'_type': 'choice', '_value': [3, 5, 7, 9, 11]}, 'k2': {'_type': 'choice', '_value': [3, 5, 7, 9, 11]}, 'k3': {'_type': 'choice', '_value': [3, 5, 7, 9, 11]}, 'k4': {'_type': 'choice', '_value': [3, 5, 7, 9, 11]}, 'k5': {'_type': 'choice', '_value': [3, 5, 7, 9, 11]}, 'k6': {'_type': 'choice', '_value': [3, 5, 7, 9, 11]}, 'k7': {'_type': 'choice', '_value': [3, 5, 7, 9, 11]}, 'k8': {'_type': 'choice', '_value': [3, 5, 7, 9, 11]}, 'k9': {'_type': 'choice', '_value': [3, 5, 7, 9, 11]}, 'f1': {'_type': 'choice', '_value': [8, 16, 32]}, 'f2': {'_type': 'choice', '_value': [8, 16, 32]}, 'f3': {'_type': 'choice', '_value': [8, 16, 32]}, 'f4': {'_type': 'choice', '_value': [8, 16, 32]}, 'f5': {'_type': 'choice', '_value': [8, 16, 32]}, 'f6': {'_type': 'choice', '_value': [8, 16, 32]}, 'f7': {'_type': 'choice', '_value': [8, 16, 32]}, 'f8': {'_type': 'choice', '_value': [8, 16, 32]}, 'f9': {'_type': 'choice', '_value': [8, 16, 32]}, 'res_cnn': {'_type': 'choice', '_value': [1, 2, 3]}, 'res_f1': {'_type': 'choice', '_value': [8, 16, 32]}, 'res_f2': {'_type': 'choice', '_value': [8, 16, 32]}, 'res_f3': {'_type': 'choice', '_value': [8, 16, 32]}, 'res_k1': {'_type': 'choice', '_value': [3, 5]}, 'res_k2': {'_type': 'choice', '_value': [3, 5]}, 'res_k3': {'_type': 'choice', '_value': [3, 5]}, 'res_drop1': {'_type': 'uniform', '_value': [0.1, 0.3]}, 'res_drop2': {'_type': 'uniform', '_value': [0.1, 0.3]}, 'res_drop3': {'_type': 'uniform', '_value': [0.1, 0.3]}, 'bilstm': {'_type': 'choice', '_value': [1, 2]}, 'u1': {'_type': 'choice', '_value': [8, 16]}, 'u2': {'_type': 'choice', '_value': [8, 16]}, 'drop': {'_type': 'uniform', '_value': 
[0.1, 0.3]}, 'pu': {'_type': 'choice', '_value': [8, 16]}, 'su': {'_type': 'choice', '_value': [8, 16]}, 'batch_size': {'_type': 'choice', '_value': [50, 80, 100]}, 'epochs': {'_type': 'choice', '_value': [10, 15, 20, 25, 30]}} [2023-05-04 13:05:14] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to receive command. Retry in 0s Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/protocol.py", line 968, in transfer_data message = await self.read_message() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/protocol.py", line 1038, in read_message frame = await self.read_data_frame(max_size=self.max_size) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/protocol.py", line 1113, in read_data_frame frame = await self.read_frame(max_size) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/protocol.py", line 1170, in read_frame frame = await Frame.read( File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/framing.py", line 69, in read data = await reader(2) File "/usr/lib/python3.10/asyncio/streams.py", line 708, in readexactly await self._wait_for_data('readexactly') File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data await self._waiter asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 99, in _receive_command command = conn.receive() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 103, in receive msg = _wait(self._ws.recv()) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait return future.result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result return self.__get_result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result raise self._exception File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/protocol.py", line 568, in recv await self.ensure_open() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/protocol.py", line 953, in ensure_open raise self.connection_closed_exc() websockets.exceptions.ConnectionClosedError: sent 1011 (unexpected error) keepalive ping timeout; no close frame received [2023-05-04 13:05:34] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to receive command.
Retry in 1s Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__ await protocol.handshake( File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake status_code, response_headers = await self.read_http_response() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response status_code, reason, headers = await read_response(self.reader) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response status_line = await read_line(stream) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line line = await stream.readline() File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline line = await self.readuntil(sep) File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil await self._wait_for_data('readuntil') File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data await self._waiter asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for return fut.result() asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 98, in _receive_command conn = self._ensure_conn() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn self._conn.connect() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect self._ws = _wait(_connect_async(self._url)) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait return future.result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result return self.__get_result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result raise self._exception File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async return await websockets.connect(url, max_size=None) # type: ignore File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__ return await asyncio.wait_for(self.__await_impl__(), self.open_timeout) File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for raise exceptions.TimeoutError() from exc asyncio.exceptions.TimeoutError
[2023-05-04 13:05:55] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to receive command. Retry in 2s Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__ await protocol.handshake( File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake status_code, response_headers = await self.read_http_response() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response status_code, reason, headers = await read_response(self.reader) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response status_line = await read_line(stream) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line line = await stream.readline() File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline line = await self.readuntil(sep) File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil await self._wait_for_data('readuntil') File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data await self._waiter asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for return fut.result() asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 98, in _receive_command conn = self._ensure_conn() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn self._conn.connect() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect self._ws = _wait(_connect_async(self._url)) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait return future.result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result return self.__get_result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result raise self._exception File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async return await websockets.connect(url, max_size=None) # type: ignore File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__ return await asyncio.wait_for(self.__await_impl__(), self.open_timeout) File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for raise exceptions.TimeoutError() from exc asyncio.exceptions.TimeoutError
[2023-05-04 13:06:17] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to receive command. Retry in 3s Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__ await protocol.handshake( File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake status_code, response_headers = await self.read_http_response() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response status_code, reason, headers = await read_response(self.reader) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response status_line = await read_line(stream) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line line = await stream.readline() File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline line = await self.readuntil(sep) File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil await self._wait_for_data('readuntil') File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data await self._waiter asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for return fut.result() asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 98, in _receive_command conn = self._ensure_conn() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn self._conn.connect() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect self._ws = _wait(_connect_async(self._url)) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait return future.result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result return self.__get_result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result raise self._exception File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async return await websockets.connect(url, max_size=None) # type: ignore File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__ return await asyncio.wait_for(self.__await_impl__(), self.open_timeout) File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for raise exceptions.TimeoutError() from exc asyncio.exceptions.TimeoutError
[2023-05-04 13:06:40] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to receive command. Retry in 4s Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__ await protocol.handshake( File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake status_code, response_headers = await self.read_http_response() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response status_code, reason, headers = await read_response(self.reader) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response status_line = await read_line(stream) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line line = await stream.readline() File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline line = await self.readuntil(sep) File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil await self._wait_for_data('readuntil') File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data await self._waiter asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for return fut.result() asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 98, in _receive_command conn = self._ensure_conn() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn self._conn.connect() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect self._ws = _wait(_connect_async(self._url)) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait return future.result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result return self.__get_result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result raise self._exception File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async return await websockets.connect(url, max_size=None) # type: ignore File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__ return await asyncio.wait_for(self.__await_impl__(), self.open_timeout) File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for raise exceptions.TimeoutError() from exc asyncio.exceptions.TimeoutError
[2023-05-04 13:06:44] WARNING (nni.runtime.command_channel.websocket.channel/MainThread) Failed to receive command. Last retry
[2023-05-04 13:07:04] INFO (nni.runtime.msg_dispatcher_base/MainThread) Report error to NNI manager: Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__ await protocol.handshake( File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake status_code, response_headers = await self.read_http_response() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response status_code, reason, headers = await read_response(self.reader) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response status_line = await read_line(stream) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line line = await stream.readline() File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline line = await self.readuntil(sep) File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil await self._wait_for_data('readuntil') File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data await self._waiter asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for return fut.result() asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/__main__.py", line 61, in main dispatcher.run() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/msg_dispatcher_base.py", line 69, in run command, data = self._channel._receive() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/tuner_command_channel/channel.py", line 270, in _receive command = self._channel.receive() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 59, in receive command = self._receive_command() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 108, in _receive_command conn = self._ensure_conn() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn self._conn.connect() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect self._ws = _wait(_connect_async(self._url)) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait return future.result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result return self.__get_result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result raise self._exception File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async return await websockets.connect(url, max_size=None) # type: ignore File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__ return await asyncio.wait_for(self.__await_impl__(), self.open_timeout) File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for raise exceptions.TimeoutError() from exc asyncio.exceptions.TimeoutError
[2023-05-04 13:07:04] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to send command. Retry in 0s Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__ await protocol.handshake( File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake status_code, response_headers = await self.read_http_response() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response status_code, reason, headers = await read_response(self.reader) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response status_line = await read_line(stream) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line line = await stream.readline() File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline line = await self.readuntil(sep) File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil await self._wait_for_data('readuntil') File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data await self._waiter asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for return fut.result() asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/__main__.py", line 61, in main dispatcher.run() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/msg_dispatcher_base.py", line 69, in run command, data = self._channel._receive() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/tuner_command_channel/channel.py", line 270, in _receive command = self._channel.receive() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 59, in receive command = self._receive_command() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 108, in _receive_command conn = self._ensure_conn() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn self._conn.connect() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect self._ws = _wait(_connect_async(self._url)) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait return future.result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result return self.__get_result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result raise self._exception File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async return await websockets.connect(url, max_size=None) # type: ignore File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__ return await asyncio.wait_for(self.__await_impl__(), self.open_timeout) File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for raise exceptions.TimeoutError() from exc asyncio.exceptions.TimeoutError
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 45, in send conn.send(command) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 90, in send _wait(self._ws.send(nni.dump(message))) AttributeError: 'NoneType' object has no attribute 'send' [2023-05-04 13:07:24] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to send command. Retry in 1s Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__ await protocol.handshake( File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake status_code, response_headers = await self.read_http_response() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response status_code, reason, headers = await read_response(self.reader) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response status_line = await read_line(stream) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line line = await stream.readline() File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline line = await self.readuntil(sep) File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil await self._wait_for_data('readuntil') File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data await self._waiter asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for return fut.result() asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 44, in send conn = self._ensure_conn() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn self._conn.connect() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect self._ws = _wait(_connect_async(self._url)) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait return future.result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result return self.__get_result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result raise self._exception File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async return await websockets.connect(url, max_size=None) # type: ignore File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__ return await asyncio.wait_for(self.__await_impl__(), self.open_timeout) File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for raise exceptions.TimeoutError() from exc asyncio.exceptions.TimeoutError
[2023-05-04 13:07:46] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to send command. Retry in 2s Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__ await protocol.handshake( File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake status_code, response_headers = await self.read_http_response() File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response status_code, reason, headers = await read_response(self.reader) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response status_line = await read_line(stream) File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line line = await stream.readline() File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline line = await self.readuntil(sep) File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil await self._wait_for_data('readuntil') File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data await self._waiter asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for return fut.result() asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 44, in send conn = self._ensure_conn() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn self._conn.connect() File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect self._ws = _wait(_connect_async(self._url)) File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait return future.result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result return self.get_result() File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in get_result raise self._exception File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async return await websockets.connect(url, max_size=None) # type: ignore File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in await_impl_timeout return await asyncio.wait_for(self.await_impl__(), self.open_timeout) File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for raise exceptions.TimeoutError() from exc asyncio.exceptions.TimeoutError [2023-05-04 13:08:08] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to send command. 
Retry in 3s
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__
await protocol.handshake(
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake
status_code, response_headers = await self.read_http_response()
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response
status_code, reason, headers = await read_response(self.reader)
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response
status_line = await read_line(stream)
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line
line = await stream.readline()
File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline
line = await self.readuntil(sep)
File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil
await self._wait_for_data('readuntil')
File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data
await self._waiter
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 44, in send
conn = self._ensure_conn()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn
self._conn.connect()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect
self._ws = _wait(_connect_async(self._url))
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait
return future.result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async
return await websockets.connect(url, max_size=None) # type: ignore
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__
return await asyncio.wait_for(self.__await_impl__(), self.open_timeout)
File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
[2023-05-04 13:08:31] ERROR (nni.runtime.command_channel.websocket.channel/MainThread) Failed to send command.
Retry in 4s
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__
await protocol.handshake(
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake
status_code, response_headers = await self.read_http_response()
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response
status_code, reason, headers = await read_response(self.reader)
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response
status_line = await read_line(stream)
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line
line = await stream.readline()
File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline
line = await self.readuntil(sep)
File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil
await self._wait_for_data('readuntil')
File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data
await self._waiter
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 44, in send
conn = self._ensure_conn()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn
self._conn.connect()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect
self._ws = _wait(_connect_async(self._url))
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait
return future.result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async
return await websockets.connect(url, max_size=None) # type: ignore
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__
return await asyncio.wait_for(self.__await_impl__(), self.open_timeout)
File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
[2023-05-04 13:08:35] WARNING (nni.runtime.command_channel.websocket.channel/MainThread) Failed to send command {'type': 'ER', 'content': 'Traceback (most recent call last):\n File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__\n await protocol.handshake(\n File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake\n status_code, response_headers = await self.read_http_response()\n File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response\n status_code, reason, headers = await read_response(self.reader)\n File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response\n status_line = await read_line(stream)\n File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line\n line = await stream.readline()\n File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline\n line = await self.readuntil(sep)\n File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil\n await self._wait_for_data(\'readuntil\')\n File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data\n await self._waiter\nasyncio.exceptions.CancelledError\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for\n return fut.result()\nasyncio.exceptions.CancelledError\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File "/usr/local/lib/python3.10/dist-packages/nni/__main__.py", line 61, in main\n dispatcher.run()\n File "/usr/local/lib/python3.10/dist-packages/nni/runtime/msg_dispatcher_base.py", line 69, in run\n command, data = self._channel._receive()\n File "/usr/local/lib/python3.10/dist-packages/nni/runtime/tuner_command_channel/channel.py", line 270, in _receive\n command = self._channel.receive()\n File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 59, in receive\n command = self._receive_command()\n File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 108, in _receive_command\n conn = self._ensure_conn()\n File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn\n self._conn.connect()\n File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect\n self._ws = _wait(_connect_async(self._url))\n File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait\n return future.result()\n File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result\n return self.__get_result()\n File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result\n raise self._exception\n File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async\n return await websockets.connect(url, max_size=None) # type: ignore\n File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__\n return await asyncio.wait_for(self.__await_impl__(), self.open_timeout)\n File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for\n raise exceptions.TimeoutError() from exc\nasyncio.exceptions.TimeoutError\n'}. Last retry
[2023-05-04 13:08:55] ERROR (nni.runtime.msg_dispatcher_base/MainThread) Connection to NNI manager is broken. Failed to report error.
[2023-05-04 13:08:55] ERROR (nni.main/MainThread) Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__
await protocol.handshake(
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake
status_code, response_headers = await self.read_http_response()
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response
status_code, reason, headers = await read_response(self.reader)
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response
status_line = await read_line(stream)
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line
line = await stream.readline()
File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline
line = await self.readuntil(sep)
File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil
await self._wait_for_data('readuntil')
File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data
await self._waiter
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/nni/__main__.py", line 85, in <module>
main()
File "/usr/local/lib/python3.10/dist-packages/nni/__main__.py", line 61, in main
dispatcher.run()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/msg_dispatcher_base.py", line 69, in run
command, data = self._channel._receive()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/tuner_command_channel/channel.py", line 270, in _receive
command = self._channel.receive()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 59, in receive
command = self._receive_command()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 108, in _receive_command
conn = self._ensure_conn()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn
self._conn.connect()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect
self._ws = _wait(_connect_async(self._url))
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait
return future.result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async
return await websockets.connect(url, max_size=None) # type: ignore
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__
return await asyncio.wait_for(self.__await_impl__(), self.open_timeout)
File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
Experiment yajeqwud start: 2023-05-04 13:04:12.999020
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 666, in __await_impl__
await protocol.handshake(
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 326, in handshake
status_code, response_headers = await self.read_http_response()
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 138, in read_http_response
status_code, reason, headers = await read_response(self.reader)
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 120, in read_response
status_line = await read_line(stream)
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/http.py", line 194, in read_line
line = await stream.readline()
File "/usr/lib/python3.10/asyncio/streams.py", line 524, in readline
line = await self.readuntil(sep)
File "/usr/lib/python3.10/asyncio/streams.py", line 616, in readuntil
await self._wait_for_data('readuntil')
File "/usr/lib/python3.10/asyncio/streams.py", line 501, in _wait_for_data
await self._waiter
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.10/dist-packages/nni/__main__.py", line 85, in <module>
main()
File "/usr/local/lib/python3.10/dist-packages/nni/__main__.py", line 61, in main
dispatcher.run()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/msg_dispatcher_base.py", line 69, in run
command, data = self._channel._receive()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/tuner_command_channel/channel.py", line 270, in _receive
command = self._channel.receive()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 59, in receive
command = self._receive_command()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 108, in _receive_command
conn = self._ensure_conn()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/channel.py", line 75, in _ensure_conn
self._conn.connect()
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 65, in connect
self._ws = _wait(_connect_async(self._url))
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 121, in _wait
return future.result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/dist-packages/nni/runtime/command_channel/websocket/connection.py", line 135, in _connect_async
return await websockets.connect(url, max_size=None) # type: ignore
File "/usr/local/lib/python3.10/dist-packages/websockets/legacy/client.py", line 659, in __await_impl_timeout__
return await asyncio.wait_for(self.__await_impl__(), self.open_timeout)
File "/usr/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
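For readers following the log: the channel retries the failed WebSocket connection after growing delays (2s, 3s, 4s in the entries above) and gives up after a last attempt. The sketch below only illustrates that retry pattern. It is not NNI's implementation, and `call_with_retries`, `action`, `delays`, and `sleep` are hypothetical names chosen for the example.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_retries(
    action: Callable[[], T],
    delays: Sequence[float] = (2, 3, 4),  # delays seen in the log; assumed, not read from NNI
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action`; after each failure wait the next delay, then make one last attempt."""
    for delay in delays:
        try:
            return action()
        except Exception as exc:
            print(f"Failed to send command. Retry in {delay}s ({exc!r})")
            sleep(delay)
    # Last retry: any exception raised here propagates to the caller.
    return action()
```

With the default delays this makes at most four attempts, so a peer that never answers the handshake surfaces as a final `TimeoutError`, as it does in the traceback above.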
How to reproduce it?: