Open Moreau14 opened 9 months ago
I ran into the same problem: after running several trials I got this error. It turned out to be caused by running NNI inside a
screen session.
Yes, you can't run NNI inside tmux either.
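Since the comments above attribute the failure to launching NNI from inside screen or tmux, a small pre-flight check can make that situation visible before an experiment starts. The sketch below is an illustration only and not part of NNI: it relies on GNU screen exporting `STY` and tmux exporting `TMUX` to processes started inside a session, and the helper name `detect_multiplexer` is hypothetical.

```python
import os
from typing import Mapping, Optional


def detect_multiplexer(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return 'tmux', 'screen', or None based on environment variables.

    tmux sets TMUX and GNU screen sets STY for processes started inside
    a session. This helper is hypothetical and not part of NNI.
    """
    env = os.environ if env is None else env
    if env.get("TMUX"):
        return "tmux"
    if env.get("STY"):
        return "screen"
    return None


if __name__ == "__main__":
    found = detect_multiplexer()
    if found:
        print(f"Warning: running inside {found}; this thread reports NNI errors in that setup.")
    else:
        print("No screen/tmux session detected.")
```

Running this in the same shell that would start the experiment shows whether the shell is inside one of the two multiplexers mentioned here.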
dispatcher log:
ERROR (nni.runtime.msg_dispatcher_base/Thread-1) 68
Traceback (most recent call last):
  File "/home/mm/anaconda3/envs/CLNode_env/lib/python3.8/site-packages/nni/runtime/msg_dispatcher_base.py", line 108, in command_queue_worker
    self.process_command(command, data)
  File "/home/mm/anaconda3/envs/CLNode_env/lib/python3.8/site-packages/nni/runtime/msg_dispatcher_base.py", line 154, in process_command
    command_handlers[command](data)
  File "/home/mm/anaconda3/envs/CLNode_env/lib/python3.8/site-packages/nni/runtime/msg_dispatcher.py", line 148, in handle_report_metric_data
    self._handle_final_metric_data(data)
  File "/home/mm/anaconda3/envs/CLNode_env/lib/python3.8/site-packages/nni/runtime/msg_dispatcher.py", line 201, in _handle_final_metric_data
    self.tuner.receive_trial_result(id_, _trial_params[id_], value, customized=customized,
  File "/home/mm/anaconda3/envs/CLNode_env/lib/python3.8/site-packages/nni/algorithms/hpo/tpe_tuner.py", line 197, in receive_trial_result
    params = self._running_params.pop(parameter_id)
KeyError: 68
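The last frame of the traceback is a plain `dict.pop` on `_running_params`, which raises `KeyError` whenever the parameter id is not in the dictionary. The toy class below is not NNI's tuner. It is a minimal sketch with hypothetical names (`ToyTuner`, `generate`, `receive`) and made-up values, showing one way this can happen: a final result arriving twice for the same id. Whether that is what happened in this thread is not established.

```python
class ToyTuner:
    """Toy stand-in for a tuner that tracks in-flight parameters (not NNI code)."""

    def __init__(self):
        self._running_params = {}

    def generate(self, parameter_id, params):
        # Remember the parameters until the trial reports its final result.
        self._running_params[parameter_id] = params
        return params

    def receive(self, parameter_id, value):
        # Like the traceback's last frame: pop without a default raises
        # KeyError if the id is unknown or was already consumed.
        params = self._running_params.pop(parameter_id)
        return params, value


tuner = ToyTuner()
tuner.generate(68, {"lr": 0.01})   # illustrative parameters only
tuner.receive(68, 0.93)            # first final result: fine

try:
    tuner.receive(68, 0.93)        # second final result for the same id
except KeyError as err:
    print("KeyError:", err)
```

The same exception would occur if a result arrived for an id the tuner had never registered, for example after the tuner process lost its in-memory state.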
NNImanager log:
ERROR (WsChannel.default) Channel closed. Ignored command { type: 'GE', content: '1' }
[2024-01-25 11:08:42] WARNING (WsConnection.default) Missing pong
[2024-01-25 11:08:47] WARNING (WsConnection.default) Missing pong
[2024-01-25 11:08:47] ERROR (WsConnection.default) Failed sending command. Drop connection: Error: WebSocket is not open: readyState 3 (CLOSED)
    at sendAfterClose (/home/mm/anaconda3/envs/CLNode_env/lib/python3.8/site-packages/nni_node/node_modules/express-ws/node_modules/ws/lib/websocket.js:988:17)
    at WebSocket.send (/home/mm/anaconda3/envs/CLNode_env/lib/python3.8/site-packages/nni_node/node_modules/express-ws/node_modules/ws/lib/websocket.js:405:7)
    at node:internal/util:375:7
    at new Promise (<anonymous>)
    at bound send (node:internal/util:361:12)
    at WsConnection.sendAsync (/home/mm/anaconda3/envs/CLNode_env/lib/python3.8/site-packages/nni_node/common/command_channel/websocket/connection.js:92:16)
    at WsConnection.heartbeat (/home/mm/anaconda3/envs/CLNode_env/lib/python3.8/site-packages/nni_node/common/command_channel/websocket/connection.js:144:18)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)
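When comparing the two logs, it can help to pull out only the timestamped warnings and errors. The following is a rough sketch that assumes entries look like `[2024-01-25 11:08:42] WARNING (WsConnection.default) Missing pong`, as in the log above. The function name `parse_entries` is hypothetical, and real NNI logs may contain lines this pattern does not cover (untimestamped entries and stack frames are skipped).

```python
import re
from collections import Counter

# Matches entries such as:
# [2024-01-25 11:08:42] WARNING (WsConnection.default) Missing pong
ENTRY = re.compile(
    r"^\[(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<level>[A-Z]+) \((?P<source>[^)]*)\) (?P<message>.*)$"
)


def parse_entries(log_text):
    """Yield (time, level, source, message) for each matching log line."""
    for line in log_text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            yield match.group("time", "level", "source", "message")


sample = """\
[2024-01-25 11:08:42] WARNING (WsConnection.default) Missing pong
[2024-01-25 11:08:47] WARNING (WsConnection.default) Missing pong
[2024-01-25 11:08:47] ERROR (WsConnection.default) Failed sending command. Drop connection: Error: WebSocket is not open: readyState 3 (CLOSED)
    at bound send (node:internal/util:361:12)
"""

entries = list(parse_entries(sample))
counts = Counter(level for _, level, _, _ in entries)
print(counts["WARNING"], counts["ERROR"])
```

Lining up the extracted timestamps against the dispatcher log makes it easier to see whether the `KeyError` came before or after the WebSocket connection dropped.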