At the beginning, the Aim server was running properly. While the script was running, the server was killed. At the end, when closing the run, the following error was raised and the process hung:
Exception in thread Thread-380:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/dist-packages/aim/ext/cleanup/__init__.py", line 87, in _cleanup
    finalizer()
  File "/usr/lib/python3.8/weakref.py", line 566, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/usr/local/lib/python3.8/dist-packages/aim/ext/transport/remote_resource.py", line 14, in _close
    self.rpc_client.release_resource(self.handler)
  File "/usr/local/lib/python3.8/dist-packages/aim/ext/transport/client.py", line 243, in release_resource
    response = self.remote.release_resource(request, metadata=self._request_metadata)
  File "/usr/local/lib/python3.8/dist-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.8/dist-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses; last error: UNKNOWN: Failed to connect to remote host: Connection refused"
    debug_error_string = "UNKNOWN:Failed to pick subchannel {created_time:"2023-12-18T19:27:56.563282663+08:00", children:[UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: Failed to connect to remote host: Connection refused {created_time:"2023-12-18T19:27:56.563248824+08:00", grpc_status:14}]}"
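Script to reproduce:

import time

import aim

# Connect to a remote Aim server (address redacted).
run = aim.Run(repo="aim://****", log_system_params=True)

# Kill the Aim server while this loop is running.
for i in range(100):
    print("current id is ", i)
    run["test"] = "test_" + str(i)
    time.sleep(2)

print("start close")
run.close()  # the error above is raised in a cleanup thread, then the call hangs
print("complete close")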
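
A possible stopgap until the hang itself is fixed is to run the close call in a daemon thread and stop waiting after a timeout. The sketch below uses only the standard library; `call_with_timeout` is a hypothetical helper, not part of Aim, and it has not been verified against a real Aim server. It may not help if Aim's own non-daemon threads are still blocked when the interpreter exits.

```python
import threading


def call_with_timeout(func, timeout_s):
    """Run func() in a daemon thread.

    Returns True if func() finished (or raised) within timeout_s seconds,
    False if it was still running when the timeout expired.
    """
    done = threading.Event()

    def _target():
        try:
            func()
        except Exception as exc:  # e.g. a gRPC error when the server is unreachable
            print("call failed:", repr(exc))
        finally:
            done.set()

    # daemon=True so that a stuck call cannot keep the interpreter alive at exit.
    threading.Thread(target=_target, daemon=True).start()
    return done.wait(timeout_s)


# Hypothetical usage in the script above (the 30 s value is arbitrary):
# if not call_with_timeout(run.close, timeout_s=30):
#     print("run.close() did not return within 30 s; the server may be down")
```

This only stops the caller from waiting forever; any data that could not be sent to the server before it was killed is not recovered by it.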
Environment
Additional context