fabric-testbed / fabfed

FABRIC Tool-based Federation Kit for a Testbed of Testbeds
MIT License

FabricNode Connection Timed Out #106

Closed (zlion closed this issue 4 months ago)

zlion commented 6 months ago

After the slice state becomes StableOK, fabfed attempts an SSH connection to the FABRIC node through the bastion host, but the connection times out.

A related issue was found in "concurrency Issues"

#

2024-03-07 14:48:24,187 [fabric_slice_helper.py:129] [INFO] Found slice native-aws-lzhang9-1:state=StableOK
2024-03-07 14:48:24,223 [fabric_provider.py:101] [INFO] Done initializing slice native-aws-lzhang9-1
2024-03-07 14:48:24,231 [fabric_node.py:17] [INFO]  Node fabric_node0 construtor called ... 
2024-03-07 15:04:44,479 [fabric_node.py:51] [INFO]  Node fabric_node0 has stitch device=None
2024-03-07 15:08:49,534 [controller.py:248] [ERROR] [Errno 60] Operation timed out
Traceback (most recent call last):
  File "/Users/lzhang9/Projects/fabric-testbed/fabfed/fabfed/controller/controller.py", line 245, in add
    provider.add_resource(resource=resource.attributes)
  File "/Users/lzhang9/Projects/fabric-testbed/fabfed/fabfed/provider/api/provider.py", line 200, in add_resource
    raise e
  File "/Users/lzhang9/Projects/fabric-testbed/fabfed/fabfed/provider/api/provider.py", line 194, in add_resource
    self.do_add_resource(resource=resource)
  File "/Users/lzhang9/Projects/fabric-testbed/fabfed/fabfed/provider/fabric/fabric_provider.py", line 116, in do_add_resource
    self.slice.add_resource(resource=resource)
  File "/Users/lzhang9/Projects/fabric-testbed/fabfed/fabfed/provider/fabric/fabric_slice.py", line 159, in add_resource
    self._add_node(resource)
  File "/Users/lzhang9/Projects/fabric-testbed/fabfed/fabfed/provider/fabric/fabric_slice.py", line 125, in _add_node
    node = FabricNode(label=label, delegate=delegate, network_label="")
  File "/Users/lzhang9/Projects/fabric-testbed/fabfed/fabfed/provider/fabric/fabric_node.py", line 65, in __init__
    for ip_addr in self._delegate.ip_addr_list(output='json', update=False):
  File "/Users/lzhang9/opt/anaconda3/envs/fabfed/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/node.py", line 2260, in ip_addr_list
    raise e
  File "/Users/lzhang9/opt/anaconda3/envs/fabfed/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/node.py", line 2252, in ip_addr_list
    stdout, stderr = self.execute(f"sudo  ip -j addr list", quiet=True)
  File "/Users/lzhang9/opt/anaconda3/envs/fabfed/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/node.py", line 1564, in execute
    raise e
  File "/Users/lzhang9/opt/anaconda3/envs/fabfed/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/node.py", line 1417, in execute
    bastion.connect(
  File "/Users/lzhang9/opt/anaconda3/envs/fabfed/lib/python3.9/site-packages/paramiko/client.py", line 365, in connect
    sock.connect(addr)
TimeoutError: [Errno 60] Operation timed out

zlion commented 6 months ago

I reran the same tests several hours later and the problem did not recur. We will keep monitoring it in future tests.
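
Since the timeout did not recur on a later run, it may have been transient. One possible mitigation, shown here only as a minimal sketch and not as anything fabfed or fablib currently does, is to retry the failing call with exponential backoff. The helper name `retry_on_timeout` and its parameters are hypothetical; the only names taken from the traceback are `TimeoutError` and, in the commented usage line, `ip_addr_list`.

```python
import time


def retry_on_timeout(operation, attempts=5, initial_delay=1.0, backoff=2.0,
                     sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Hypothetical helper, not part of fabfed or fablib. Only TimeoutError is
    retried; any other exception propagates immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts: surface the last timeout
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # grow the wait each time


# Hypothetical usage; `node` stands for the fablib node object in the traceback:
# ip_addrs = retry_on_timeout(
#     lambda: node.ip_addr_list(output="json", update=False))
```

In the log above, roughly four minutes pass between the last INFO line and the error, so each retry of the same call could take about as long unless the underlying connection timeout is also shortened.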