Closed zlion closed 6 months ago
The unresolved issues mentioned earlier no longer appear in the latest tests. This could be due to changes in the code or the testing environment, so I propose closing this issue for the time being.
2024-01-06 16:12:04,525 [controller.py:171] [ERROR] 400 POST https://compute.googleapis.com/compute/v1/projects/fabfed/regions/us-east4/routers: Exceeded maximum allowed routers in same network and region: 5.
Solution: Use extra VPCs to host more than 5 routers.
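A minimal sketch of the "extra VPCs" workaround, assuming the hard cap of 5 Cloud Routers per network/region pair stated in the error above. The function and names (`assign_routers_to_vpcs`, `vpc-a`, `vpc-b`) are illustrative, not part of fabfed; the idea is just to fill one VPC up to the cap before spilling routers into the next:

```python
# Assumption (from the error log above): GCP allows at most 5 Cloud Routers
# in the same network and region.
MAX_ROUTERS_PER_NETWORK_REGION = 5

def assign_routers_to_vpcs(router_names, vpc_networks):
    """Map each router to a VPC so no network exceeds the per-region cap."""
    capacity = len(vpc_networks) * MAX_ROUTERS_PER_NETWORK_REGION
    if len(router_names) > capacity:
        raise ValueError(
            f"need {len(router_names)} routers but {len(vpc_networks)} VPCs "
            f"only allow {capacity} in one region"
        )
    assignment = {}
    for i, router in enumerate(router_names):
        # Fill each VPC up to the cap before moving to the next one.
        assignment[router] = vpc_networks[i // MAX_ROUTERS_PER_NETWORK_REGION]
    return assignment
```

With 7 routers and two VPCs, the first 5 land in the first VPC and the remaining 2 in the second.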
Creating VLAN attachments failed. Error: Operation type [insert] failed with message "Quota 'INTERCONNECT_ATTACHMENTS_PER_REGION' exceeded. Limit: 16.0 in region us-east4."
2024-01-15 14:42:57,710 [controller.py:171] [ERROR] An error occurred (VpnGatewayLimitExceeded) when calling the CreateVpnGateway operation: The maximum number of virtual private gateways has been reached.
More details:
Solution: pending
Solution: need to reproduce
Solution: Increase the timeout to 24*360? in the fabfed code.
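The shape of that fix, sketched with a deadline-based wait loop (the exact timeout value is still undetermined above, so it stays a parameter here; `wait_until` is an illustrative helper, not an existing fabfed function):

```python
import time

def wait_until(check, timeout_seconds, poll_interval=30):
    """Poll check() until it returns truthy or the deadline passes.

    Returns True on success, False on timeout, so callers can distinguish
    a genuine failure from simply giving up too early.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if check():
            return True
        # Never sleep past the deadline.
        time.sleep(min(poll_interval, max(0, deadline - time.monotonic())))
    return False
```

Raising `timeout_seconds` for slow provisioning steps then becomes a one-line config change rather than a code change.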
Solution: Refer to 7.
Solution: Komal found that it is an issue with VMs on the MAX site.
Your VM was created on MAX; libvirt logs suggest that the disk for the VM failed, and hence OpenStack terminated it. Unfortunately, OpenStack doesn't return this information to the AM, so the error message is not clear.
You should have access to the site head node in this case, [max-hn.fabric-testbed.net](http://max-hn.fabric-testbed.net/), and can then look for logs on the specific worker to debug. If you don't have access to the head nodes, please check with Mert/Hussam.
This is related to 5 or 6 above. FabFed reported an issue without details. The log on the AL2S AM server shows more details, as follows.
2024-01-09 22:50:33,177 - al2s-am-handler - {ansible_helper.py:145} - [TickEvent-89]- INFO - {"changed": false, "elapsed": 30, "msg": "Status code was -1 and not [200]: Connection failure: The read operation timed out", "redirected": false, "status": -1, "url": "https://api.ns.internet2.edu/v1/footprint/cloudconnect"}
More log details are as follows.
Solution: I2 pushed a fix, and it has worked fine so far.
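Until the server-side fix landed, the client-side mitigation for a transient `status: -1` read timeout like the one logged above is a bounded retry with backoff. A hedged sketch (the helper name, retry counts, and the set of "transient" statuses are assumptions, not part of the AL2S handler):

```python
import time

# Assumption: these statuses indicate a transient failure worth retrying.
# -1 is what the ansible uri log above reports on a read timeout.
TRANSIENT_STATUSES = {-1, 502, 503, 504}

def fetch_with_retry(fetch, attempts=3, backoff_seconds=5):
    """Call fetch() -> (status, body); retry transient failures, else raise."""
    last = None
    for attempt in range(attempts):
        status, body = fetch()
        if status == 200:
            return body
        last = (status, body)
        if status not in TRANSIENT_STATUSES:
            break  # a real error; retrying won't help
        time.sleep(backoff_seconds * (attempt + 1))  # linear backoff
    raise RuntimeError(f"request failed after retries: {last}")
```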
A problem I encountered when creating multiple slices simultaneously: the portal shows aws-14 as StableOK but aws-5 as StableError, while the log on the AL2S AM server shows both slices created OK. The problem appears randomly over multiple tries.
Refer to 6
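Since the StableError appears randomly while the AM logs disagree, one defensive pattern is to re-read the reported state a few times and only trust StableError when the readings agree. A sketch under that assumption (the function and state strings mirror the portal labels above; this is not fabfed code):

```python
def confirm_slice_state(get_state, retries=3):
    """Re-query a slice's reported state to filter out flaky StableError.

    Assumption: get_state() returns 'StableOK' or 'StableError' as shown
    in the portal. A StableError is only trusted if every reading agrees.
    """
    readings = [get_state() for _ in range(retries)]
    if all(r == "StableError" for r in readings):
        return "StableError"
    if "StableOK" in readings:
        return "StableOK"
    return "Unknown"
```

This doesn't fix the underlying race, but it keeps a single spurious StableError reading from failing an otherwise healthy run.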