Closed Strecoza33 closed 1 year ago
Networking outage in a Hetzner-owned network switch at 2023-04-02, 02:24 AM UTC. https://status.hetzner.com/incident/955f29d2-cc92-44a8-b168-f59a64091c26 The Hetzner network failure was resolved at 2023-04-02, 03:05 AM UTC, and the IT infrastructure returned to normal.
This network failure (correctly) caused automated migration of the two affected validators to spare capacity in a different datacenter at roughly the same time the network outage started (2023-04-02, 02:24 AM UTC)
As a result, block creation paused and the validators that were migrated onto spare systems needed to catch up with the rest of the network. Catch-up was completed at 2023-04-02, 04:18 AM UTC and block creation resumed.
No manual intervention was required
@Strecoza33 Please review this report for accuracy and completeness before closing
Revdefine shows block creation stopped at approx 10:40pm ET. Last block created was 1800639.
L3 Response - There has been a networking hardware failure in the hetzner datacenter. In response, Nomad has migrated two validators to a different hetzner datacenter. Those migrated validators are catching back up with the network by retrieving missing blocks. This is normal, expected recovery behavior from a network hardware failure It may be a couple of hours before the network is "whole" again.