tailscale / tailscale

The easiest, most secure way to use WireGuard and 2FA.
https://tailscale.com
BSD 3-Clause "New" or "Revised" License

Build a coordination server status page #1553

Closed by rosszurowski 2 years ago

rosszurowski commented 3 years ago

A few users have written in asking for a status page indicating whether the coordination server is online or having issues. We've held off on this because Tailscale is designed to function even without the control server, so rare outages already have minimal impact. The main restriction is that you can't add new devices to your network while the control server is down.

There's still value in having a status page: it lets us communicate our resiliency to users, reassures them that the control server isn't down while they're debugging networking problems, and gives them a place to look when there is an outage.

Our status page will need to distinguish between:

mnaser commented 3 years ago

Question: wouldn't relay server outages or issues cause client-facing problems that users might struggle to debug?

DentonGentry commented 3 years ago

For the relay servers: if a relay server is down, clients fail over relatively quickly to the next-closest relay. As an example, the Dallas/Fort Worth relays were down for several hours a few weeks ago while their hosting provider upgraded its power distribution. Clients shifted to Chicago and other nearby relays.

benz2012 commented 2 years ago

+1 for this status page / communication stream (and +1 from my team). The sanity check would be much appreciated when we see authentication issues like the ones caused by https://github.com/tailscale/tailscale/issues/4168

We force our users to re-authenticate daily, which, from what we've heard, makes us an outlier. Unfortunately, that also means we're heavily affected by these outages.

Also, because of the nature of our business, we frequently add new devices to the network, which would also be affected by the outages described in this issue's description.

andrew-d commented 2 years ago

We now have a status page at https://status.tailscale.com, which tracks the current status of the main Tailscale components and has a section for ongoing and recent incidents.