Scalingo / link

LinK is not Keepalived - Virtual IP manager backed by etcd

[Healthcheck] Protocol dependent checks #61

Open · johnsudaar opened this issue 5 years ago

johnsudaar commented 5 years ago

Health checks should be protocol dependent.

If we're using LinK for a distributed PGSQL setup, we should have a PGSQL health check. A TCP check is not enough, especially if the backend is proxied by another service (HAProxy, nginx).
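
A minimal sketch of what such a protocol-aware check could look like in Go, assuming a simple `Check() error` shape; the `TCPChecker`/`PostgreSQLChecker` names and the `lib/pq` driver choice are illustrative, not LinK's actual API:

```go
package healthcheck

import (
	"context"
	"database/sql"
	"fmt"
	"net"
	"time"

	_ "github.com/lib/pq" // hypothetical choice of PostgreSQL driver
)

// TCPChecker only verifies that something accepts connections on the port.
// If a proxy (HAProxy, nginx) sits in front of the backend, this succeeds
// even when the backend itself is down.
type TCPChecker struct {
	Addr string
}

func (c TCPChecker) Check() error {
	conn, err := net.DialTimeout("tcp", c.Addr, 5*time.Second)
	if err != nil {
		return err
	}
	return conn.Close()
}

// PostgreSQLChecker goes one step further and round-trips the PostgreSQL
// protocol, so a proxy with a dead backend fails the check.
type PostgreSQLChecker struct {
	DSN string // e.g. "postgres://user:pass@10.0.0.1:5432/db?sslmode=disable"
}

func (c PostgreSQLChecker) Check() error {
	db, err := sql.Open("postgres", c.DSN)
	if err != nil {
		return fmt.Errorf("invalid DSN: %w", err)
	}
	defer db.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return db.PingContext(ctx) // fails if the DB behind the proxy is down
}
```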

Soulou commented 5 years ago

I disagree here. LinK's role is not to ping the backend, but to check that HAProxy is present. If the backend is down, the IP will completely disappear, no one will be able to use it, and apps will get 'no route to host'.

That's not what we want. We want to reach one HAProxy (the one holding the IP), which will keep the connections for up to 60 seconds (default configuration) and forward them to the backend once it's up again.

johnsudaar commented 5 years ago

LinK's goal is to manage an IP and fail over if its backend is not able to do its work.

If there's no backend available, the current host is not healthy and should not hold the IP.

Soulou commented 5 years ago

But that's not the behavior we want: you're going to drop more connections and create more unavailability doing this.

Soulou commented 5 years ago

And in our use case, the backend is an HAProxy instance, not what HAProxy is proxying to; that has to be monitored differently.

johnsudaar commented 5 years ago

We can add a TimeBeforeFail on the health check if that's what's worrying you (and set it to 70s on PGSQL).
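
A minimal sketch of how such a grace period could wrap any check, assuming the same illustrative `Check() error` shape as above; only the `TimeBeforeFail` name comes from the comment, the rest is hypothetical:

```go
package healthcheck

import "time"

// Checker is the underlying probe (TCP, PostgreSQL, ...).
type Checker interface {
	Check() error
}

// GraceChecker only reports a failure once the wrapped check has been
// failing continuously for longer than TimeBeforeFail (e.g. 70s on PGSQL,
// so a 60s HAProxy connection hold can ride out a short outage).
// Illustrative only: not safe for concurrent use.
type GraceChecker struct {
	Inner          Checker
	TimeBeforeFail time.Duration

	failingSince time.Time // zero value means "currently healthy"
}

func (g *GraceChecker) Check() error {
	err := g.Inner.Check()
	if err == nil {
		g.failingSince = time.Time{} // reset the failure streak on success
		return nil
	}
	if g.failingSince.IsZero() {
		g.failingSince = time.Now() // first failure of this streak
	}
	if time.Since(g.failingSince) < g.TimeBeforeFail {
		return nil // still within the grace period: report healthy
	}
	return err
}
```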

Soulou commented 5 years ago

Not only on PGSQL, on all of them.

Soulou commented 5 years ago

But still, it's not actually a TimeBeforeFail that we want:

  1. The DB fails.
  2. 50 seconds later, the user restarts their app and new connections arrive.
  3. I want them to reach HAProxy, which will hold them for 60 seconds and retry the backend during this period.

-> So it's not even 70s; the point is just to have something handling the connections.

johnsudaar commented 5 years ago

Okay, and if there is a SAND issue on one host, and on that host the connection between HAProxy and the nodes is broken, we will never fail over, even if we have a perfectly healthy host in the cluster.

johnsudaar commented 5 years ago

The timeout-and-crash thing is an edge case, and if the DB is down for more than 60s the user will see errors anyway. What's the point of routing the IP to a non-functioning server? I get the master failover part: if it's down for less than 60s, we do not want to fail over. But in your previous example the user will still see errors.

Soulou commented 5 years ago

To me it's much, much less of an edge case than Linux failing in the vxlan networking stack. (It's not SAND; SAND just does the setup, nothing else after that.)

I'm not saying the user won't see errors if the downtime is long, but if it's a transient 30-second error related to load, it would be almost transparent (except for the delay itself), and the connections would then be forwarded to the backend.

We would route to a fully working IP, which is accepting connections and waiting for a backend to be ready (all proxies would be in that state). I prefer having clients take their chances, getting a connection and waiting up to 60 seconds, over dropping all the packets by default.

johnsudaar commented 5 years ago

But after 60s you would drop the packets anyway. If it's a 30s error related to load, we won't fail over, because 30s is less than the configured 60s.

The IP is not fully working if there's no backend behind it.

johnsudaar commented 5 years ago

Plus, it's not only the vxlan networking stack. It could also be the networking between the host running HAProxy and the host running the current master PGSQL DB.

Soulou commented 5 years ago

Yes, we would drop the packets after 60 seconds, but they would have been retried during those 60 seconds, increasing the chances that they get through to the backend, compared to dropping them at once.

If it's a 180s incident related to load, with 10 connections arriving at 50s and 10 connections created at 160s, the 10 late ones would go through (the incident ends within their 60s hold window); with your approach you would drop all 20 of them.

I think if the networking fails between hosts of a virtual infrastructure, that would be the least of our problems...

EtienneM commented 5 years ago

I think that LinK health checks should only be about HAProxy and not what is behind it, for all the reasons @Soulou gave. But @johnsudaar has a point: if there is a network issue between HAProxy and the backend, the LinK IP should move to another HAProxy.

For that, we could add another port on HAProxy with a new service running behind it, which is a health check service. The LinK agent queries that health check port. The service can, for instance, create a TCP connection to the backend in order to see if there is an issue there (see the sketch after this comment).

With this solution, LinK only health checks HAProxy's healthiness, doesn't it?
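
A minimal sketch of such a sidecar, assuming it exposes a plain HTTP endpoint on a local port that the LinK agent can point its existing check at; the backend address, port, and `/health` path are all illustrative:

```go
package main

import (
	"net"
	"net/http"
	"time"
)

const backendAddr = "10.0.0.1:5432" // illustrative: the real backend behind HAProxy

// healthHandler probes the backend and reports 200 or 503, so the LinK
// agent only ever talks to this local port, never to the backend itself.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	conn, err := net.DialTimeout("tcp", backendAddr, 2*time.Second)
	if err != nil {
		http.Error(w, "backend unreachable: "+err.Error(), http.StatusServiceUnavailable)
		return
	}
	conn.Close()
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/health", healthHandler)
	// The LinK agent would health check 127.0.0.1:8081 instead of the backend.
	http.ListenAndServe("127.0.0.1:8081", nil)
}
```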

Soulou commented 5 years ago

> For that, we could add another port on HAProxy with a new service running behind it, which is a health check service. The LinK agent queries that health check port. The service can, for instance, create a TCP connection to the backend in order to see if there is an issue there.

Not a TCP connection (it would have the same problem of reaching the backend and producing weird log lines), but an ICMP ping, for instance.
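
That variant could replace the TCP probe in the sidecar sketched above. Shelling out to the system `ping` avoids the raw-socket privileges an in-process ICMP implementation would need; the host, port, and flags are illustrative (`-W` is the Linux `ping` reply timeout):

```go
package main

import (
	"net/http"
	"os/exec"
)

const backendHost = "10.0.0.1" // illustrative: host of the real backend

// pingHandler sends a single ICMP echo request; unlike a TCP probe, it
// never shows up as a half-open connection in the backend's logs.
func pingHandler(w http.ResponseWriter, r *http.Request) {
	// -c 1: send one probe; -W 1: wait at most 1 second for the reply.
	if err := exec.Command("ping", "-c", "1", "-W", "1", backendHost).Run(); err != nil {
		http.Error(w, "backend unreachable: "+err.Error(), http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/health", pingHandler)
	http.ListenAndServe("127.0.0.1:8081", nil)
}
```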