Closed Erick-Reyes closed 1 year ago
If using systemd, the signal sent to the Teleport process on restart is -HUP. That signal forks a new Teleport daemon to serve new connections and initiates the graceful shutdown of the existing process when there are no more clients connected to it.
Ref: https://goteleport.com/docs/reference/signals/
If you upgrade Teleport while the Proxy is still in rotation and has not been fully drained of connections, the process may hang until all clients disconnect.
If the load balancer sends connections to this Proxy in the meantime, the following error message appears in the Web UI upon authentication:
"Internal error - rpc error: code = Canceled desc = grpc: the client connection is closing"
The proper way to upgrade is to remove the Proxy from the load balancer, drain all connections, and then upgrade the Proxy instance.
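To make the ordering above concrete, here is a minimal sketch of the remove, drain, then upgrade sequence. It is not a Teleport tool: `deregister`, `active_connections`, and `upgrade` are hypothetical callables you would back with your own load balancer API, connection count, and package upgrade step, and the timeout values are arbitrary.

```python
import time
from typing import Callable


def drain_then_upgrade(
    deregister: Callable[[], None],
    active_connections: Callable[[], int],
    upgrade: Callable[[], None],
    timeout_s: float = 300.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Remove the Proxy from rotation, wait for it to drain, then upgrade.

    All three callables are hypothetical hooks supplied by the caller.
    Returns True if the upgrade ran, False if the drain timed out.
    """
    # Step 1: stop the load balancer from sending new connections.
    deregister()

    # Step 2: wait until existing clients have disconnected.
    deadline = time.monotonic() + timeout_s
    while active_connections() > 0:
        if time.monotonic() >= deadline:
            # Still serving clients: skip the upgrade rather than run it now.
            return False
        sleep(poll_interval_s)

    # Step 3: the instance is drained, so the upgrade can proceed.
    upgrade()
    return True
```

The point of the sketch is only the order of operations: the upgrade step is never reached while the instance is still in rotation or still has clients attached.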
Talked with @zmb3 about this. We're thinking about handling this in two ways.
/healthz and /readyz to inform the LB that connections should not be forwarded to this proxy anymore.
@travelton Because this is a nice to have, we're not scheduling this right now. We'll keep it in mind as a good starter issue for the future.
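For anyone picking up the /healthz and /readyz idea later, a rough sketch of the behavior being discussed: the process stays "healthy" while shutting down, but reports "not ready" so the LB stops forwarding to it. This is an illustration only, not how Teleport implements its diagnostic endpoints; `draining`, `readyz_response`, and `DiagHandler` are made-up names, and the 503 status is an assumption about what the LB health check would treat as failing.

```python
import threading
from http.server import BaseHTTPRequestHandler
from typing import Tuple

# Hypothetical flag: set once graceful shutdown has started.
draining = threading.Event()


def readyz_response(is_draining: bool) -> Tuple[int, str]:
    """Return the (status code, body) a readiness probe should see."""
    if is_draining:
        # Tell the LB to stop sending new connections to this instance.
        return 503, "draining"
    return 200, "ok"


class DiagHandler(BaseHTTPRequestHandler):
    """Illustrative handler serving /healthz and /readyz."""

    def do_GET(self) -> None:
        if self.path == "/healthz":
            # Liveness: the process is up, even while it drains.
            code, body = 200, "ok"
        elif self.path == "/readyz":
            code, body = readyz_response(draining.is_set())
        else:
            code, body = 404, "not found"
        payload = body.encode()
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

In this sketch the shutdown path would call `draining.set()` before it starts waiting for clients to disconnect, so the LB health check fails first and existing connections can finish on their own.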
@russjones @zmb3 customer checked/verified the systemd services and states all the processes/services are fine, they are not attempting to restart.
A little more information. We originally upgraded from 9.x -> 10.x -> 11.x. The issue started happening in 10.x and 11.x. The proxy and auth servers are all new instances. The upgrade process is as follows:
The autoscaling group removes and terminates the old instance first before launching the new instance.
Is it possible to know what versions will include fix #23691?
@kelcya the next v10, v11, and v12 releases will contain the fix. Those will be v12.1.3, v11.3.10, and v10.3.15.
Expected behavior: Log in to your Teleport cluster through the Web UI without any issues (Okta SAML).
Current behavior: When you log in to the Teleport cluster via the Web UI (Okta SAML), it sometimes greets you with the error below:
"Internal error - rpc error: code = Canceled desc = grpc: the client connection is closing"
The error goes away when you refresh the page.
Bug details:
Debug logs: Auth Server logs:
Okta-SAML config:
Screenshot of developer tool error on a check:
We also have a HAR file upon request.
Extras: