Open polarina opened 1 year ago
Hmmm... on 14.1.1 Teleport seems to recover for me.
2023-10-25T13:25:49-06:00 INFO [PROXY:SER] SSH proxy service 14.1.1:v14.1.1-0-gfb6429e is starting on [::]:3080 pid:2574.1 service/service.go:4305
2023-10-25T13:25:49-06:00 INFO [PROC:1] The new service has started successfully. Starting syncing rotation status with period 10m0s. pid:2574.1 service/connect.go:684
2023-10-25T13:25:49-06:00 INFO [PROXY:AGE] Starting reverse tunnel agent pool. service/service.go:4349
2023-10-25T13:25:49-06:00 INFO [PROXY:PRO] Starting Kube proxy on [::]:3080. service/service.go:4455
2023-10-25T13:25:49-06:00 INFO [DB:SERVIC] Starting Database Postgres proxy server on [::]:3080. pid:2574.1 service/service.go:4554
2023-10-25T13:25:49-06:00 INFO [DB:SERVIC] Starting Database TLS proxy server on 0.0.0.0:3080. pid:2574.1 service/service.go:4572
2023-10-25T13:25:49-06:00 INFO [PROC:1] Starting proxy gRPC server on [::]:3080. pid:2574.1 service/service.go:5937
2023-10-25T13:25:49-06:00 INFO [PROC:1] Starting proxy gRPC server on [::]:3080. pid:2574.1 service/service.go:6011
2023-10-25T13:25:49-06:00 INFO [PROXY:SER] Starting TLS ALPN SNI proxy server on [::]:3080. pid:2574.1 service/service.go:4649
2023-10-25T13:25:49-06:00 INFO [PROC:1] New service has started successfully. pid:2574.1 service/service.go:1258
2023-10-25T13:25:49-06:00 INFO [PROC:1] Received message from pid 2574: Process 2574 has started. path:/usr/local/bin/teleport pid:2557.1 workingDir:/ service/signals.go:541
2023-10-25T13:25:49-06:00 INFO [PROC:1] Child process signals success. path:/usr/local/bin/teleport pid:2574 workingDir:/ service/signals.go:549
2023-10-25T13:25:49-06:00 INFO [PROC:1] Successfully started new process, shutting down gracefully. pid:2557.1 service/signals.go:121
2023-10-25T13:25:49-06:00 WARN [DIAG:1] Diagnostic server exited with error: accept tcp [::1]:53080: use of closed network connection. pid:2557.1 service/service.go:3146
2023-10-25T13:25:49-06:00 WARN [PROC:1] Teleport process has exited with error. error:[
ERROR REPORT:
Original Error: *errors.errorString context canceled
Stack Trace:
github.com/gravitational/teleport/lib/auth/server_info.go:68 github.com/gravitational/teleport/lib/auth.(*Server).ReconcileServerInfos
github.com/gravitational/teleport/lib/service/service.go:2091 github.com/gravitational/teleport/lib/service.(*TeleportProcess).initAuthService.func7
github.com/gravitational/teleport/lib/service/supervisor.go:550 github.com/gravitational/teleport/lib/service.(*LocalService).Serve
github.com/gravitational/teleport/lib/service/supervisor.go:281 github.com/gravitational/teleport/lib/service.(*LocalSupervisor).serve.func1
runtime/asm_arm64.s:1197 runtime.goexit
User Message: context canceled] pid:2557.1 service:auth.server_info service/supervisor.go:286
2023-10-25T13:25:49-06:00 INFO [PROC:1] Signaled success to parent process. pid:2574.1 service/signals.go:211
2023-10-25T13:25:49-06:00 INFO [PROXY:SER] Shutting down gracefully. pid:2557.1 service/service.go:4721
2023-10-25T13:25:49-06:00 INFO [DIAG:1] Shutting down gracefully. pid:2557.1 service/service.go:3156
2023-10-25T13:25:49-06:00 INFO [DIAG:1] Exited. pid:2557.1 service/service.go:3160
2023-10-25T13:25:49-06:00 INFO [AUTH:1] Shutting down immediately (auth service does not currently support graceful shutdown). pid:2557.1 service/service.go:2110
2023-10-25T13:25:49-06:00 INFO [UPLOAD:1] File upload completer is shutting down. pid:2557.1 service/service.go:2945
2023-10-25T13:25:49-06:00 INFO [UPLOAD:1] File upload completer has shut down. pid:2557.1 service/service.go:2947
2023-10-25T13:25:49-06:00 INFO [UPLOAD:1] File uploader is shutting down. pid:2557.1 service/service.go:2913
2023-10-25T13:25:49-06:00 INFO [UPLOAD:1] File uploader has shut down. pid:2557.1 service/service.go:2915
2023-10-25T13:25:49-06:00 INFO [PROXY:SER] Exited. pid:2557.1 service/service.go:4109
2023-10-25T13:25:49-06:00 INFO [WEB] Closing session cache. web/sessions.go:692
2023-10-25T13:25:49-06:00 INFO [WEB] Closing session cache. web/sessions.go:692
2023-10-25T13:25:49-06:00 WARN [PROC:1] Teleport process has exited with error. error:[
ERROR REPORT:
Original Error: *trace.ConnectionProblemError listener is closed
Stack Trace:
github.com/gravitational/teleport/lib/srv/alpnproxy/listener.go:85 github.com/gravitational/teleport/lib/srv/alpnproxy.(*ListenerMuxWrapper).Accept
google.golang.org/grpc@v1.58.2/server.go:859 google.golang.org/grpc.(*Server).Serve
github.com/gravitational/teleport/lib/service/service.go:5938 github.com/gravitational/teleport/lib/service.(*TeleportProcess).initPublicGRPCServer.func1
github.com/gravitational/teleport/lib/service/supervisor.go:550 github.com/gravitational/teleport/lib/service.(*LocalService).Serve
github.com/gravitational/teleport/lib/service/supervisor.go:281 github.com/gravitational/teleport/lib/service.(*LocalSupervisor).serve.func1
runtime/asm_arm64.s:1197 runtime.goexit
User Message: listener is closed] pid:2557.1 service:proxy.grpc.public service/supervisor.go:286
2023-10-25T13:25:49-06:00 WARN [PROC:1] Teleport process has exited with error. error:[
ERROR REPORT:
Original Error: *trace.ConnectionProblemError listener is closed
Stack Trace:
github.com/gravitational/teleport/lib/srv/alpnproxy/listener.go:85 github.com/gravitational/teleport/lib/srv/alpnproxy.(*ListenerMuxWrapper).Accept
google.golang.org/grpc@v1.58.2/server.go:859 google.golang.org/grpc.(*Server).Serve
github.com/gravitational/teleport/lib/service/service.go:6012 github.com/gravitational/teleport/lib/service.(*TeleportProcess).initSecureGRPCServer.func1
github.com/gravitational/teleport/lib/service/supervisor.go:550 github.com/gravitational/teleport/lib/service.(*LocalService).Serve
github.com/gravitational/teleport/lib/service/supervisor.go:281 github.com/gravitational/teleport/lib/service.(*LocalSupervisor).serve.func1
runtime/asm_arm64.s:1197 runtime.goexit
User Message: listener is closed] pid:2557.1 service:proxy.grpc.secure service/supervisor.go:286
2023-10-25T13:25:49-06:00 INFO [PROXY:SER] Exited. pid:2557.1 service/service.go:4775
2023-10-25T13:25:49-06:00 INFO [AUTH:1] Exited. pid:2557.1 service/service.go:2125
2023-10-25T13:25:49-06:00 WARN [AUTH:1] Mux serve failed. error:[
ERROR REPORT:
Original Error: *trace.ConnectionProblemError listener is closed
Stack Trace:
github.com/gravitational/teleport/lib/multiplexer/wrappers.go:137 github.com/gravitational/teleport/lib/multiplexer.(*Listener).Accept
crypto/tls/tls.go:66 crypto/tls.(*listener).Accept
github.com/gravitational/teleport/lib/multiplexer/tls.go:110 github.com/gravitational/teleport/lib/multiplexer.(*TLSListener).Serve
github.com/gravitational/teleport/lib/auth/middleware.go:271 github.com/gravitational/teleport/lib/auth.(*TLSServer).Serve.func1
runtime/asm_arm64.s:1197 runtime.goexit
User Message: listener is closed
listener is closed] auth/middleware.go:272
2023-10-25T13:25:49-06:00 INFO [PROC:1] All services stopped, exiting. pid:2557.1 service/signals.go:123
2023-10-25T13:25:49-06:00 WARN [PROC:1] Teleport process has exited with error. error:[
ERROR REPORT:
Original Error: *trace.ConnectionProblemError error reading from server: EOF
Stack Trace:
github.com/gravitational/teleport/api@v0.0.0/client/client.go:820 github.com/gravitational/teleport/api/client.(*Client).Ping
github.com/gravitational/teleport/lib/service/awsoidc.go:56 github.com/gravitational/teleport/lib/service.(*TeleportProcess).initDeployServiceUpdater
github.com/gravitational/teleport/lib/service/supervisor.go:550 github.com/gravitational/teleport/lib/service.(*LocalService).Serve
github.com/gravitational/teleport/lib/service/supervisor.go:281 github.com/gravitational/teleport/lib/service.(*LocalSupervisor).serve.func1
runtime/asm_arm64.s:1197 runtime.goexit
User Message: error reading from server: EOF] pid:2574.1 service:update.aws-oidc.deploy.agents service/supervisor.go:286
2023-10-25T13:25:49-06:00 WARN [PROC:1] Non-critical service update.aws-oidc.deploy.agents has exited with error error reading from server: EOF, continuing to operate. pid:2574.1 service/signals.go:156
Not the cleanest restart. I temporarily see an unhealthy cache:
2023-10-25T13:25:49-06:00 WARN [PROXY:BEA] Heartbeat failed EOF. srv/heartbeat.go:282
2023-10-25T13:25:49-06:00 INFO [PROC:1] Detected Teleport component "proxy" is running in a degraded state. pid:2574.1 service/state.go:106
2023-10-25T13:25:49-06:00 WARN [PROXY:1:C] Re-init the cache on error error:[
ERROR REPORT:
Original Error: *trace.ConnectionProblemError watcher is closed: error reading from server: EOF
Stack Trace:
github.com/gravitational/teleport/lib/cache/cache.go:1185 github.com/gravitational/teleport/lib/cache.(*Cache).fetchAndWatch
github.com/gravitational/teleport/lib/cache/cache.go:983 github.com/gravitational/teleport/lib/cache.(*Cache).update
runtime/asm_arm64.s:1197 runtime.goexit
User Message: watcher is closed: error reading from server: EOF] cache/cache.go:989
2023-10-25T13:25:50-06:00 WARN [PROXY] Restart watch on error: error reading from server: EOF. pid:2574.1 resource-kind:[{lock false map[] {} [] 0}] services/watcher.go:259
2023-10-25T13:25:51-06:00 WARN [PROXY] Restart watch on error: watcher is closed: watcher closed. pid:2574.1 resource-kind:[{node false map[] {} [] 0}] services/watcher.go:259
2023-10-25T13:25:52-06:00 WARN [PROXY:1] Restart watch on error: watcher is closed: watcher closed. pid:2574.1 resource-kind:[{proxy false map[] {} [] 0}] services/watcher.go:259
2023-10-25T13:25:53-06:00 WARN [PROXY] Restart watch on error: watcher is closed: watcher closed. pid:2574.1 resource-kind:[{cert_authority false map[] {} [] 0}] services/watcher.go:259
But then things settle a few seconds later:
2023-10-25T13:25:54-06:00 INFO [PROC:1] Teleport component "proxy" is recovering from a degraded state. pid:2574.1 service/state.go:120
2023-10-25T13:25:55-06:00 WARN [PROXY:SER] Restart watch on error: watcher is closed: watcher closed. pid:2574.1 resource-kind:[{kube_server false map[] {} [] 0}] services/watcher.go:259
2023-10-25T13:26:04-06:00 INFO [PROC:1] Teleport component "proxy" has recovered from a degraded state. pid:2574.1 service/state.go:124
Issuing
systemctl reload teleport.service
causes the teleport service to crash and die.In addition, the process exits with a status code of 0, which fails to trigger the
Restart=on-failure
condition of the systemd service.The diagnostics server has to be enabled in
/etc/teleport.yml
for this issue to trigger:We run a separate backround process that issues requests http://[::1]:53080/healthz and http://[::1]:53080/readyz once every second.