vmware-tanzu-labs / educates-training-platform

A platform for hosting interactive workshop environments in Kubernetes, or on top of a local container runtime.
https://docs.educates.dev
Apache License 2.0

Startup/liveness probe timeout too short. #475

Closed GrahamDumpleton closed 6 days ago

GrahamDumpleton commented 6 days ago

Describe the bug

The startup and liveness probes for the secrets manager and session manager use the default timeout of 1 second. For the secrets manager, when there are a large number of workshops (and thus SecretCopiers), there is a risk that reconciliation after a new namespace or secret is added could take too long and block handling of the probe, because kopf is partly based on asyncio libraries and so has limited threaded parallelism. The result could be a failed probe and a restart, and the restart could also fail if the Kubernetes control plane is running slowly.
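The effect described above can be reproduced in isolation. The following is a minimal sketch that uses only the Python standard library and is not the operator's actual code: `blocking_reconcile` and `answer_probe` are hypothetical stand-ins for a slow reconciliation and a probe handler sharing one asyncio event loop.

```python
import asyncio
import time

PROBE_TIMEOUT_OLD = 1.0  # Kubernetes default timeoutSeconds
PROBE_TIMEOUT_NEW = 5.0  # value proposed in this issue


async def blocking_reconcile(seconds: float) -> None:
    # Hypothetical stand-in for a slow reconciliation pass. time.sleep()
    # is synchronous, so nothing else on the event loop can run meanwhile.
    time.sleep(seconds)


async def answer_probe() -> str:
    # Hypothetical stand-in for the probe handler. It is trivial, but it
    # can only run once the event loop is free again.
    return "ok"


async def measure_probe_latency(block_seconds: float) -> float:
    # Schedule the reconciliation first, then the probe, and measure how
    # long the probe waits before it gets answered.
    started = time.monotonic()
    reconcile = asyncio.create_task(blocking_reconcile(block_seconds))
    probe = asyncio.create_task(answer_probe())
    await probe
    latency = time.monotonic() - started
    await reconcile
    return latency


if __name__ == "__main__":
    latency = asyncio.run(measure_probe_latency(1.2))
    print(f"probe answered after {latency:.2f}s")
    print("fails 1 second timeout:", latency > PROBE_TIMEOUT_OLD)
    print("fits 5 second timeout:", latency < PROBE_TIMEOUT_NEW)
```

With the event loop blocked for 1.2 seconds, the probe response arrives too late for a 1 second timeout but well within a 5 second one.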

To provide more than enough buffer for such delays in handling the probes, increase the timeout from 1 second to 5 seconds.

The period between probes can perhaps also be adjusted, along with the startup delay.
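As a rough illustration of the settings involved, the sketch below builds a probe definition as a plain Python dictionary using the standard Kubernetes probe field names. Only the 5 second timeout comes from this issue. The `/healthz` path, port `8080`, and the helper names are assumptions for illustration, and the remaining values are the Kubernetes defaults rather than the values used by Educates.

```python
def make_http_probe(
    path: str = "/healthz",          # assumed endpoint, not confirmed here
    port: int = 8080,                # assumed port, not confirmed here
    timeout_seconds: int = 5,        # raised from the default of 1
    period_seconds: int = 10,        # Kubernetes default
    failure_threshold: int = 3,      # Kubernetes default
    initial_delay_seconds: int = 0,  # Kubernetes default
) -> dict:
    # Return a probe spec shaped like a Kubernetes container probe.
    return {
        "httpGet": {"path": path, "port": port},
        "initialDelaySeconds": initial_delay_seconds,
        "periodSeconds": period_seconds,
        "timeoutSeconds": timeout_seconds,
        "failureThreshold": failure_threshold,
    }


def approximate_failure_window(probe: dict) -> int:
    # Rough number of seconds of consecutive probe failures tolerated
    # before Kubernetes acts on the container: the initial delay plus
    # failureThreshold * periodSeconds.
    return (
        probe["initialDelaySeconds"]
        + probe["failureThreshold"] * probe["periodSeconds"]
    )


if __name__ == "__main__":
    probe = make_http_probe()
    print(probe)
    print("approximate failure window (s):", approximate_failure_window(probe))
```

Raising `periodSeconds` or `failureThreshold` widens the window before a restart is triggered, while `timeoutSeconds` only controls how long a single probe may take.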

Additional information

No response