joberdick-rcd opened this issue 2 years ago
Hm. That is strange. Thanks for reporting.
The only suspicious place in this regard is `WebhookContextManager` & `WebhookContextManagerMeta`, which are the parents of `kopf.WebhookServer`. It is possible that they cause multiple results to be yielded instead of at most one, which are then consumed in `admission_webhook_server()`. I have never seen this happen before, but I will take a look.
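To illustrate what "multiple results yielded" would mean here, a minimal sketch that assumes a hypothetical server consumed as an async iterator of client configs (names are illustrative, not Kopf's actual internals): if the iterator yields more than once, the consumer reconfigures on every extra item.

```python
import asyncio
from typing import AsyncIterator

# Hypothetical stand-ins for illustration only; not Kopf's real classes.
ClientConfig = dict

async def webhook_server() -> AsyncIterator[ClientConfig]:
    # A well-behaved server yields its client config once and then just serves
    # requests; any extra yield would look like a "new" config to the consumer.
    yield {"url": "https://localhost:42677"}
    # If a parent class accidentally caused a second yield here, the consumer
    # below would reconfigure the webhooks again.

async def consume(server: AsyncIterator[ClientConfig]) -> None:
    async for client_config in server:
        # Each yielded config is treated as a change and re-applied to the
        # cluster's webhook configurations.
        print(f"Reconfiguring webhooks with {client_config}")

asyncio.run(consume(webhook_server()))
```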
@nolar Thanks. If you can't repro, I can try to provide setup/steps.
@joberdick-rcd Yes, I was not able to reproduce it. Everything works fine for me.
I used K3s 1.21.9+k3s1, Python 3.9.7, Kopf 1.35.3. The virtualenv was originally clean and populated via `pip install certbuilder -e .`.
Can you please run `kopf run` with `-v` and share the logs? It might add some insights.
Also, can you clarify how the SSL certs are generated? Maybe something is wrong with the certs. I tried both ways: with the self-signed certs and with the same certs dumped and loaded from the file system. But if the certs are generated elsewhere, that could have an effect.
In particular, either that cert must be CA-capable, or `cadata=`/`cafile=` must be specified to `WebhookServer()` explicitly. Kopf's generated certs are CA-capable (link). (I am not sure about certs issued by a system-trusted CA, but those should work too.)
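For clarity, a minimal sketch of what I mean, assuming the certs live on disk; the paths are placeholders, and `certfile`/`pkeyfile`/`cafile` are the parameters I would expect to use for externally generated certs:

```python
import kopf

@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    # Placeholder paths; point them at the externally generated files.
    settings.admission.server = kopf.WebhookServer(
        certfile='/path/to/tls.crt',   # the serving certificate
        pkeyfile='/path/to/tls.key',   # its private key
        cafile='/path/to/ca.crt',      # CA bundle to advertise to the cluster,
                                       # needed if tls.crt is not CA-capable
    )
    settings.admission.managed = 'something.something'
```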
@nolar I will work on getting you more info
@nolar here are steps to reproduce:

1. Create test.py with the following handlers:

import kopf

@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    settings.admission.server = kopf.WebhookServer()
    settings.admission.managed = 'something.something'

@kopf.on.validate('something.something', persistent=True)
def inputvalidate(spec, **_):
    if 'something' not in spec:
        raise kopf.AdmissionError("Missing required value: spec.something")
kopf run test.py -v
[2022-04-28 10:20:26,498] kopf._core.reactor.r [DEBUG ] Starting Kopf 1.35.3.
[2022-04-28 10:20:26,498] kopf.activities.star [DEBUG ] Activity 'configure' is invoked.
[2022-04-28 10:20:26,499] kopf.activities.star [INFO ] Activity 'configure' succeeded.
[2022-04-28 10:20:26,499] kopf._core.engines.a [INFO ] Initial authentication has been initiated.
[2022-04-28 10:20:26,500] kopf.activities.auth [DEBUG ] Activity 'login_via_client' is invoked.
[2022-04-28 10:20:26,506] kopf.activities.auth [DEBUG ] Client is configured via kubeconfig file.
[2022-04-28 10:20:26,506] kopf.activities.auth [INFO ] Activity 'login_via_client' succeeded.
[2022-04-28 10:20:26,507] kopf._core.engines.a [INFO ] Initial authentication has finished.
[2022-04-28 10:20:26,520] kopf._cogs.clients.w [DEBUG ] Starting the watch-stream for customresourcedefinitions.v1.apiextensions.k8s.io cluster-wide.
[2022-04-28 10:20:26,522] kopf._kits.webhooks [DEBUG ] Generating a self-signed certificate for HTTPS.
[2022-04-28 10:20:26,662] kopf._kits.webhooks [DEBUG ] Listening for webhooks at https://*:42677
[2022-04-28 10:20:26,662] kopf._kits.webhooks [DEBUG ] Accessing the webhooks at https://localhost:42677
[2022-04-28 10:20:26,662] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:20:26,665] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
2. In another terminal, download istioctl and install Istio into the same cluster:

curl -L https://istio.io/downloadIstio | sh -
export PATH=$PWD/bin:$PATH
istioctl install -y
Note the extra logs as Istio installs:
[2022-04-28 10:26:09,439] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,447] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,494] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,502] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,509] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,514] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,520] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,524] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,541] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,549] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,555] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,558] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,562] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,565] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,598] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,602] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,605] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,608] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,619] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,623] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
[2022-04-28 10:26:09,637] kopf._core.engines.a [INFO ] Reconfiguring the mutating webhook something.something.
[2022-04-28 10:26:09,642] kopf._core.engines.a [INFO ] Reconfiguring the validating webhook something.something.
I get the issue on k8s v1.21 in both EKS and Rancher Desktop. The issue also occurs if I create a server tunnel with my own certs.
I've got similar behavior in a GKE Standard cluster and managed to trace the source of my "Reconfiguring" messages appearing every 30 seconds (not 100% sure):
This is the place where the admission server is set up once: https://github.com/nolar/kopf/blob/ab53ace82e62a6fa709bf5a580007eac1273ac34/kopf/_core/engines/admission.py#L272
And it then waits for changes here: https://github.com/nolar/kopf/blob/ab53ace82e62a6fa709bf5a580007eac1273ac34/kopf/_core/engines/admission.py#L363-L364
I looked at what else might call `container.set()` to disrupt this `wait()`:
https://github.com/nolar/kopf/blob/ab53ace82e62a6fa709bf5a580007eac1273ac34/kopf/_cogs/aiokits/aiovalues.py#L38-L45
And for me it was this `notify_all()`:
https://github.com/nolar/kopf/blob/ab53ace82e62a6fa709bf5a580007eac1273ac34/kopf/_core/reactor/observation.py#L184-L187
due to this: https://github.com/nolar/kopf/blob/ab53ace82e62a6fa709bf5a580007eac1273ac34/kopf/_core/reactor/running.py#L282-L285
The events that trigger the insights are modifications of Google CRDs with names matching *.gke.io. I saw at least three different names:
updateinfos.nodemanagement.gke.io
networkloggings.networking.gke.io
redirectservices.networking.gke.io
`kubectl get crd -w` confirmed the changes are happening.
I don't have the slightest idea why they are modified every 20 to 40 seconds or what the diffs are.
P.S. I got the diff, and the only field that's changing is `resourceVersion`... (at least for `networkloggings.networking.gke.io`).
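To make the mechanics above concrete, here is a minimal self-contained sketch (my own simplification, not Kopf's actual code) of the pattern: a value container guarded by an asyncio.Condition whose set() notifies all waiters, so any CRD event, related or not, wakes the loop that re-applies the webhook configuration.

```python
import asyncio

# A simplified stand-in (my own sketch, not Kopf's code) for the pattern in
# kopf._cogs.aiokits.aiovalues: a value guarded by an asyncio.Condition where
# every set() notifies all waiters, whether or not the new value is relevant.
class Container:
    def __init__(self):
        self._value = None
        self._condition = asyncio.Condition()

    async def set(self, value):
        async with self._condition:
            self._value = value
            self._condition.notify_all()  # wakes every waiter, related or not

    async def wait_for_change(self):
        async with self._condition:
            await self._condition.wait()
            return self._value

async def admission_configurer(container):
    # Mimics the loop that re-applies the webhook configs after every wake-up,
    # regardless of whether anything relevant to admission actually changed.
    for _ in range(3):
        value = await container.wait_for_change()
        print(f"Reconfiguring the webhooks (woken by {value})")

async def main():
    container = Container()
    task = asyncio.create_task(admission_configurer(container))
    await asyncio.sleep(0.01)  # let the configurer start waiting
    for crd in ("updateinfos.nodemanagement.gke.io",
                "networkloggings.networking.gke.io",
                "redirectservices.networking.gke.io"):
        await container.set(crd)   # unrelated CRD events still notify
        await asyncio.sleep(0.01)  # give the waiter time to wake and re-wait
    await task

asyncio.run(main())
```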
Long story short
Activity in the cluster triggers the webhook to be reconfigured when it does not seem necessary. We observe many such entries in the logs, even when the operator is not actively working on anything. We are trying to find the root cause.
Kopf version
1.35.3
Kubernetes version
1.21
Python version
3.9
Code
Logs
Additional information
The initial startup configures everything as expected, and the webhooks work correctly other than the excessive reconfiguring.