Open miskr-instructure opened 6 months ago
@miskr-instructure Sorry that you're running into this. It's definitely a potential risk if you start having a huge number of DopplerSecret
resources created. You can actually adjust this resync interval for each DopplerSecret
, which may help you here. This wasn't documented until today because we typically don't recommend that people go tweaking this unless they're running into the situation you are now. The field you want to adjust is called resyncSeconds
and you use it like this:
apiVersion: secrets.doppler.com/v1alpha1
kind: DopplerSecret
metadata:
name: dopplersecret-test
namespace: doppler-operator-system
spec:
tokenSecret:
name: doppler-token-secret
managedSecret:
name: doppler-test-secret
namespace: default
resyncSeconds: 120
Regarding the issue more broadly, we do have a SSE endpoint that the --watch
flag for our CLI uses and we've discussed potentially moving the operator over to this, but there are some architectural implications which have prevented it thus far. Longer term, that's definitely the direction we'd like to go though. Short term, adjusting resyncSeconds
should help!
Hi, 2 questions:
resyncSeconds
will increase the time it takes for (the very occasional) changes to be propagated, right?Short term, I think it'd be cool if the operator had an optional Service endpoint that was capable of receiving doppler's own POST webhooks (and marking appropriate doppler secrets for resync) - the ingress and network infra setup to make it reachable from doppler doesn't need to be included in the charts/manifests, that is easy to handle internally (and the webhooks are already authenticated with HMAC I believe).
We have been swapping out an in-house k8s reloading/secret handling solution to this official operator solution. Previously we used Doppler Webhooks for reloading, and an init script at container startup to load the current secrets.
What we have noticed since migrating is that we gets lots of these errors in the pod logs:
We have many DopplerSecret custom resources (but many of them reference the same Doppler Config actually). Despite there being many, they rarely ever change (on the frequency level of 1-2 changes per week), so we should not exceed the API rate limits.
It doesn't make any sense to me that the operator needs to HTTP-GET all secrets every N seconds from the Doppler API. It should be able to use a functionality similar to the Webhooks to do push-based reconciliation instead of polling Doppler. Doing so would drastically reduce the API load on Doppler.
I would propose one of two solutions:
ETag
andIf-None-Match
headers to decrease load on Doppler API (looks like Etag is already implemented) but increase the API rate limit specifically for HTTP-304 responses since they should be less expensive for Doppler than HTTP-200 reponses.