Open kvaps opened 4 years ago
You could effectively get this now by turning off dependent watches: https://github.com/operator-framework/operator-sdk/blob/master/doc/ansible/dev/dependent_watches.md
WDYT?
Hi @shawn-hurley, thanks for the answer, but I'm not sure this option can help in this case.
E.g. if I have 100 resources, ansible-operator will process all of them during a restart of the operator itself. I want to prevent that somehow, because I'm planning to have a lot of similar resources and want to trigger them only on change.
Working prototype:

```yaml
- hosts: localhost
  gather_facts: no
  vars:
    # The watched custom resource is injected as top-level extra vars;
    # pick out the one object that has an apiVersion field.
    res: "{{ vars.values() | selectattr('apiVersion', 'defined') | first }}"
    metadata: "{{ res.metadata }}"
    api_version: "{{ res.apiVersion }}"
    kind: "{{ res.kind }}"
    status: "{{ res.status | default({}) }}"
  tasks:
    # Skip processing if this generation was already reconciled.
    - meta: end_play
      when: "status.generation is defined and status.generation|int == metadata.generation|int"
    - debug:
        msg: do_something
    # Record the generation that was just handled.
    - k8s_status:
        api_version: "{{ api_version }}"
        kind: "{{ kind }}"
        name: "{{ metadata.name }}"
        status:
          generation: "{{ metadata.generation }}"
```
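For context, a minimal watches.yaml entry wiring a CR kind to such a playbook might look like the sketch below; the group/kind and playbook path are placeholders, not values from this issue:

```yaml
# Hypothetical watches.yaml entry (group, kind, and path are illustrative).
- version: v1alpha1
  group: example.com
  kind: MyApp
  playbook: /opt/ansible/playbook.yml
  reconcilePeriod: 0s  # disable periodic reconciles; rely on change events only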
I think this is likely not something we'll add, because there's no guarantee that the last reconciliation finished/succeeded or that cluster state has remained static since the operator restarted, so we want to ensure we reconcile on start. This may be a good example of where a hybrid operator could be the right pattern, since you could just override the reconcile logic to take resourceVersion into account and then reuse everything else.
That being said, the overhead is definitely a problem. Bumping up the number of parallel jobs (as raised in operator-framework/operator-sdk#1678) and disabling fact gathering (as raised in operator-framework/operator-sdk#1677) should help bring that down, but if performance is still unsatisfactory we should definitely spend some time profiling and improving it.
Could we prioritize this somehow, so that dropping redundant events takes precedence over the initial and periodic reconcile tasks?
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
This issue might be solved by implementing watch bookmarks support: https://github.com/operator-framework/operator-sdk/issues/1939
/remove-lifecycle stale
Watch bookmarks do look promising, if we can do this in a supported way then I'm all for it
Closing in favor of operator-framework/operator-sdk#1939, which solves the issue described here in another manner.
I think this should be re-opened.
Bookmarks are only useful when re-establishing a watch. That would not be the case when the operator container is first starting up. On startup, it always establishes new watches, because nothing is persisting a collection's resource version.
https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
Further, bookmarks are only useful when you are watching with a label selector, which I don't think the SDK supports yet anyway.
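To illustrate the point above, here is a toy simulation (not the real Kubernetes client API; class and method names are invented) of why a bookmark's resourceVersion only helps when a watch is resumed: on a cold start nothing was persisted, so the operator must relist everything.

```python
# Toy model of watch resumption. In a real operator, a full relist means
# a reconcile is triggered for every existing resource.

class WatchClient:
    def __init__(self, persisted_rv=None):
        # ansible-operator persists nothing across restarts, so this is
        # always None on startup.
        self.last_rv = persisted_rv

    def handle_event(self, event):
        # BOOKMARK events carry the latest resourceVersion seen.
        if event["type"] == "BOOKMARK":
            self.last_rv = event["object"]["metadata"]["resourceVersion"]

    def start_watch(self):
        if self.last_rv is None:
            # No known resourceVersion: must LIST first (full relist).
            return "LIST+WATCH"
        # Known resourceVersion: resume and receive only new events.
        return f"WATCH from {self.last_rv}"

client = WatchClient()
print(client.start_watch())  # → LIST+WATCH (cold start: everything reconciled)
client.handle_event({"type": "BOOKMARK",
                     "object": {"metadata": {"resourceVersion": "42"}}})
print(client.start_watch())  # → WATCH from 42 (resumed: only new events)
```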
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten /remove-lifecycle stale
/lifecycle frozen
## Feature Request

### Problem
Every task execution takes about 3s, even for a simple playbook.
E.g. if you have 100 custom resources and restart the operator, it will only be able to process new events after about 5 minutes, once all 100 existing resources have been reprocessed.
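As a quick sanity check on those numbers (3 s per reconcile and 100 resources are the figures from this report):

```python
# Back-of-envelope estimate of the startup backlog described above.
seconds_per_reconcile = 3    # observed per-task overhead from the report
existing_resources = 100     # CRs already present when the operator restarts

backlog_minutes = seconds_per_reconcile * existing_resources / 60
print(backlog_minutes)  # → 5.0 minutes before new events get a turn
```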
### Solution

Allow specifying a `preserveStatus: true` option, on a par with `reconcilePeriod: 0`, in the watches.yaml file.

Processing: if `preserveStatus` is set to `true`, then save `metadata.generation` to `status.generation` for each resource.

Initialization: during start, check each resource: if `metadata.generation` equals `status.generation`, skip processing it.
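The proposed check can be sketched in a few lines. This is a hypothetical helper, not part of operator-sdk; the function name and dict layout are illustrative:

```python
def should_reconcile(resource: dict) -> bool:
    """Hypothetical preserveStatus check: reconcile only when the spec
    generation differs from the generation recorded in status."""
    observed = resource.get("status", {}).get("generation")
    current = resource["metadata"]["generation"]
    # Never reconciled yet, or the spec changed since the last run.
    return observed is None or int(observed) != int(current)

# On operator startup, unchanged resources would be skipped:
cr = {"metadata": {"generation": 2}, "status": {"generation": 2}}
print(should_reconcile(cr))  # → False: generations match, nothing to do
```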