+1
I've run into a similar issue with a controller I'm writing. I wound up starting a goroutine when the controller gets added, to watch/act on certain objects that are created and labeled by another controller outside of mine.
we'd have to specify this at the manager level (since caches are shared), but it should be possible to pass restricting items.
we could also let users manually specify auxiliary caches
if anyone has ideas for a design doc, I'd love to see them.
/kind feature /priority important-soon
this has been coming up a lot lately, and seems like a good thing to tackle in the list of related issues.
i.e. one sketch:
Have a list-options watch predicate, smartly detect the presence of that (in some generic way), and pass it down to GetInformer. If we already have a global informer, just use that. If we don't have a global informer, create a new filtered informer. We delay creation of informers to figure out if one person needs a global informer, and just use that, to allow splitting and combining seamlessly. The question is how to deal with the cached client:
That last one might solve multiple problems in one fell swoop, but I'd need to see a design sketch to make sure that it's not too arcane.
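For illustration only, a minimal, hypothetical sketch of what such a "list-options watch predicate" could look like (none of these names exist in controller-runtime; the point is just that the option carries selectors the cache could inspect before creating an informer):

import (
	"k8s.io/apimachinery/pkg/labels"

	"sigs.k8s.io/controller-runtime/pkg/event"
)

// ListOptionsPredicate is a hypothetical option that both filters events and
// carries the list/watch restrictions. The watch machinery could type-assert
// for it and hand Namespace/Selector down to GetInformer so the informer is
// created with server-side filtering, instead of only filtering client-side.
type ListOptionsPredicate struct {
	Namespace string
	Selector  labels.Selector
}

// Create does client-side filtering as a fallback, exactly like a normal
// predicate; Delete/Update/Generic would look the same and are omitted here.
func (p ListOptionsPredicate) Create(e event.CreateEvent) bool {
	if p.Namespace != "" && e.Meta.GetNamespace() != p.Namespace {
		return false
	}
	return p.Selector == nil || p.Selector.Matches(labels.Set(e.Meta.GetLabels()))
}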
I will follow this to understand how this is properly implemented and any questions from a user or testing perspective please ask and I'll do my best to assist. In the interim I've implemented this for my own use case.
kvp := &keyValuePair{
	key:   "app",
	value: "test",
}
filter := newFilter(keyValueFilter(kvp))
err = c.Watch(&source.Kind{Type: &v1.Deployment{}}, &handler.EnqueueRequestForObject{}, filter)
if err != nil {
	return err
}

// keyValuePair holds the label key/value the predicate filters on.
type keyValuePair struct {
	key   string
	value string
}

type filterFuncs struct {
	// Create returns true if the Create event should be processed
	CreateFunc func(event.CreateEvent) bool
	// Delete returns true if the Delete event should be processed
	DeleteFunc func(event.DeleteEvent) bool
	// Update returns true if the Update event should be processed
	UpdateFunc func(event.UpdateEvent) bool
	// Generic returns true if the Generic event should be processed
	GenericFunc func(event.GenericEvent) bool
	Filter      keyValuePair
}

// keyValueFilter is a functional option that sets the label key/value to filter on.
func keyValueFilter(kvp *keyValuePair) func(*filterFuncs) {
	return func(f *filterFuncs) {
		f.Filter.key = kvp.key
		f.Filter.value = kvp.value
	}
}

func newFilter(opts ...func(*filterFuncs)) *filterFuncs {
	f := &filterFuncs{}
	for _, opt := range opts {
		opt(f)
	}
	return f
}

// Create only lets events through for objects carrying the configured label.
func (p filterFuncs) Create(e event.CreateEvent) bool {
	labels := e.Meta.GetLabels()
	if val, ok := labels[p.Filter.key]; ok {
		if val == p.Filter.value {
			return true
		}
	}
	return false
}
That's just like a predicate, though -- it doesn't actually get you most of the benefits of the filtering that people are asking for here, since it doesn't ask for server-side filtering.
I know... As I said above the code block:
I will follow this to understand how this is properly implemented and any questions from a user or testing perspective please ask and I'll do my best to assist. In the interim I've implemented this for my own use case.
Ack, just wanted to make sure we were on the same page :-)
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
/reopen /remove-lifecycle rotten
@alvaroaleman: Reopened this issue.
bumping this feature request:
allow users to turn off auto-caching, make them explicitly get filtered caching clients if they want (doable, more work for users). This potentially lets us allow users to specify that they don't want the client to ever cache a particular object (could encourage wrong patterns).
I think this is a viable option, but I wonder if we could have a more straightforward pattern: a user specifies a watch with a label selector, and if they do, we use the label-filtered cache for that specific GVK. If another watch is later created without that label filter, we error out and don't establish the watch. The most significant caveat I see is that the client will only use the label-filtered cache. We would need to make that clear in the documentation, unless there is some other mechanism to alert the user to this.
the main issue with that approach is that it makes certain controllers cohabiting suddenly break things, which is less than optimal. Still, I'm not sure that there's a good solution here, so we may have to go with the least sub-optimal solution
What if we made the restriction explicit on the manager or during cache creation? Would this feel like the namespace restriction that we do now?
yeah, that could work, potentially, if it solves your usecase
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
I have a similar need for being able to specify namespace for watches (similar to https://github.com/kubernetes-sigs/controller-runtime/issues/124#issuecomment-417438810). I've been playing around with that in #692, but it could perhaps be made more generic than what is there at present. My main motivation for watching per namespace is to ensure I can grant controllers only the minimal permissions they strictly need (i.e. filtering by namespace client-side is not sufficient)
@DirectXMan12 re https://github.com/kubernetes-sigs/controller-runtime/issues/244#issuecomment-451336202: Would it be another option to make the cache Reader smartly detect options that could be used to filter informers (namespace, labelSelector, fieldSelector), check if we have a cache for those already, if yes use it, if not fall back to unfiltered cache (which means creating it if it doesn't exist yet)?
How about an API something like this:
client.Watch(source.NewKind(&v1.Pod{}, client.InNamespace("namespace1")), ...)
client.Watch(source.NewKind(&v1.Pod{}, client.InNamespace("namespace2"), client.MatchingLabels(labelMap)), ...)
The options are passed down to the informer map which will create additional caches if they are not covered with any existing informer.
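As a rough sketch of that bookkeeping (all names below are hypothetical, not existing controller-runtime API): the informer map could key informers by GVK plus the normalized filter, reusing an informer whenever an unfiltered one already covers the request:

import (
	"sync"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/runtime/schema"
	toolscache "k8s.io/client-go/tools/cache"
)

// filteredInformerKey is a hypothetical key for the informer map: one informer
// per (GVK, namespace, selector) combination. Empty namespace/selector means
// "unfiltered".
type filteredInformerKey struct {
	gvk       schema.GroupVersionKind
	namespace string
	selector  string // canonical string form of the label selector
}

type informerMap struct {
	mu        sync.Mutex
	informers map[filteredInformerKey]toolscache.SharedIndexInformer

	// newFilteredInformer is a hypothetical constructor that starts a list/watch
	// restricted to the given namespace and label selector.
	newFilteredInformer func(schema.GroupVersionKind, string, labels.Selector) (toolscache.SharedIndexInformer, error)
}

// informerFor returns an existing informer that covers the request, or creates
// a new filtered one. An unfiltered informer for the GVK always wins, since it
// is a superset of any filtered view.
func (m *informerMap) informerFor(gvk schema.GroupVersionKind, ns string, sel labels.Selector) (toolscache.SharedIndexInformer, error) {
	m.mu.Lock()
	defer m.mu.Unlock()

	// Prefer a global (unfiltered) informer if one already exists.
	if inf, ok := m.informers[filteredInformerKey{gvk: gvk}]; ok {
		return inf, nil
	}

	selStr := ""
	if sel != nil {
		selStr = sel.String()
	}
	key := filteredInformerKey{gvk: gvk, namespace: ns, selector: selStr}
	if inf, ok := m.informers[key]; ok {
		return inf, nil
	}

	inf, err := m.newFilteredInformer(gvk, ns, sel)
	if err != nil {
		return nil, err
	}
	m.informers[key] = inf
	return inf, nil
}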
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/lifecycle frozen
I'll take a stab at a design sketch that I'd love to get more feedback on. I have split the requirements in two (I use the term 'filter' to mean watch/list filters like namespace, labels, field selectors, etc.):
For the first requirement, there is not really any API change needed IMO. However, for the second requirement, there are a few corner cases that must be taken into account.
Let's assume watches A and B with different filters are created (in that order) for the same GVK. Then, the following can happen:
Resolving the conflicts by returning an error may be sufficient for most cases. If not, we need a good way to signal error conditions to users.
(IMO, the best user experience would be if it was possible for multiple informers to share the same cache. That way, as long as duplicate cache events were handled properly (i.e. using metadata.resourceVersion?), one could use the same cache client.)
Create a mapping of GVK -> Filters when creating the cache/manager. A new type DelegateCache could take the following mapping as a parameter:
NewDelegateCache(defaultFilters []Filter, gvkToFilter map[schema.GroupVersionKind][]Filter) Cache
The Filter type would basically be something like ListOpts and give the ability to restrict things like namespace, labels, etc. for different types.
With this design, the cache filters are the 'hard' limits on what can be filtered on, and it's not possible to use conflicting filters for the same GVK.
This would also not require an API change. When creating watches, the pre-configured filters will be used depending on type.
More specific filtering can be done using predicates as today.
It does not solve the second requirement of using different filters for the same GVK.
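For illustration only, a sketch of what the proposed Filter type and the GVK mapping might look like (Filter, DelegateCache, and NewDelegateCache are hypothetical names from this proposal, not existing controller-runtime API):

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// Filter restricts what a cache lists/watches for a given type; it roughly
// mirrors list options.
type Filter struct {
	Namespace     string
	LabelSelector labels.Selector
	FieldSelector fields.Selector
}

// Example wiring: Pods are only cached in "namespace1" with app=test; any GVK
// without an entry would fall back to the default filters.
var gvkToFilter = map[schema.GroupVersionKind][]Filter{
	corev1.SchemeGroupVersion.WithKind("Pod"): {
		{
			Namespace:     "namespace1",
			LabelSelector: labels.SelectorFromSet(labels.Set{"app": "test"}),
		},
	},
}

// c := NewDelegateCache(nil /* defaultFilters */, gvkToFilter)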
I think a modification to the API is required in order to allow for different filters for the same GVK. One could extend the current controller API with the ability to supply filters, something like:
controller.Watch(source.NewKind(&v1.Pod{}, filter.InNamespace("namespace1"), filter.MatchingLabels(labelMap)), &handler.EnqueueRequestForObject{})
Moreover, the cache/manager may be configured with default filters (i.e. replace 'namespace' with a list of filters).
The cache implementation can examine the filters provided in GetInformer and make a decision on what to do according to the different corner cases. Alternatives for conflict resolution if existing informers cannot be used:
m, err := manager.New(cfg, manager.Options{
	DefaultFilters: []Filter{},
})
...
c.Watch(source.NewKind(&v1.Pod{}, source.InNamespace("namespace1")), ...)
c.Watch(source.NewKind(&v1.Pod{}, source.InNamespace("namespace1"), source.MatchingLabels(labelMap)), ...)
The first Watch will create a new cache with the filters. The second watch could use the same informer and create predicates to do client-side filtering.
...
c.Watch(source.NewKind(&v1.Pod{}, source.InNamespace("namespace1"), source.MatchingLabels(labelMap)), ...)
c.Watch(source.NewKind(&v1.Pod{}, source.InNamespace("namespace1")), ...)
The first Watch will create a new cache with the filters. Depending on the conflict resolution, either return an error or return a handle to the new cache to the client.
c.Watch(source.NewKind(&v1.Pod{}), ...)
c.Watch(source.NewKind(&v1.Secret{}, source.InNamespace("namespace1")), ...)
The first watch will just use the defaults, and the second will create a new cache and return a handle. However, since it's a different type, one could have the cache create a delegated mapping.
@shawn-hurley @alvaroaleman @DirectXMan12 thoughts?
I like the idea of on-demand delegation, but I am worried that the implementation is going to get really messy handling the use cases.
Watching the same GVK with conflicting watches due to order of creation
So if the first call to watch is more restrictive than the second, we will error out?
What does it look like if we try and handle it?
So if the first call to watch is more restrictive than the second, we will error out?
What does it look like if we try and handle it?
The alternatives I can think of are:
@lulf The tricky part here is not so much how exactly we change the cache, it's mostly about finding a good solution for the cache-reading client to determine which cache to use, see https://github.com/kubernetes-sigs/controller-runtime/issues/244#issuecomment-451336202
Apart from the options Solly outlined, I also suggested this:
Would it be another option to make the cache Reader smartly detect options that could be used to filter informers (namespace, labelSelector, fieldSelector), check if we have a cache for those already, if yes use it, if not fall back to unfiltered cache (which means creating it if it doesn't exist yet)?
I personally would like to see a gdoc that outlines the ideas we have on how the cache reading client knows which cache to use and ignores how we change the cache itself or the watch calls, as that is the easy part. We can then discuss the options in detail on the gdoc.
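To make the "which cache does the Reader use" question concrete, here is a minimal, hypothetical sketch of a delegating List that inspects the list options and falls back to an unfiltered cache; the delegatingReader type and its cacheFor helper are made up for illustration:

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// delegatingReader is a hypothetical cache-backed Reader that owns several
// filtered caches plus one unfiltered fallback.
type delegatingReader struct {
	unfiltered cache.Cache
	// cacheFor returns a cache guaranteed to contain everything matching the
	// given namespace/label selector, or false if no such filtered cache exists.
	cacheFor func(namespace, labelSelector string) (cache.Cache, bool)
}

func (r *delegatingReader) List(ctx context.Context, list client.ObjectList, opts ...client.ListOption) error {
	lo := client.ListOptions{}
	lo.ApplyOptions(opts)

	sel := ""
	if lo.LabelSelector != nil {
		sel = lo.LabelSelector.String()
	}
	// Use a matching filtered cache when we have one; otherwise fall back to
	// the unfiltered cache (a real implementation might create it lazily).
	if c, ok := r.cacheFor(lo.Namespace, sel); ok {
		return c.List(ctx, list, opts...)
	}
	return r.unfiltered.List(ctx, list, opts...)
}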
/kind design
@lulf Would you be able to open a PR against controller-runtime with a design document?
My team is also interested in this feature. We have a controller that watches a small set of secrets. However, overall the cluster has many (also larger) secrets which increases our pod requests/limits drastically, because they will all be cached. In our use case, using a separate cache for each filtered watch would absolutely suffice, because the number of secrets we are interested in is small. Therefore, we only have very little additional memory overhead because of duplication. @lulf Is there any ongoing work atm?
@jkbschmid I've not found time or need to work on this since my last proposal. We have a workaround 'delegating cache' that solves a subset of the requirements (i.e. we can at startup time set which GVKs we want to watch globally)[1].
FYI if you haven't seen this already: if you know which namespaces you need to watch up front, you can have a look at the multiNamespaceCache in pkg/cache/multi_namespace_cache.go
[1] https://github.com/EnMasseProject/enmasse/tree/master/pkg/cache
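For reference, wiring the multi-namespace cache mentioned above into a manager looks roughly like this (using cache.MultiNamespacedCacheBuilder, which controller-runtime provided at the time for a fixed, up-front set of namespaces; check your version's API):

import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

func newManager() (manager.Manager, error) {
	// Only objects in these namespaces are watched and cached; everything else
	// is invisible to the cache-backed client.
	namespaces := []string{"team-a", "team-b"}

	return ctrl.NewManager(ctrl.GetConfigOrDie(), manager.Options{
		NewCache: cache.MultiNamespacedCacheBuilder(namespaces),
	})
}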
My team is also interested in this feature. We have a controller that watches a small set of secrets. However, the cluster overall has many (also larger) secrets, which increases our pod requests/limits drastically because they will all be cached (about 100MB per 1k secrets). Is there any plan to fix/implement this in the near future?
Working on this. Will write up a doc and discuss at the first meeting after it is drafted.
/assign
I think a prototype with docs/tests is a good way to kick off the discussion on choosing one solution over the other, and would give me/us a chance to understand limitations of a particular solution. This problem is hard to reason about without some code to back up a solution.
This is more or less what everyone seems to be saying about informer creation (from https://github.com/kubernetes-sigs/controller-runtime/issues/244#issuecomment-451336202):
Have a list-options watch predicate, smartly detect the presence of that (in some generic way), and pass it down to GetInformer. If we already have a global informer, just use that. If we don't have a global informer, create a new filtered informer. We delay creation of informers to figure out if one person needs a global informer, and just use that, to allow splitting and combining seamlessly.
As for the cache Reader solution, I like the idea of smartly detecting which cache to use based on selectors/namespaces supplied to Get()/List(), like @alvaroaleman suggested:
make the cache Reader smartly detect options that could be used to filter informers (namespace, labelSelector, fieldSelector), check if we have a cache for those already, if yes use it, if not fall back to unfiltered cache
If anyone would like to expand/comment on the above two parts of the prototype, or voice any objections, feel free.
I really liked the idea proposed in the comment https://github.com/kubernetes-sigs/controller-runtime/issues/244#issuecomment-572978846, +1.
controller.Watch(source.NewKind(&v1.Pod{}, filter.InNamespace("namespace1"), filter.MatchingLabels(labelMap)), &handler.EnqueueRequestForObject{})
@estroz Is this still progressing? We are keen to see this feature enabled.
@taylormgeorge91 I haven't had as much time as I'd like to work on this, but I did just post a rough prototype (#1178). Feel free to critique.
I also have a use case for watching only the pods running on a node rather than all the pods in the cluster. This would limit the number of Kubernetes resources that need to be cached/watched and processed. My use case is a daemonset watching pods on the current node, hence I am looking for a solution like having a fieldSelector for the resources being watched/cached and reconciled.
I see that the Kubernetes client-go library provides similar functionality with func NewListWatchFromClient(c Getter, resource string, namespace string, fieldSelector fields.Selector) *ListWatch in the k8s.io/client-go/tools/cache package.
It would be very useful to have this feature, which I think is already supported by the underlying client-go library. This would also help reduce the cache size, memory footprint, and CPU requirements when there are a huge number of pods in the cluster.
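For reference, a sketch of that client-go approach for the node-local pod use case. NewListWatchFromClient and fields.OneTermEqualSelector are real client-go/apimachinery functions; the NODE_NAME environment variable is an assumption about how the daemonset pod learns its node name (e.g. via the downward API):

import (
	"os"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	toolscache "k8s.io/client-go/tools/cache"
)

func newNodeLocalPodListWatch() (*toolscache.ListWatch, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}

	// Only list/watch pods scheduled onto this node; the API server does the
	// filtering, so nothing else is transferred or cached.
	nodeName := os.Getenv("NODE_NAME") // assumed to be injected via the downward API
	sel := fields.OneTermEqualSelector("spec.nodeName", nodeName)

	return toolscache.NewListWatchFromClient(
		clientset.CoreV1().RESTClient(), "pods", corev1.NamespaceAll, sel), nil
}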
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
Any updates on this?
I've opened a PR attempting to specify a fieldSelector at manager build time: https://github.com/kubernetes-sigs/controller-runtime/pull/1404.
New PR: #1435
It is being worked into a release currently @invidian
Looking at the tags there is a v0.9.0-beta available if you want to test.
Looking at the tags there is a v0.9.0-beta available if you want to test.
Thanks, will do!
When setting up watches during initialization it's currently not possible to filter by any selectors (which is possible using list watches).
For example, it is not possible to only watch pods with a specific label (e.g. having the label pod-type: my-controller-type). The current behavior results in very broad caching, which might not be desirable for large Kubernetes deployments.
In some scenarios an operator could contain multiple controllers, and they all share caches, so keying caches on GVKs alone might be problematic if they want to watch the same resource type but with different filters.
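For context, a minimal example of the server-side filtering that plain client-go list/watch calls already allow (the label is the example above; clientset construction omitted):

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
)

// watchFilteredPods watches only pods carrying the example label; the API
// server does the filtering, so nothing else needs to be cached client-side.
func watchFilteredPods(ctx context.Context, clientset kubernetes.Interface, namespace string) (watch.Interface, error) {
	return clientset.CoreV1().Pods(namespace).Watch(ctx, metav1.ListOptions{
		LabelSelector: "pod-type=my-controller-type",
	})
}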
When doing List/Get, how would one decide which of the caches to use? It seems that perhaps this needs to be an explicit choice by the operator developers?