Open swiatekm opened 7 months ago
some thoughts:
apiVersion: opentelemetry.io/v1alpha2
This will let us take advantage of the work for #901 (embedding a common fields struct)I think this should only be in v2 i.e. apiVersion: opentelemetry.io/v1alpha2 This will let us take advantage of the work for https://github.com/open-telemetry/opentelemetry-operator/issues/901 (embedding a common fields struct) Having this in v2 only would let us begin with some fields in the TA spec deprecated or removed
Agreed, although I'm not sure I want it to tie enabling it to v2 release. We can always release v2 collector and enable the TA CRD in a later version.
I think there will need to be a reference from a target allocator to a collector. We should improve this not only for the TA but also for instrumentation by introducing a new type or field called "collectorRef" or something like that which the operator can validate and automatically set up the networking For the ownership of the CRD, I think we could set a controller reference from the collector to properly cascade a delete, what would happen if someone deletes their TA? I don't think the relation needs to be bidirectional, it should be enough to infer the relationship from the TA to the collector by doing the validation i mentioned above. i.e. we get a request to reconcile a TA CRD at which point we validate the collector ref and then update the collector manifests (read: config) appropriately
Wouldn't it be simpler to have an optional targetAllocatorRef
field on the Collector CR, and set the Collector CR as the owner of the TargetAllocator CR? Then you get the cascading delete and can know which Collector owns the TA from ownerReferences
. Then we could update Collector config in Collector reconciliation, which sounds much simpler.
From a usability POV, I think it's important to be able to tell just from the Collector CR whether the Collector as a TA associated.
for the rollout plan, i blieve we would need to enable webhooks and reconciliation before we start generating the TA CR from the subresource, unless we wanted to support both the subresource and the CRD. I think we should only support one at a time so I think the rollout could go like this:
I meant that we can generate the TA CR, but not actually apply it, but instead add it to the context and use it to generate TA manifests. This way our manifest generation would already be using the same logic as for "real" TA CRs, and the only change would be to actually apply the CR once we're ready to enable it. Does that make sense? I'll put up a draft PR showing what I have in mind later.
This way, it'd be easy to support the CR and the subresource simultaneously, which we should do in my opinion. It might even make sense to support the subresource indefinitely for users who just need a simple configuration and don't want the complexity of dealing with multiple CRs.
Wouldn't it be simpler to have an optional targetAllocatorRef field on the Collector CR
+1
I've thought about this a bit more, and I think I've arrived at a solution to the two-way binding problem. Our requirements are:
I propose that we use the standard label matching mechanism for this, and also provide a shortcut via an annotation.
To directly set a collector selector:
apiVersion: opentelemetry.io/v1alpha2
kind: TargetAllocator
metadata:
name: example
spec:
collectorSelector:
matchLabels:
app.kubernetes.io/instance: example-collector
...
Or alternatively, via an annotation:
apiVersion: opentelemetry.io/v1alpha2
kind: TargetAllocator
metadata:
name: example
annotations:
targetallocator.opentelemetry.io/collector: "example-collector"
Then the resolution flow for the Collector CR with a TargetAllocator reference becomes:
The Collector CR depends on the TargetAllocator CR in this flow, but not vice versa.
Confirming I support this idea! After our discussion in the SIG, I think this is a good path forward. For now, I think we should just begin with setting the collector setter and open a separate issue for the annotation to be used as a convenience after some mileage with the new types.
I'm a bit stuck on the shortcut mechanism for connecting OpenTelemetryCollector and TargetAllocator CRs. The approach from https://github.com/open-telemetry/opentelemetry-operator/issues/2422#issuecomment-1862635932 works, but I don't like its ergonomics.
The requirements for this connection are the following:
So in principle, the collector needs to know which target allocator it's connected to, and the target allocator also needs to know which collectors it's assigning targets to.
If we use an annotation or label for this, we need to decide what the source of truth is. What happens if a user sets an annotation, and also sets their own collector selector in the TargetAllocator? Which setting wins, or should we just error and refuse to reconcile?
The other question is where the annotation or label should be placed. On both CRs would be most explicit, but also a bit redundant and probably a bit more annoying for users. Putting it on only one of the CRs is sufficient, especially if it's a label, but then the other CR has its behaviour altered without any change to the CR itself. Maybe this is fine? This is how EndpointSlices are assigned to Services, for example.
Discussed the above during the SIG meeting. The conclusions were:
I realized that the above results in some awkwardness with the embedded TargetAllocator, as we'd need to add a label to the Collector CR while reconciling it, despite the Collector CR owning the TargetAllocator CR. But I think we can just use owner references instead of the label in that case.
Component(s)
target allocator
Describe the issue you're reporting
The Target Allocator subresource of the OpenTelemetryCollector CR has been growing in size over time, in particular due to various infrastructure-related attributes that users rightfully want to set for it. It’s time to split it into its own CRD, improving separation of concerns and reducing the size of the Collector CRD.
Design
The TargetAllocator CRD should contain all the fields the Target Allocator subresource currently does, to begin with. We will later continue adding fields to it, while possibly deprecating some fields from the subresource. The subresource should always only contain a subset of the CRD fields. Internally, the operator will create a CRD based on the subresource, which is how we’ll maintain backwards compatibility.
Relation to the Collector CR
A Collector can be configured to use a given target allocator by setting a label on the collector CR:
Collectors and Target Allocators are independent, neither owns the other, and each can in principle run on their own. By establishing a connection between them, certain aspects of their configuration will change under the hood.
Rollout
We should be able to roll out most of the necessary changes without any user-facing change: