Exemplar Reservoir Offer should be allowed to only accept dropped attributes

MrAlias commented 1 year ago

Currently the exemplar reservoir Offer method is defined as follows in the specification:

The "offer" method SHOULD accept measurements, including:

[...]

The complete set of Attributes of the measurement.

[...]

However, the reservoirs are only expected to produce exemplars that hold the attributes that have been dropped from a measurement:

Exemplars MUST retain any attributes available in the measurement that are not preserved by aggregation or view configuration. Specifically, at a minimum, joining together attributes on an Exemplar with those available on its associated metric data point should result in the full set of attributes from the original sample measurement.

Requiring the Offer method receive all attributes when only the dropped will be preserved is inefficient if the implementation already knows the dropped attributes at the time Offer is called. It means the exemplar implementation will need to independently determine the dropped attributes.

Background

The Go SDK filters measured attributes at the time they are recorded. This ensures that attribute filters can be utilized to restrict attributes for instrumentation with cardinality explosion and have it safely prevent run-away memory use.

In the process of filtering these measured attributes the calculation of the dropped attributes is trivial and also known. Based on the current specification, however, the Go implementation cannot simply provide these dropped attributes to the exemplar reservoir Offer method.

Proposal

Update the specification to state that the attributes passed to Offer need to be either the dropped attributes or the complete set based on the implementation (but not both).

What about filtering with the kept attributes?

If the reservoir does not receive the complete set of attributes that are being recorded during a measurement it cannot filter based on these attributes. Therefore, this would reduce the functionality of the reservoir, right?

There are two points to be made here:

The exemplar filter would continue receive the complete set of attributes and could be used to filter on these attributes.
The reservoir could be made aware of the attributes that will be retained upon creation.

The second point is possible because all exemplar reservoir act as a reservoir for a distinct set of attributes. Given this one-to-one relationship when an exemplar reservoir is create it could be provided the set of attributes that will be retained and for each measurement it will be provided the set of attributes dropped. Therefore, it will have have access to the complete set of attributes.

If the functionality to filter on the complete set of attributes needs to be retained by the reservoir, an implementation can still provide this while still optimizing including the measurement process optimization of only receiving the dropped attributes.

jack-berg commented 1 year ago

Requiring the Offer method receive all attributes when only the dropped will be preserved is inefficient if the implementation already knows the dropped attributes at the time Offer is called.

I was actually looking closely at the java implementation of exemplars just a few weeks ago, and starting a path down refactoring to do exactly what you're describing. However, after looking closely at the implementation, I decided that for the java implementation, the existing behavior was in fact optimal:

When a measurement is recorded, we receive a set of attributes
Prior to aggregation, we filter this set down to the final set as required by the view.
At this point, we have allocated memory for the full original set, and the new filtered set.
We pass the the full original set to the exemplar filter, and store it with the the exemplar in the reservoir.
When a collection occurs, we compute the difference between the set associated with each stream and the original recorded with the exemplar. Notably, this computation / allocation is done off the hot path.

For java, computing the diff of the full original set and the filtered set requires additional computation / allocation for each measurement passed to the exemplar filter, and its done on the hot path. In contrast, by doing it at collect time does the computation / allocation once for each recorded exemplar (as opposed to once for each measurement), and does it off the hot path. Not sure how the go implementation of attributes works, but if you treat them as an immutable set of key value pairs you might see the same thing we see in the java implementation.

Attribute implementations do differ though so I would support this change to offer implementations flexibility. Perhaps we can pass both the original attributes and the diff set, just in case there is any reason why a filter might want to base some behavior off the full original set.

MrAlias commented 1 year ago

if you treat them as an immutable set of key value pair

Yeah, this is the difference. The passed attributes are ordered in place with the same backing memory, the dropped attributes are placed at the end. Because of this, we are able to just pass reference to those dropped attributes without any computational or memory overhead.

Perhaps we can pass both the original attributes and the diff set, just in case there is any reason why a filter might want to base some behavior off the full original set.

Given Go slices are just references to the same underlying data this should work for us. It might be a bit of a messy API though.

jsuereth commented 1 year ago

This is an important bit of discussion to have here. Here's a few points rattling in my brain.

OTLP does show only dropped attributes. This is kind of an optimisatoin of Exemplars in the protocol.
offer is responsible for sampling measurements. As such, it needs to make a decision which exemplars to keep.
CURRENT sampling techniques involve randomness or bucket boundaries for histograms.

With only those facts, I can understand wanting to limit things, however:

We could provide a reservoir sampler that maximizes entropy by looking at diverse attribute sets.
We could provide a sampler that ignores measurements with a given attribute.

I'm not positive we've "finished" the exemplar specification enough to lock down attributes at this point. Specifically I planned to open the reservoir interface to allow users to provide their own implementation/sampling and I think we'll want all attributes available on offer for that.

MrAlias commented 1 year ago

We could provide a reservoir sampler that maximizes entropy by looking at diverse attribute sets.

We could provide a sampler that ignores measurements with a given attribute.

These are good examples of use where the complete set it needed, thanks for including them!

I still think the exemplar reservoir can be structured to handle both of these cases and only receive the dropped attributes on Offer. As mentioned, since the exemplar reservoir is used per-aggregation per-attribute-set, this means the implementation, and any user defined, exemplar creation function needs to take the following form:

type ExemplarFunc func() ExemplarReservoir

That way when the SDK determines a new aggreation/attribute-set pair it can create a reservoir for that group of measurements.

This also means that given these exemplar reservoirs are created when a new aggregation/attribute-set combination is determined the retained attributes will already be known (assuming filtering on input). This means that the implementation, and any user defined, exemplar creation function could just as easily take this form:

type ExemplarFunc func(retained attribute.KeyValue) ExemplarReservoir

Whenever a measurement is made for the aggregation the retained attributes will be computed, and as mentioned above the dropped can be a natural byproduct. The aggregation will use the retained attributes to determine what value it needs to aggregate with its state and it can simply Offer the reservoir of that state the dropped attributes:

reservoir.Offer(ctx, value, droppedAttr)

That reservoir will know the complete set of attributes at this point because it knows the non-dropped attributes that were provided on creation. This means it can then handle the more complex filtering operations mentioned.

jsuereth commented 1 year ago

@MrAlias That looks promising too. I think the question now is can we craft the specification so you don't need to include all attributes in Offer as long as the reservoir can reconstitute them somehow.

This would allow the existing Java implementation to remain as-is while allowing Go to optimise as it desires and keep the future use cases possible.

open-telemetry / opentelemetry-specification