Open pierzapin opened 8 months ago
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
The sampling_priority configuration option holds the string name of a log attribute. The processor gets that log attribute's value and uses it as the sampling rate, overriding the value of sampling_percentage. Here's the code for reference.
from_attribute determines which log attribute's value is used as the hashing input; it is unrelated to the sampling rate. The processor gets that attribute's value, computes a hash of it, and then uses the sampling priority (set either by sampling_percentage or by the value of the log attribute named in sampling_priority) to decide whether the log record is sampled.
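To make the two roles concrete, here is a rough sketch of the decision as described in this comment. It is not the processor's Go code: the function name should_sample, the CRC32 stand-in hash, the 10,000-bucket scaling, and the rule that only numeric attribute values override the rate are all assumptions made for illustration.

```python
import zlib


def should_sample(attributes, sampling_percentage, from_attribute, sampling_priority=None):
    """Illustrative only: decide whether a log record is kept.

    attributes          -- dict of the log record's attributes
    sampling_percentage -- default rate, 0..100
    from_attribute      -- NAME of the attribute whose value is hashed
    sampling_priority   -- NAME of the attribute whose value overrides the rate
    """
    # The rate defaults to sampling_percentage and is overridden by the value
    # of the attribute that sampling_priority names, if that attribute is present.
    rate = float(sampling_percentage)
    if sampling_priority is not None:
        value = attributes.get(sampling_priority)
        # Assumption: only numeric values override; anything else keeps the default.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            rate = float(value)

    # The hash input is the value of the attribute that from_attribute names.
    # CRC32 is a stand-in hash chosen for this sketch, not the processor's hash.
    key = str(attributes.get(from_attribute, "")).encode("utf-8")
    bucket = zlib.crc32(key) % 10000  # 0..9999

    # Keep the record when its bucket falls below the rate (scaled to 0..10000).
    return bucket < rate * 100
```

With sampling_percentage set to 0, a record carrying a numeric attribute named by sampling_priority (for example priority = 100) would still be kept in this sketch, while the hash of the from_attribute value only decides which records fall inside a partial rate.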
I agree this could be made clearer in the README. A PR would be welcome!
Should this be closed?
I don't think so, my PR was mostly unrelated to this. We still need to update the README to close this, from my understanding.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers
. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
@kentquirk, would you be able to turn your comment into documentation in the README for the component?
Component(s)
processor/probabilisticsampler
Describe the issue you're reporting
It's not very clear how the sampling_priority configuration in the probabilistic_sampler is intended to work. The readme bullet point for this processor just above the Hashing heading states:
which is somewhat ambiguous - is this meant to be a string (i.e. an attribute name as stated) or an int between 0 and 100?
My initial expectation, based on the readme and the example config here, was that this setting works in tandem with from_attribute to provide some sort of override mechanism to the blanket sampling rate, i.e.:
Which would suggest that if the data included the attribute "foo", it'd be sampled at 100%.
However, testdata/config.yaml#L43 implies that another attribute name is used to drive the sampling_priority. I have no idea how this would work if the attribute value is itself a string.
I'd be happy to contribute wording updates based on your advice about which way this is intended to function.