Open NeilJed opened 11 months ago
In the current implementation, rate is just a uint, which makes it impossible to set a simple sample rate of 2/3 (or any other fraction with a numerator other than 1), which might come in handy in some cases. I suggest making the rate an actual rate (a float between 0.0 and 1.0 that describes a frequency), with the option to set it as a fraction (1/1000 and 0.001 should both work the same).
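As a rough illustration of the fractional-rate idea, here is a minimal Python sketch (not Vector code). The helper names `parse_rate` and `should_keep` are hypothetical, and the counter-based selection is just one simple way to honor a rate such as 2/3; it assumes the rate arrives as a string in either fraction or decimal form.

```python
from fractions import Fraction


def parse_rate(text: str) -> Fraction:
    """Parse a sample rate written as a fraction ("2/3") or a decimal ("0.001")."""
    rate = Fraction(text.strip())  # Fraction accepts both "n/d" and decimal strings
    if not 0 <= rate <= 1:
        raise ValueError(f"sample rate must be between 0 and 1, got {text!r}")
    return rate


def should_keep(count: int, rate: Fraction) -> bool:
    """Decide whether the count-th event (1-based) is kept.

    The event is kept whenever the running quota floor(count * rate)
    increases, so exactly floor(n * rate) of the first n events pass.
    """
    return (count * rate) // 1 > ((count - 1) * rate) // 1
```

With this sketch, `parse_rate("1/1000")` and `parse_rate("0.001")` compare equal, and a rate of 2/3 keeps two out of every three events.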
Use Cases
Currently the sample transform only accepts an integer as the input for the sample rate. It would be really good if we could set this via a field or a VRL snippet.
The use case I have is that my event messages contain a field denoting the service that the log belongs to. I'm using this as the key field for hashing. However, as event volume varies a lot between services, a fixed sample rate doesn't fit my use case.
What I would like to do is set the sample rate based on a condition, in this case the value of the service field or even an event metadata value (if the sample rate is pre-calculated and added in a previous transform).
For example:
Attempted Solutions
Currently I implement this with a VRL transform that decides the sample rate based on the service name and adds it as an event field. The events then pass to a route transform with one route per value of that field. Each route sends to a separate sample transform with a fixed sample rate, and the outputs are all collected into the sink.
Obviously this is overly complex and requires me to create a route + sampler for every sample ratio I want.
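For comparison, the behavior I am after could be handled by a single sampler that reads the rate from each event. The following is a minimal Python sketch of that idea, not Vector's implementation: the names `DynamicSampler`, `resolve_rate`, `sample_rate` and `service` are hypothetical, and it assumes simple per-key counters and a 1-in-N rate, falling back to a default when the field is missing or not a positive integer.

```python
from typing import Dict, Mapping


def resolve_rate(event: Mapping[str, object], rate_field: str, default_rate: int) -> int:
    """Return the 1-in-N rate for this event, read from rate_field when valid."""
    value = event.get(rate_field)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool) and value >= 1:
        return value
    return default_rate


class DynamicSampler:
    """Keeps 1 in N events per key, where N may differ from event to event."""

    def __init__(self, rate_field: str, key_field: str, default_rate: int) -> None:
        self.rate_field = rate_field
        self.key_field = key_field
        self.default_rate = default_rate
        self._counts: Dict[str, int] = {}  # events seen so far, per key value

    def keep(self, event: Mapping[str, object]) -> bool:
        rate = resolve_rate(event, self.rate_field, self.default_rate)
        key = str(event.get(self.key_field, ""))
        count = self._counts.get(key, 0)
        self._counts[key] = count + 1
        # Keep the first event of every block of `rate` events for this key.
        return count % rate == 0
```

With one sampler of this shape, a service whose events carry a rate of 50 and another whose events carry a rate of 2 go through the same component, instead of one route + sampler pair per ratio.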
Proposal
It seems there is already existing functionality to support static/field/VRL input where the result must be a boolean. Could this approach be applied to add an input type where the result must be an integer? Maybe Option<u64>?
References
No response
Version
vector 0.34.1 (x86_64-apple-darwin 86f1c22 2023-11-16 14:59:10.486846964)