netobserv / flowlogs-pipeline

Transform flow logs into metrics
Apache License 2.0

Add `nop` operation to aggregate #135

Closed ronensc closed 2 years ago

ronensc commented 2 years ago

Currently, in the extract/aggregate stage, we group entries by some field and apply an aggregate function to each group. We support sum, min, max, avg and count. But there are cases where we want to group entries without applying an aggregate function (nop, no operation), so that the original values of the entries are retained.
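To make the idea concrete, here is a minimal sketch of grouping without reducing. It is not the project's code: the `Entry` type, the `groupRaw` function and the field names `srcIP` and `bytes` are all hypothetical and only illustrate what a nop operation would keep per group.

```go
package main

import "fmt"

// Entry is a hypothetical flow-log record: a flat map of field name to value.
type Entry map[string]interface{}

// groupRaw groups entries by the string value of the `by` field. Instead of
// reducing each group with sum/min/max/avg/count, it keeps every original
// value of the `valueKey` field, which is the behaviour proposed for nop.
func groupRaw(entries []Entry, by, valueKey string) map[string][]float64 {
	groups := make(map[string][]float64)
	for _, e := range entries {
		key, ok := e[by].(string)
		if !ok {
			continue // skip entries without a usable grouping field
		}
		v, ok := e[valueKey].(float64)
		if !ok {
			continue // skip entries without a numeric value
		}
		groups[key] = append(groups[key], v)
	}
	return groups
}

func main() {
	entries := []Entry{
		{"srcIP": "10.0.0.1", "bytes": 100.0},
		{"srcIP": "10.0.0.2", "bytes": 50.0},
		{"srcIP": "10.0.0.1", "bytes": 300.0},
	}
	// Each group holds its raw values rather than a single aggregated number.
	fmt.Println(groupRaw(entries, "srcIP", "bytes"))
}
```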

One example of such a case is histograms in Prometheus: we need the original values so that we can put them in the right buckets. Actually, counters in Prometheus need this too, because their API accepts only deltas and not the final count/sum.
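A rough illustration of why the raw values matter for histograms: bucket counts can be derived from the individual values but not from an already aggregated sum or average. The sketch below uses only the standard library; `bucketCounts` and the bucket bounds are made up for the example, and values above the largest bound (the `+Inf` bucket) are deliberately not modelled. With the real Prometheus Go client the raw values would presumably be fed one by one to `Histogram.Observe`, and deltas to `Counter.Add`.

```go
package main

import "fmt"

// bucketCounts places each raw value into cumulative buckets, in the style of
// a Prometheus histogram: counts[i] is the number of values <= bounds[i].
// bounds is assumed to be sorted in ascending order.
func bucketCounts(values, bounds []float64) []int {
	counts := make([]int, len(bounds))
	for _, v := range values {
		for i, b := range bounds {
			if v <= b {
				counts[i]++
			}
		}
	}
	return counts
}

func main() {
	raw := []float64{100, 300, 50}     // values retained by a nop aggregate
	bounds := []float64{64, 256, 1024} // hypothetical bucket upper bounds
	fmt.Println(bucketCounts(raw, bounds)) // prints [1 2 3]
}
```

Given only the sum (450) or the average (150) of the same group, these bucket counts could not be reconstructed.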

The current implementation saves the original value of the entries in the recentRawValues field. https://github.com/netobserv/flowlogs-pipeline/blob/cbf2c9317792eb3ae42d36a5457c2829ea3da897/pkg/pipeline/extract/aggregate/aggregate.go#L176-L177

Adding nop will allow us to merge the recentRawValues field into the value field.

ronensc commented 2 years ago

cc @eranra

ronensc commented 2 years ago

To conclude: if the aggregate operation is one of max, min, avg, sum or count, the Prometheus metric should be a Gauge.

If the aggregate operation is nop, the Prometheus metric should be a Counter or a Histogram. The valuekey field in flowlogs-pipeline.conf.yaml will determine which field is used.
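The rule above could be written down as a small lookup. This is only a sketch of the mapping as stated in this comment; `metricKinds` is a hypothetical helper, not an existing function in the encoder.

```go
package main

import "fmt"

// metricKinds returns the Prometheus metric kinds that fit a given aggregate
// operation, following the rule described in this thread.
func metricKinds(op string) ([]string, error) {
	switch op {
	case "sum", "min", "max", "avg", "count":
		// The aggregate already holds the final value, so it is set directly.
		return []string{"gauge"}, nil
	case "nop":
		// Raw values are kept, so they can be added or observed one by one.
		return []string{"counter", "histogram"}, nil
	default:
		return nil, fmt.Errorf("unknown aggregate operation %q", op)
	}
}

func main() {
	for _, op := range []string{"sum", "nop"} {
		kinds, err := metricKinds(op)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(op, "->", kinds)
	}
}
```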

There is still an edge case where we want to use the count functionality with Prometheus Counters (rather than Gauges). In this case, we need to add a new field, recent_raw_count, which is the count of the slice, and make the Prometheus encoder handle both cases (slices and counts). An alternative is to add a transformer that adds a field with a constant value of 1; we can then use these 1s in the slice.
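A sketch comparing the two options, under the assumption that entries are flat maps. `Entry`, `addConstOne`, `rawValues`, `counterDelta` and the field name `one` are hypothetical names for illustration; neither option is implemented this way in the repository as far as this thread shows.

```go
package main

import "fmt"

// Entry is a hypothetical flow-log record: a flat map of field name to value.
type Entry map[string]interface{}

// addConstOne is a hypothetical transformer that adds a field with the
// constant value 1 to every entry (the alternative option).
func addConstOne(entries []Entry, field string) {
	for _, e := range entries {
		e[field] = 1.0
	}
}

// rawValues collects the float64 values of a field, as a nop aggregate would.
func rawValues(entries []Entry, field string) []float64 {
	var out []float64
	for _, e := range entries {
		if v, ok := e[field].(float64); ok {
			out = append(out, v)
		}
	}
	return out
}

// counterDelta sums a slice of raw values; the result is the amount that
// would be added to a Counter for this batch.
func counterDelta(raw []float64) float64 {
	var sum float64
	for _, v := range raw {
		sum += v
	}
	return sum
}

func main() {
	entries := []Entry{{"bytes": 100.0}, {"bytes": 300.0}, {"bytes": 50.0}}

	// Option 1: expose the length of the raw slice as a separate count.
	rawCount := float64(len(rawValues(entries, "bytes")))

	// Option 2: add a constant 1 to each entry and sum the resulting slice.
	addConstOne(entries, "one")
	delta := counterDelta(rawValues(entries, "one"))

	fmt.Println(rawCount, delta) // prints 3 3
}
```

Both options yield the same delta. The second keeps the encoder handling only slices, at the cost of an extra field on every entry.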