vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Requesting a local quota transform #21755

Open uni-pooja-laad opened 1 week ago

uni-pooja-laad commented 1 week ago

Use Cases

I am looking for a transform that enforces a daily quota. My source is the Datadog Agent running in a k8s cluster, and my sink is the Datadog log UI. I need to cap the number of logs sent per day, for example at 200 million logs per day. Once that limit is reached, Vector should stop sending logs.

Here is the link to the conversation I had with the support team: https://discord.com/channels/742820443487993987/1298597363585253438/1298597363585253438

pront commented 1 week ago

Thank you for this request; this would be a very useful transform. I am not sure when we can get to this, but if someone is inspired to submit a PR, here is one potential config UX:

    quota_0:
      inputs: [ "source_0", "source_1" ]
      limit:
        window: 12h 
        type: bytes # alternatively could be number of events
        max: 1TB # could support strings like this one, or just raw number

And we would have two ports: one for events that are under the quota and a dropped port for events that exceed it.
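The behavior proposed above can be sketched outside Vector as a fixed-window counter. This is only an illustration in Python, not Vector code: the `Quota` class, its `admit` method, and the `parse_size` helper are hypothetical names. It also assumes a fixed (non-sliding) window that resets once it has elapsed, and decimal units (1TB = 10^12 bytes), neither of which the proposal specifies.

```python
import re

# Hypothetical helper: turn "1TB" or a raw number into a count.
# Decimal multipliers are an assumption; the proposal does not define them.
_UNITS = {"": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(value):
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"(\d+)\s*([KMGT]B)?", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2) or ""]


class Quota:
    """Fixed-window quota: admit events until `max_units` is used up."""

    def __init__(self, window_secs, max_units, kind="events"):
        if kind not in ("events", "bytes"):
            raise ValueError("kind must be 'events' or 'bytes'")
        self.window_secs = window_secs
        self.max_units = parse_size(max_units)
        self.kind = kind
        self.window_start = None
        self.used = 0

    def admit(self, event, now):
        """Return True for the default output, False for the dropped output."""
        # Start a fresh window on first use or once the current one has elapsed.
        if self.window_start is None or now - self.window_start >= self.window_secs:
            self.window_start = now
            self.used = 0
        cost = len(event) if self.kind == "bytes" else 1
        if self.used + cost > self.max_units:
            return False
        self.used += cost
        return True


# Usage: a 12h window that admits at most two events.
quota = Quota(window_secs=12 * 3600, max_units=2, kind="events")
results = [quota.admit(b"log line", now=t) for t in (0, 1, 2)]
```

In this sketch the third event goes to the dropped output, and the counter starts over once 12 hours have passed since the window began. A real transform would also need to decide whether the counter survives a restart, which the sketch ignores.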

uni-pooja-laad commented 1 week ago

Any suggestions on how to implement this? Which transform can I use?

jszwedko commented 1 week ago

@uni-pooja-laad

> Any suggestions on how to implement this? Which transform can I use?

The throttle transform is probably the closest to what a quota transform would look like. If you wanted to contribute this new transform, I think copying that one would be a good place to start.

uni-pooja-laad commented 1 week ago

Tried to test this:

    set_quota:
      type: throttle
      inputs: [input]
      limit:
        window: 12h
        type: bytes # alternatively could be number of events
        max: 1TB # could support strings like this one, or just raw number

But the error is:

    2024-11-15T09:16:45.102027Z ERROR vector::cli: Configuration error. error=unknown field limit, expected one of threshold, window_secs, key_field, exclude, internal_metrics

    in transforms.set_quota

Not sure if this can work:

    set_quota:
      type: throttle
      inputs: [input]
      threshold: 10000
      window_secs: 43200

jszwedko commented 1 week ago

Sorry, I meant that if you wanted to contribute a quota transform, the throttle transform could be used as a reference during implementation, not that you can use the throttle transform to apply a quota.

However, if you would like to use the throttle transform for some other purpose and are having trouble configuring it, please open a GitHub Discussion.
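On the distinction drawn above between throttling and a quota, one way to picture it is to compare a generic rate limiter with a hard per-window counter. The sketch below is an assumption-laden illustration in Python: the token bucket is a textbook model, not a description of how Vector's throttle transform is implemented, and both function names are hypothetical.

```python
def token_bucket_admitted(threshold, window_secs, arrivals):
    """Count events admitted by a simple token bucket (assumed model)."""
    rate = threshold / window_secs  # tokens replenished per second
    tokens = float(threshold)       # bucket starts full
    last = 0.0
    admitted = 0
    for t in arrivals:
        # Refill for the elapsed time, never beyond the bucket capacity.
        tokens = min(threshold, tokens + (t - last) * rate)
        last = t
        if tokens >= 1.0:
            tokens -= 1.0
            admitted += 1
    return admitted


def fixed_quota_admitted(max_events, window_secs, arrivals):
    """Count events admitted by a hard per-window quota."""
    admitted = 0
    used = 0
    window_start = None
    for t in arrivals:
        # Reset the counter only when a new window begins.
        if window_start is None or t - window_start >= window_secs:
            window_start, used = t, 0
        if used < max_events:
            used += 1
            admitted += 1
    return admitted


# One event per second over a single 100-second window, budget of 10.
arrivals = range(100)
bucket = token_bucket_admitted(10, 100, arrivals)
quota = fixed_quota_admitted(10, 100, arrivals)
```

Under these assumptions the quota admits exactly 10 events and then nothing until the window resets, while the token bucket keeps letting events through as tokens refill, so it admits more than 10 in the same window. That continued trickle is the behavior a daily cap is meant to rule out.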