Open joeppeeters opened 1 year ago
Just my 2cts on possible ways to address this:
adding acknowledgements.nack_dropped = true to the config spec. Easy to make backwards compatible, but having it as a global setting might not be desired.
extending the VRL definition with a fail expression. This would allow the user to explicitly NACK certain events, and drop others.
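For illustration only, the first proposal might look roughly like this in vector.toml. The nack_dropped key does not exist in Vector; it is the hypothetical setting suggested above, and its placement under a global acknowledgements table is an assumption based on the "global setting" remark.

```toml
# HYPOTHETICAL: nack_dropped is the option proposed in this comment.
# It is not an existing Vector setting; name and placement are assumptions.
[acknowledgements]
nack_dropped = true   # proposed: events dropped by a transform would NACK at the source
```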
Just dropping a quote from the discord thread that was linked
it's something we are working at to get the UX right. A lot of the time when events are dropped we don't want it to be a NACK, for example with filter, dedupe or throttle transforms the drops are very intentional. There is a fairly long running RFC here https://github.com/vectordotdev/vector/blob/bruceg/discarded-events-rfc/rfcs/2022-08-25-12217-handling-discarded-events.md that I think could be a good starting point for the considerations we are contemplating.
Noting that we had some discussion on this but didn't reach any conclusions yet.
The linked RFC should cover this subject of ack/nack behavior (and should be extended if it doesn't already).
Wanted to also call out this option: https://vector.dev/docs/reference/configuration/transforms/remap/#reroute_dropped, which creates a named output for the dropped events that can then be used downstream.
In general the current model we have with these three options (drop_on_error, drop_on_abort and reroute_dropped) is convoluted, and the behavior changes a lot when the three are used in different combinations.
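As a rough sketch of how the three options combine, a remap transform with reroute_dropped enabled could be wired up as below. The component ids (in, validate, valid, rejected) and the field name required_field are made up for illustration; the dropped events should be available on the validate.dropped output.

```toml
[transforms.validate]
type = "remap"
inputs = ["in"]              # hypothetical upstream source id
drop_on_error = true         # drop events whose VRL program hits a runtime error
drop_on_abort = true         # drop events for which the program calls abort
reroute_dropped = true       # send dropped events to the "dropped" output instead of discarding them
source = '''
# required_field is a placeholder name for this example
if !exists(.required_field) {
  abort
}
'''

[sinks.valid]
type = "console"
inputs = ["validate"]          # events that passed validation
encoding.codec = "json"

[sinks.rejected]
type = "console"
inputs = ["validate.dropped"]  # events rerouted by reroute_dropped
encoding.codec = "json"
```

Note that events rerouted this way still end up in a sink, so this presumably leads to them being acknowledged on delivery rather than NACKed, which is not the same as the behavior asked for in this issue.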
While we might have a resolution to the NACK concern raised in this issue through the work of that RFC, we should consider whether an intermediate change in the short term would make sense.
Problem
context: We're a SaaS company providing a software suite to our customers. In that suite, 100+ multi-lingual apps generate logs for auditing purposes. We're running a project to improve the quality of those logs (i.e. validate the presence of required fields). The intended solution is to deploy a vector instance alongside every app to collect and process those logs and ship them to storage. Using an E2E vector pipeline, we can use VRL to do some basic validation on the logs to see if they meet the quality standards. Being able to NACK them would yield instant feedback to the publisher, such that issues can be spotted early on in development.
bug: Events which are aborted by VRL do not seem to signal a NACK at the source.
expected behaviour: All events which enter an E2E pipeline should end up in the sink before getting ack'ed
Configuration
Version
0.30.0
Debug Output
Example Data
docker run -d -v $PWD/vector.toml:/etc/vector/vector.toml:ro -p 80:80 timberio/vector:0.30.0-debian
curl -X POST -i -d '{"key1":"value1","key2":"value2"}' http://localhost:80
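The vector.toml mounted by the docker command is not shown in this issue. A minimal sketch of a config that would exercise the same path is given below; it is not the reporter's actual configuration, and the component ids and the key3 field are assumptions chosen so that the curl payload above gets aborted.

```toml
# NOT the reporter's config: a hypothetical minimal setup for the commands above.
[sources.in]
type = "http_server"
address = "0.0.0.0:80"
decoding.codec = "json"

[transforms.validate]
type = "remap"
inputs = ["in"]
drop_on_abort = true
source = '''
# key3 is absent from the example payload, so the event is aborted
if !exists(.key3) {
  abort
}
'''

[sinks.out]
type = "console"
inputs = ["validate"]
encoding.codec = "json"
acknowledgements.enabled = true   # end-to-end acknowledgements for this pipeline
```

With a setup like this, the response status printed by curl -i would show whether the aborted event is reported back to the publisher as a failure or as a success.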
Additional Context
No response
References
Discussion on Discord