elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash
Other
14.09k stars 3.48k forks source link

Comparison conditionals crash pipelines opaquely when field on event is not correct value type #16007

Open yaauie opened 3 months ago

yaauie commented 3 months ago

Logstash information:

Please include the following information:

  1. Logstash version: main, 8.x, 7.x`
  2. Logstash installation source: any
  3. How is Logstash being run: any

Plugins installed: default

JVM (e.g. java -version): any

OS version (uname -a if on a Unix-like system): any

Description of the problem including expected versus actual behavior:

When a field reference is provided as a value-source in a conditional clause with a comparator operator (e.g., >, >=, <=, <), and the conditional encounters an event whose value at that field reference is not comparable to the constant-value on the other side of that comparator, the conditional crashes opaquely, propagating up to crash the pipeline.

Steps to reproduce:

input { stdin { codec => json_lines } }

filter {
  if [path][to][value] > 100 {
    mutate { add_tag => "hit" }
  } else {
    mutate { add_tag => "miss" }
  }
}

output { stdout { codec => json_lines } }

any of the following inputs will crash the pipeline, some with more-opaque messages than others:

While some comparisons don't make sense to make,

jsvd commented 3 months ago

Related issues:

andsel commented 2 weeks ago

@yaauie do you think that once an if breaks, the event should be cancelled or tagged with a label like _error_evaluating_if. In the first case the event is simply dropped but in the second case there is the possibility for the pipeline to recover and route the event somewhere that could be re-processed in a second phase.

jsvd commented 1 week ago

We have avoided this for a long time given it's hard to take an action that is always correct. Skipping the conditional can easily lead to broken behaviour; for example:

filter {
  if [path][to][value] > 100 {
    mutate { add_tag => "hit" }
  }
}
output { elasticsearch {} }

If the user forgets to add a conditional on the output to skip the tags, then an event {"test": 1} will be sent to elasticsearch without the right transformation being applied, potentially leading to mapping conflicts or even other errors during the filter stage.

Once the event fails a conditional it should be moved out of the rest of the pipeline. The options are: 1) drop the event (with or without logging) 2) sending to dead letter queue

Dropping the event without logging is dangerous, so my suggestion would be to send to DLQ if enabled, otherwise drop the event with logging.

andsel commented 1 week ago

I agree with you, logging and DLQ-ing if DLQ is enabled, is the best path forward.