Open colinsurprenant opened 4 years ago
Is there any update on when this may be corrected? It has been reported in the 7.9 version as well.
@jsvd WDYT?
In that same logic I suggest we always return false for <, <=, >, >= operators against an inexistent field. This behaviour could be controlled with a flag but when disabled I suggest we raise a ConditionalException we could trap higher up and tag the event and/or log the conditional failure and move on.
Conditional expressions such as missing_field != 0
or missing_field not in [tags]
are useful because you can rely on them make sense even in the absence of the fields. For example:
if [deployment_environment] == "prod" {
elasticsearch { }
}
In this case we're confident about the data being sent to production: we only write to ES if the field is there and has the expected value. If the value is different, or nonexistent, then the conditional is false.
While in the case of greater/less than operators, we can make bad judgements by defaulting to false:
# log warn(4) info(3) debug (2) and trace(1) to file
if [log_level] <= 4 {
file { path => /tmp/log }
} else { # send error(5) to production
pagerduty {}
}
In this case if the field is missing we hit the else
clause and flood pagerduty with potential garbage.
Avoid performing unguarded comparisons by checking to ensure a value exists before comparing it. Instead of
if [somefield] > 1 { ... }
, doif [somefield] and [somefield] > 1 { ... }
An added note is that if you're writing if [somefield] > 1 { } else { }
, then the safe workaround needs to be:
if [somefield] {
if [somefield] > 1 {
...
} else {
...
}
}
Until we can ship data that can't be evaluated correctly to the DLQ, I wonder if we can find other user experiences instead of evaluating the conditional to false. One alternative would be to skip that conditional altogether, adding the event instead to the batch that exits the conditional, with an extra tag or metadata added to it. Any other ideas?
@jsvd I'm replying here as I do agree @colinsurprenant has the stronger argument: crashing the application for a single event is the worst case scenario.
Alternatives are:
The motto for the community is: If a newbie has a bad time, it's a bug.
. This makes it easy to make a decision.
Having just hit this bug by crashing our staging logstash, as a newbie logging the missing field would be much appreciated 🙏 Even just logging that this error was the cause and crashing would be a substantial improvement.
We also hit this problem and found it rather disturbing that such a situation triggers a full crash of the Logstash instance, leading to the log line going missing and Logstash potentially being down for some time during restarts. In addition, the exception message reveals nothing about the cause.
if [log_level] <= 4 { file { path => /tmp/log } } else { # send error(5) to production pagerduty {} }
IMO, evaluating such a comparison to false while also skipping all else (and else-if) blocks and adding a special tag is the best approach to the situation. However, I find it difficult to argue that this is definitely more clear than evaluating the comparison to false. And there may be a bigger chunk of implementation work required, as skipping conditionals is not how if-else-if-else blocks usually work.
The
<
,<=
,>
,>=
inequality operators in conditionals hard-crash the pipeline with aNullPointerException
when the field is inexistant.The
==
,!=
,=~
,!~
,in
andnot in
do not crash with a NPE when the field is inexistant.This is a reboot of #10706
Reproduction
Proposal
IMO simply crashing the pipeline is the worst case scenario and is not acceptable. We already support that a
[inexistent field] == value
is alwaysfalse
and[inexistent field] != value
is always true and by that we have accepted to give meaning to an inexistent field.In that same logic I suggest we always return
false
for<
,<=
,>
,>=
operators against an inexistent field. This behaviour could be controlled with a flag but when disabled I suggest we raise aConditionalException
we could trap higher up and tag the event and/or log the conditional failure and move on.Workaround
Avoid performing unguarded comparisons by checking to ensure a value exists before comparing it.
Instead of
if [somefield] > 1 { ... }
, doif [somefield] and [somefield] > 1 { ... }