elastic / dissect-specification

specification for the dissect parser

Optional key ? #8

Open jakelandis opened 6 years ago

jakelandis commented 6 years ago

The current behavior is all-or-nothing with respect to matching keys: either all the keys match or processing fails.

All delimiters must be present in string, and all keys must have a corresponding value.

Some log files have fixed delimiters, but the text between two delimiters may or may not contain a value. For example:

1998-08-29 [] INFO

or

1998-08-29 [foo] INFO

With the current specification, two patterns are required to match both:

%{date} [] %{level}

and

%{date} [%{extra}] %{level}

Should the specification include an optional modifier so that the match can succeed with a single pattern?

For example, we could use a ? on the right side of the key name to mark the key as optional.

Possibly

%{date} [%{extra?}] %{level}

The spec would change to something like:

All delimiters must be present in string, and all non-optional keys must have a corresponding value.
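
To make the proposed rule concrete, here is a minimal Python sketch of a dissect-like matcher. It is not the reference implementation: the `dissect` function and its treatment of a trailing `?` are assumptions for illustration, and it follows the spec wording quoted above, where a non-optional key with an empty value makes the match fail.

```python
import re

def dissect(pattern, text):
    """Toy dissect matcher. A trailing ? on a key marks it optional (proposed syntax)."""
    # Splitting on %{...} yields alternating literals and key names.
    parts = re.split(r"%\{([^}]*)\}", pattern)
    literals, keys = parts[0::2], parts[1::2]
    if not text.startswith(literals[0]):
        return None
    pos = len(literals[0])
    result = {}
    for key, delim in zip(keys, literals[1:]):
        if delim:
            # Every delimiter must be present in the string.
            end = text.find(delim, pos)
            if end == -1:
                return None
        else:
            # No delimiter after the key: take the rest of the string.
            end = len(text)
        value = text[pos:end]
        optional = key.endswith("?")
        name = key[:-1] if optional else key
        if value:
            result[name] = value
        elif not optional:
            # A non-optional key must have a corresponding value.
            return None
        pos = end + len(delim)
    return result

print(dissect("%{date} [%{extra?}] %{level}", "1998-08-29 [] INFO"))
print(dissect("%{date} [%{extra?}] %{level}", "1998-08-29 [foo] INFO"))
```

In this sketch an optional key with an empty value is simply left out of the result, so one pattern covers both example lines.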

thoughts?

@felixbarny @ph @guyboertje

jakelandis commented 6 years ago

...or we make all keys optional and have the user mark the required keys?

Maybe use ! as a prefix:

%{!date} [%{extra}] %{!level}

That would be a pretty big change to the semantics, since it would allow partial matches by default. It won't fail as often, but it could fail more silently. Basically, you would have to opt in to strict behavior rather than opt out of it.
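
A rough Python sketch of the opt-in variant, to show what "fail more silently" could look like. The `dissect_opt_in` function and the `!` prefix handling are hypothetical, not part of any existing implementation.

```python
import re

def dissect_opt_in(pattern, text):
    """Toy matcher where keys are optional unless prefixed with ! (hypothetical syntax)."""
    parts = re.split(r"%\{([^}]*)\}", pattern)
    literals, keys = parts[0::2], parts[1::2]
    if not text.startswith(literals[0]):
        return None
    pos = len(literals[0])
    result = {}
    for key, delim in zip(keys, literals[1:]):
        if delim:
            end = text.find(delim, pos)
            if end == -1:
                return None  # delimiters are still mandatory
        else:
            end = len(text)
        value = text[pos:end]
        required = key.startswith("!")
        name = key[1:] if required else key
        if value:
            result[name] = value
        elif required:
            return None  # only keys marked with ! can fail the match
        pos = end + len(delim)
    return result

# Required keys behave strictly:
print(dissect_opt_in("%{!date} [%{extra}] %{!level}", "1998-08-29 [foo] "))
# With no ! markers, a line with no values at all still "matches":
print(dissect_opt_in("%{date} [%{extra}] %{level}", " [] "))
```

The second call returns an empty result instead of a failure, which is the silent partial match described above.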

guyboertje commented 5 years ago

In the LS implementation, the pattern should not fail; extra would be an empty string. For me, I'd want to look at where the spec defines that all keys must have a value, as arguably an empty string is a value. If a field is made optional, meaning it is not added when its value is empty, then we need to discuss the follow-up action required by the user. What mechanisms would they use to ensure that a default value is added, and do the three processing pipelines have the same conditional capabilities to test for missing vs. empty string in order to add a default? When querying ES, is it better to have no field or a field with an empty string?
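
As a small illustration of the missing vs. empty distinction, here is a Python sketch of the follow-up step a user might need. The `apply_default` helper is hypothetical and models an event as a plain dict; it does not represent the conditional syntax of any particular pipeline.

```python
def apply_default(event, field, default):
    """Return (new_event, state), filling the default when the field is missing or empty."""
    out = dict(event)  # leave the caller's event untouched
    if field not in out:
        state = "missing"   # an optional key was dropped from the result
    elif out[field] == "":
        state = "empty"     # the key was kept with an empty string
    else:
        state = "present"
    if state != "present":
        out[field] = default
    return out, state

print(apply_default({"date": "1998-08-29", "level": "INFO"}, "extra", "none"))
print(apply_default({"date": "1998-08-29", "extra": "", "level": "INFO"}, "extra", "none"))
```

Both cases end up with the same default here, but the test needed to detect them differs, which is the capability question raised above.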