elastic / dissect-specification

specification for the dissect parser
Apache License 2.0

Should the right padding modifier also trim whitespace? #7

Open jakelandis opened 6 years ago

jakelandis commented 6 years ago

Perhaps this is an implementation detail of the ingest node processor, but there is a deceptive use case with respect to the right padding modifier.

This is documented and works as expected:

pattern: %{a->} %{b}
string: foo         bar
result: a='foo', b='bar'

However, this is unexpected:

pattern: %{a->}: %{b}
string: foo        : bar
result: a='foo        ', b='bar'

Notice how a carries forward the trailing spaces. It actually has nothing to do with the right padding modifier ->, and the same result is achieved without it:

pattern: %{a}: %{b}
string: foo        : bar
result: a='foo        ', b='bar'

For the %{a->}: example, it is surprising that the -> does not skip the whitespace.

I propose that the dissect spec should require implementations to right-trim any whitespace (spaces, newlines, and tabs) from the values of keys that define the -> modifier.
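To make the proposal concrete, here is a minimal sketch of the current and proposed behaviour. It is a toy parser written for illustration, not the ingest node or Logstash implementation: the function name dissect and the flag rtrim_padded are hypothetical, and only plain keys and the -> modifier are handled.

```python
import re

# Matches a dissect key such as %{a} or %{a->}.
KEY = re.compile(r"%\{([^}]*)\}")


def dissect(pattern, text, rtrim_padded=False):
    """Toy dissect: supports plain keys and the -> modifier only."""
    # re.split with one capture group yields
    # [prefix, key1, delim1, key2, delim2, ...].
    parts = KEY.split(pattern)
    if not text.startswith(parts[0]):
        raise ValueError("text does not start with the pattern prefix")
    pos = len(parts[0])
    result = {}
    for i in range(1, len(parts), 2):
        key, delim = parts[i], parts[i + 1]
        padded = key.endswith("->")
        name = key[:-2] if padded else key
        if delim == "":
            # Key with no following literal: take the rest of the text.
            value, pos = text[pos:], len(text)
        else:
            end = text.find(delim, pos)
            if end == -1:
                raise ValueError(f"delimiter {delim!r} not found")
            value, pos = text[pos:end], end + len(delim)
            if padded:
                # Documented behaviour: skip repeats of the delimiter.
                while text.startswith(delim, pos):
                    pos += len(delim)
        if padded and rtrim_padded:
            # Proposed behaviour (hypothetical flag): right-trim
            # spaces, tabs, and newlines for keys that use ->.
            value = value.rstrip(" \t\n")
        result[name] = value
    return result


padded_input = "foo" + " " * 8 + ": bar"
print(dissect("%{a->}: %{b}", padded_input))
print(dissect("%{a->}: %{b}", padded_input, rtrim_padded=True))
```

With the flag off, a keeps its trailing spaces because the delimiter being searched for is ": " and the spaces before the colon are part of the value. With the flag on, only keys that carry -> are trimmed, so %{a}: %{b} would still return the padded value.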

thoughts?

cc: @felixbarny @ph @guyboertje

guyboertje commented 5 years ago

In my opinion, we should offer a trim_whitespace setting that trims all field values plucked from the source. You describe an extreme case of:

pattern: %{a}-%{b}
string: foo - bar
result: a='foo ', b=' bar'

ph commented 5 years ago

@guyboertje trim_whitespace would trim the left and right sides of values?

" ok " => "ok"

guyboertje commented 5 years ago

Both sides I think.
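As a rough sketch of what a both-sides trim_whitespace setting could look like, the snippet below applies it as a post-processing step over already extracted fields. The helper name trim_values and its signature are assumptions made for illustration; no existing dissect implementation is being quoted.

```python
def trim_values(fields, trim_whitespace=True):
    """Return a copy of fields, optionally stripping each value on both sides."""
    if not trim_whitespace:
        # Setting disabled: leave every value exactly as extracted.
        return dict(fields)
    # str.strip() with no argument removes leading and trailing whitespace.
    return {key: value.strip() for key, value in fields.items()}


# Values as produced by pattern %{a}-%{b} on the string "foo - bar".
print(trim_values({"a": "foo ", "b": " bar"}))  # {'a': 'foo', 'b': 'bar'}
```

Unlike a trim tied to the -> modifier, this applies to every key, so it also covers the %{a}-%{b} case where no modifier is present.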