logstash-plugins / logstash-codec-multiline

Apache License 2.0

multiline (or another) should support header/footer delimited messages #2

Closed untergeek closed 8 years ago

untergeek commented 9 years ago

Migrated from: https://logstash.jira.com/browse/LOGSTASH-190

15:28 < jasonamster> was thinking more of a multiline pattern... what if the separator was like this:
15:28 < jasonamster> -----------END----------
15:28 < jasonamster> ----------BEGIN---------

guyboertje commented 8 years ago

I was thinking that we need to introduce a simple state machine with two states: pass_thru and accumulating. When the BEGIN pattern is seen, it moves to the accumulating state, and when the END pattern is seen, it moves back to the pass_thru state. Any {any} -> pass_thru transition generates an event from the buffer. A regular line seen while in pass_thru keeps the state at pass_thru, which is also an {any} -> pass_thru transition and so also generates an event.

If a second BEGIN pattern is seen before an END, it would be ignored, as would a second END before a BEGIN.

BEGIN END
while not BEGIN, it transitions `pass_thru -> pass_thru` flushing the previous lines and buffers the line
when BEGIN, it transitions `pass_thru -> accumulating` flushing the previous lines then buffers the line
while not END, it transitions `accumulating -> accumulating` and buffers the line
when END, it transitions `accumulating -> pass_thru` buffers the line then flushes
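
The four BEGIN END transitions above can be sketched in a few lines of Python. This is only an illustration, not the POC mentioned in this thread: the class name `BeginEndAccumulator`, its `feed` and `flush` methods, and the example patterns are all hypothetical. It also assumes that "ignored" means a repeated BEGIN (or a stray END) is treated as an ordinary line rather than dropped.

```python
import re


class BeginEndAccumulator:
    """Hypothetical two-state machine: pass_thru and accumulating."""

    def __init__(self, begin_pattern, end_pattern):
        self.begin = re.compile(begin_pattern)
        self.end = re.compile(end_pattern)
        self.state = "pass_thru"
        self.buffer = []

    def flush(self):
        """Return the buffered lines as a one-element event list, or []."""
        if not self.buffer:
            return []
        event = "\n".join(self.buffer)
        self.buffer = []
        return [event]

    def feed(self, line):
        """Consume one line and return any events completed by it."""
        events = []
        if self.state == "pass_thru":
            # pass_thru: flush the previous lines, then buffer this line.
            events += self.flush()
            self.buffer.append(line)
            if self.begin.search(line):
                self.state = "accumulating"
        else:
            # accumulating: buffer this line; on END, flush afterwards.
            # Assumption: a second BEGIN here is buffered like any other line.
            self.buffer.append(line)
            if self.end.search(line):
                events += self.flush()
                self.state = "pass_thru"
        return events


acc = BeginEndAccumulator(r"^-+BEGIN-+$", r"^-+END-+$")
events = []
for line in ["a", "--BEGIN--", "x", "y", "--END--", "b"]:
    events += acc.feed(line)
events += acc.flush()  # emit whatever is still buffered at end of input
# events == ["a", "--BEGIN--\nx\ny\n--END--", "b"]
```

Note that in this sketch a regular line is only emitted when the next line arrives (or on the final flush), which matches the "flushing the previous lines and buffers the line" wording.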

For the existing behaviour, four possibilities exist:
negate: true,  what: previous
while the pattern matches, it transitions `pass_thru -> pass_thru` flushing the previous lines and buffers the line
when the pattern first fails to match, it transitions `pass_thru -> accumulating` and buffers the line
while the pattern does not match, it transitions `accumulating -> accumulating` and buffers the line
when the pattern matches, it transitions `accumulating -> pass_thru` flushing the previous lines and buffers the line

negate: true,  what: next
while the pattern matches, it transitions `pass_thru -> pass_thru` flushing the previous lines and buffers the line
when the pattern first fails to match, it transitions `pass_thru -> accumulating` flushing the previous lines and buffers the line
while the pattern does not match, it transitions `accumulating -> accumulating` and buffers the line
when the pattern matches, it transitions `accumulating -> pass_thru` and buffers the line

negate: false, what: previous
while the pattern does not match, it transitions `pass_thru -> pass_thru` flushing the previous lines and buffers the line
when the pattern first matches, it transitions `pass_thru -> accumulating` and buffers the line
while the pattern matches, it transitions `accumulating -> accumulating` and buffers the line
when the pattern first fails to match, it transitions `accumulating -> pass_thru` flushing the previous lines and buffers the line

negate: false, what: next
while the pattern does not match, it transitions `pass_thru -> pass_thru` flushing the previous lines and buffers the line
when the pattern first matches, it transitions `pass_thru -> accumulating` flushing the previous lines and buffers the line
while the pattern matches, it transitions `accumulating -> accumulating` and buffers the line
when the pattern first fails to match, it transitions `accumulating -> pass_thru` and buffers the line
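
The four negate/what combinations listed above differ only in when the flush happens, so they can be folded into one loop. The following Python is a rough sketch of that reading, not the codec's actual implementation. The function name `group_lines` and the helper variable `joining` are made up for illustration; `joining` is assumed to mean "the pattern matched, inverted when negate is true".

```python
import re


def group_lines(lines, pattern, negate, what):
    """Group lines into events following the transitions listed above.

    `what` is "previous" or "next"; a final flush is assumed at end of input.
    """
    regex = re.compile(pattern)
    state = "pass_thru"
    buffer, events = [], []

    def flush():
        if buffer:
            events.append("\n".join(buffer))
            buffer.clear()

    for line in lines:
        # A "joining" line belongs with its neighbour (previous or next).
        joining = (regex.search(line) is not None) != negate
        if state == "pass_thru":
            # `previous` skips the flush when entering accumulating,
            # because the line joins what is already buffered.
            if not joining or what == "next":
                flush()
            buffer.append(line)
            if joining:
                state = "accumulating"
        else:
            # `previous` flushes when leaving accumulating; `next` does not,
            # because the closing line still belongs to the buffered event.
            if not joining and what == "previous":
                flush()
            buffer.append(line)
            if not joining:
                state = "pass_thru"
    flush()
    return events


# negate: false, what: previous (indented lines join the previous line)
stack = group_lines(["a", " a1", " a2", "b", "c"], r"^\s", False, "previous")
# stack == ["a\n a1\n a2", "b", "c"]

# negate: false, what: next (a trailing backslash joins the next line)
cont = group_lines(["x \\", "y \\", "z", "w"], r"\\$", False, "next")
# cont == ["x \\\ny \\\nz", "w"]
```

With `negate` set to true the same loop applies, only the sense of the pattern is inverted, for example treating every line that does not start with a timestamp as a continuation of the previous one.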

As can be seen from the above, there are only small differences between BEGIN END and the existing behaviour. BEGIN END is basically a `what: next` workflow with an extra flush on the last step.

Therefore, to look at the state machine generically:

If another BEGIN pattern is seen before an END it would be ignored.

I have written a POC. I will discuss with the team.

guyboertje commented 8 years ago

This will be supported in 5.0.0 by the use of Event Mills. At that time the Multiline codec will be redundant.

guyboertje commented 8 years ago

Closing, as this will not be implemented in this plugin.