Open: guyboertje opened this issue 8 years ago
I believe this issue is responsible for the numerous failures I'm getting when trying to parse my multiline log stream using stdin with the multiline codec. When using the multiline filter this behavior is not a factor; it only occurs when using multiline as a codec with the stdin input.
[amartinez@centos4 ocppserver-logstash]$ zcat ~/SessionData/ocpplogs/2016-03-13-ocppserver.log.gz | tail -n +130000 | head -65000 | sudo /opt/logstash/bin/logstash -f bugRepro.conf | grep "\"ocpp_logtime\" =>"
"ocpp_logtime" => "[13 Mar 201\n6 04:16:10:353]"
"ocpp_logtime" => "[13 Mar 2016 04:19:08:\n608]"
"ocpp_logtime" => "[13 Mar 2016 04:\n19:52:908]"
"ocpp_logtime" => "[13 Mar 2016 04:20:06\n:111]"
"ocpp_logtime" => "[13 Mar 2016 04:20\n:15:217]"
"ocpp_logtime" => "[13 Mar 2016 04:22:58:1\n47]"
"ocpp_logtime" => "[13 Mar 2016 04:24:\n53:896]"
"ocpp_logtime" => "[13 Mar 2016 0\n4:25:21:805]"
"ocpp_logtime" => "[13 Mar 20\n16 04:27:05:976]"
"ocpp_logtime" => "[13 Mar 20\n16 04:27:43:712]"
"ocpp_logtime" => "[13 Mar 2016 04:29:44:464\n]"
"ocpp_logtime" => "[13 Mar 2016 04:35:08:17\n2]"
Config @ https://github.com/anthonyjmartinez/ocppserver-logstash/tree/bug-testing
The next major release of LS will solve this. Your only option now is to use a file as an intermediate store after decompressing and chopping up the original.
Thanks for the information. How chopped up does the original need to be in order to avoid further issues? Does this also mean that using filebeat to handle the live stream from the source machine directly is not possible at this time?
By chopped up I was referring to your tail -n +130000 | head -65000 bit. I meant that after you unzip the file, discard any lines you don't want, and append the result to a new file, LS will read that new file and handle multiline correctly, but you would need auto-flush.
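To make that concrete, here is a minimal sketch of the intermediate-file approach. It is not the actual bugRepro.conf (only linked above); the file path, the "^\[" pattern, and the 5-second flush value are assumptions, but the shape is a file input plus the multiline codec with its auto_flush_interval option, which is the auto-flush mentioned above.

```
# Sketch only: read the decompressed, trimmed copy of the log from disk
# instead of piping it to stdin. Paths and pattern are assumed, not taken
# from the real bugRepro.conf.
input {
  file {
    path => "/tmp/ocppserver-slice.log"   # e.g. write the zcat/tail/head output here first
    start_position => "beginning"
    sincedb_path => "/dev/null"           # reread from the top on every run
    codec => multiline {
      # Lines that do not start with a "[13 Mar 2016 ..." style timestamp
      # are joined onto the previous line.
      pattern => "^\["
      negate => true
      what => "previous"
      # Flush a pending multiline event if nothing new arrives for 5 seconds,
      # so the final block in the file is not held back indefinitely.
      auto_flush_interval => 5
    }
  }
}

output {
  stdout { codec => rubydebug }
}
```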
Filebeat is preferable, as it has a different implementation of multiline that does not have this line-splitting limitation built in. Filebeat tails files, though, so try to have the multiline pattern match at the end of the block. Filebeat does not decompress files.
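For the live-stream case, the multiline joining happens in Filebeat itself (via its multiline.pattern, multiline.negate and multiline.match settings), so the Logstash side only needs a beats input. A minimal sketch, with the port being an assumption:

```
# Sketch only: Filebeat assembles the multiline events on the source machine
# and ships them, so Logstash just listens for them here.
input {
  beats {
    port => 5044   # assumed; must match the port Filebeat is configured to send to
  }
}

output {
  stdout { codec => rubydebug }
}
```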
FYI (to future readers of this issue): we are planning a change to the way inputs use functions like line and multiline. This is scheduled for the next major release, whenever that is.
So, was this ever resolved? I am still getting this issue. I'm using Logstash version 6.7.0.
@guyboertje Sorry to keep pestering you, but was there ever a resolution to this problem?
See https://github.com/logstash-plugins/logstash-codec-multiline/issues/14
Preliminary:
Fault: when a line straddles a 32K read block, the piece of the line on this side of the block boundary, i.e.
\npiece_of_line_in_this_side_of_32K_block
is treated as a full line and buffered as such; the other piece of the line in the next 32K block is also treated as a line.
Proposal: