Closed jiscfoo closed 3 years ago
Possibly related to #3877
gdb
provides a more detailed stack trace:
#0 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:532
#1 0x00005616c2b0a30d in msgpack_sbuffer_write (data=0x7fb16202e388, buf=0x7fb16206ce25 "",
len=18446744073709551615) at /tmp/fluent-bit/lib/msgpack-c/include/msgpack/sbuffer.h:88
#2 0x00005616c2b0a0df in msgpack_pack_str_body (x=0x7fb16202e3a0, b=0x7fb16206ce25, l=18446744073709551615)
at /tmp/fluent-bit/lib/msgpack-c/include/msgpack/pack_template.h:784
#3 0x00005616c2b0ad0d in flb_tail_mult_append_raw (buf=0x7fb16206ce25 "", size=-1, file=0x7fb16202e300,
config=0x7fb1620411c0) at /tmp/fluent-bit/plugins/in_tail/tail_multiline.c:238
#4 0x00005616c2b0b006 in flb_tail_mult_process_content (now=1628761684, buf=0x7fb16206ce25 "",
len=18446744073709551615, file=0x7fb16202e300, ctx=0x7fb1620411c0, processed_bytes=25893)
at /tmp/fluent-bit/plugins/in_tail/tail_multiline.c:348
#5 0x00005616c2b0d55f in process_content (file=0x7fb16202e300, bytes=0x7fb162df4778)
at /tmp/fluent-bit/plugins/in_tail/tail_file.c:427
#6 0x00005616c2b0f765 in flb_tail_file_chunk (file=0x7fb16202e300)
at /tmp/fluent-bit/plugins/in_tail/tail_file.c:1182
#7 0x00005616c2b076c4 in in_tail_collect_static (ins=0x7fb163248f00, config=0x7fb163232840,
in_context=0x7fb1620411c0) at /tmp/fluent-bit/plugins/in_tail/tail.c:140
#8 0x00005616c2ab3c99 in flb_input_collector_fd (fd=19, config=0x7fb163232840)
at /tmp/fluent-bit/src/flb_input.c:1075
#9 0x00005616c2ac496a in flb_engine_handle_event (config=0x7fb163232840, mask=1, fd=19)
at /tmp/fluent-bit/src/flb_engine.c:377
#10 flb_engine_start (config=0x7fb163232840) at /tmp/fluent-bit/src/flb_engine.c:639
#11 0x00005616c2aa950d in flb_lib_worker (data=0x7fb16320b340) at /tmp/fluent-bit/src/flb_lib.c:628
#12 0x00007fb164d0ffa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#13 0x00007fb1643f04cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Of note in there is the len=18446744073709551615
which ends up being passed into sbuffer.h:88
:
memcpy(sbuf->data + sbuf->size, buf, len);
Looking at frame 5 in the stack (process_content (file=0x7fa52b62e300, bytes=0x7fa52c3f4778) at /tmp/fluent-bit/plugins/in_tail/tail_file.c:427
)
line_len
is passed into flb_tail_mult_process_content()
which is calculated at https://github.com/fluent/fluent-bit/blob/v1.8.3/plugins/in_tail/tail_file.c#L370
line_len = len - crlf;
Within gdb, at point of crash, len
is 0 (zero) and crlf
is 1, so 0 - 1
is a very big number which isn't being handled correctly.
If I enable skip_empty_lines
then it does not seem to crash.
Possibly more subtle than this… the multiline input it chokes on contains ^M
characters. I've isolated down to (cat -v output, so ^Ms are inline):
2021-08-10 09:12:14 DEBUG XMLTooling.SOAPTransport.CURL [252] [default]: nothing free in pool, returning new connection handle
2021-08-10 09:12:14 DEBUG XMLTooling.SOAPTransport.CURL [252] [default]: sending SOAP message to http://mdq.ukfederation.org.uk/entities/https%3A%2F%2Fredacted%2Fidp%2Fshibboleth
2021-08-10 09:12:14 DEBUG XMLTooling.libcurl [252] [default]: Trying 52.169.71.108:80...
2021-08-10 09:12:14 DEBUG XMLTooling.libcurl [252] [default]: Connected to mdq.ukfederation.org.uk (52.169.71.108) port 80 (#0)
2021-08-10 09:12:14 DEBUG XMLTooling.libcurl [252] [default]: GET /entities/https%3A%2F%2Fredacted%2Fidp%2Fshibboleth HTTP/1.1^M
Host: mdq.ukfederation.org.uk^M
User-Agent: shibboleth/3.2.3 OpenSAML/3.2.0 XMLTooling/3.2.0 XML-Security-C/2.0.2 Xerces-C/3.2.3 log4shib/2.0.0 libcurl/7.77.0 OpenSSL/1.0.2k-fips^M
Accept-Encoding: deflate, gzip^M
Content-Type: text/xml^M
Accept: application/samlmetadata+xml^M
Xerces-C: 3.2.3^M
XML-Security-C: 2.0.2^M
XMLTooling-C: 3.2.0^M
OpenSAML-C: 3.2.0^M
shibboleth: 3.2.3^M
^M
2021-08-10 09:12:14 DEBUG XMLTooling.libcurl [252] [default]: Mark bundle as not supporting multiuse
Raw file and base64 encoded file attached.
@jiscfoo Thank you for information!
The log ^M+blank line causes SIGSEGV. I think multi-line is not related this issue.
bbb^M
I sent a patch #3951.
Bug Report
Describe the bug
fluent-bit (1.8.3) crashes in
msgpack_sbuffer_write()
during a multi-line tailTo Reproduce
Parser:
Config:
Invoke as:
Expected behavior
Correct ingestion of log events and output to stdout
Screenshots
Your Environment
gdb
)in_tail
,out_stdout
and one bespoke parser (above)Additional context
Segfault stops log processing. Only way to proceed is to bin the
pos
database we'd been using to skip over the input it's choking on and restart the service.Input file that causes the crash can be supplied to developers but not made public.