fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

Sporadic SIGSEGV with SEGV_MAPERR code in mpack_parse_tag() at mpack.c:2630 #2794

Closed asdavidov closed 3 years ago

asdavidov commented 3 years ago

Bug Report

Describe the bug

Fluent Bit sporadically crashes with SIGSEGV (si_code SEGV_MAPERR) in mpack.c.

To Reproduce

Expected behavior

No SIGSEGV at run time.

Screenshots

None.

Your Environment

Filters
  alter_size              Alter incoming chunk size
  aws                     Add AWS Metadata
  record_modifier         modify record
  throttle                Throttle messages using sliding window algorithm
  kubernetes              Filter to append Kubernetes metadata
  modify                  modify records by applying rules
  nest                    nest events by specified field values
  parser                  Parse events
  expect                  Validate expected keys and values
  grep                    grep events by specified field values
  rewrite_tag             Rewrite records tags
  lua                     Lua Scripting Filter
  stdout                  Filter events to STDOUT

Outputs
  azure                   Send events to Azure HTTP Event Collector
  azure_blob              Azure Blob Storage
  bigquery                Send events to BigQuery via streaming insert
  counter                 Records counter
  datadog                 Send events to DataDog HTTP Event Collector
  es                      Elasticsearch
  exit                    Exit after a number of flushes (test purposes)
  file                    Generate log file
  forward                 Forward (Fluentd protocol)
  http                    HTTP Output
  influxdb                InfluxDB Time Series
  logdna                  LogDNA
  loki                    Loki
  kafka-rest              Kafka REST Proxy
  nats                    NATS Server
  nrlogs                  New Relic
  null                    Throws away events
  plot                    Generate data file for GNU Plot
  slack                   Send events to a Slack channel
  splunk                  Send events to Splunk HTTP Event Collector
  stackdriver             Send events to Google Stackdriver Logging
  stdout                  Prints events to STDOUT
  syslog                  Syslog
  tcp                     TCP Output
  td                      Treasure Data
  flowcounter             FlowCounter
  gelf                    GELF Output
  cloudwatch_logs         Send logs to Amazon CloudWatch
  kinesis_firehose        Send logs to Amazon Kinesis Firehose
  s3                      Send to S3

Internal
  Event Loop  = select
  Build Flags = FLB_HAVE_PARSER FLB_HAVE_RECORD_ACCESSOR FLB_HAVE_STREAM_PROCESSOR JSMN_PARENT_LINKS JSMN_STRICT FLB_HAVE_TLS FLB_HAVE_AWS FLB_HAVE_SIGNV4 FLB_HAVE_SQLDB FLB_HAVE_FORK FLB_HAVE_GMTOFF FLB_HAVE_UNIX_SOCKET FLB_HAVE_PROXY_GO FLB_HAVE_LIBBACKTRACE FLB_HAVE_REGEX FLB_HAVE_UTF8_ENCODER FLB_HAVE_LUAJIT FLB_HAVE_C_TLS FLB_HAVE_ACCEPT4


Additional context

I've got several core files and their content is almost identical.

The backtrace from the version built with debug symbols shows the following output:

(lldb) bt

Frame #11 variables were:

(flb_forward *) ctx = 0x0000000801c421c0
(flb_forward_config *) fc = 0x0000000801c8f000
(flb_forward_flush *) ff = 0x0000000801c731b0
(const char *) tag = 0x0000000801c34500 "edok-ext.backend"
(int) tag_len = 16
(const void *) data = 0x0000000800aeb028
(size_t) bytes = 144581
(void **) out_buf = 0x0000000801c83208
(size_t *) out_size = 0x0000000801c83200
(int) entries = 0
(char *) chunk = 0x0000000801c731b8 ""
(char [33]) chunk_buf = ""
(msgpack_packer) mp_pck = {
  data = 0x0000000801c83008
  callback = 0x00000000004b6940 (fluent-bit`msgpack_sbuffer_write at sbuffer.h:60)
}
(msgpack_sbuffer) mp_sbuf = (size = 0, data = 0x0000000000000000, alloc = 0)

Frame #10 variables:

(const void *) data = 0x0000000800aeb028
(size_t) bytes = 144581
(int) count = 1
(mpack_reader_t) reader = {
  context = 0x0000000000000000
  fill = 0x0000000000000000
  error_fn = 0x0000000000000000
  teardown = 0x0000000000000000
  skip = 0x0000000000000000
  buffer = 0x0000000000000000
  size = 0
  data = 0x0000000800aeb028 ""
  end = 0x0000000800b0e4ed ""
  error = mpack_ok
}
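
The frame #10 dump allows a quick consistency check: the reader's `end - data` should equal `bytes`. A minimal sketch of that arithmetic follows. `reader_span` and the `REPORT_*` constants are names made up for this illustration; the values are copied from the dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes between a reader's cursor (`data`)
 * and its one-past-the-end address (`end`), both given as raw addresses. */
static uint64_t reader_span(uint64_t data, uint64_t end)
{
    assert(end >= data);   /* a valid reader never has end before data */
    return end - data;
}

/* Values copied verbatim from the frame #10 dump in this report. */
static const uint64_t REPORT_DATA  = 0x0000000800aeb028ULL;  /* reader.data */
static const uint64_t REPORT_END   = 0x0000000800b0e4edULL;  /* reader.end  */
static const uint64_t REPORT_BYTES = 144581;                 /* bytes       */
```

`reader_span(REPORT_DATA, REPORT_END)` is 0x234c5, which is 144581 in decimal and matches `bytes`. The reader therefore appears to have been initialized consistently with the arguments seen in frame #11. That points away from a wrong length and is at least consistent with the buffer's memory no longer being mapped when it was read, which is what SEGV_MAPERR reports. This is an inference from the dump, not a confirmed root cause.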

Frame #9 variables:

(mpack_reader_t *) reader = 0x0000000801c82f38
(mpack_tag_t) var = {
  type = mpack_type_missing
  exttype = '\0'
  v = (u = 34389634872, i = 34389634872, d = 1.6990737163279334E-313, f = 7.35361525E-38, b = true, l = 29896504, n = 29896504)
}

Frame #8 variables:

(mpack_reader_t *) reader = 0x0000000801c82f38
(mpack_tag_t) tag = {
  type = mpack_type_missing
  exttype = '\0'
  v = (u = 0, i = 0, d = 0, f = 0, b = false, l = 0, n = 0)
}
(size_t) count = 0

Frame #7 variables:

(mpack_reader_t *) reader = 0x0000000801c82f38
(mpack_tag_t *) tag = 0x0000000801c82eb8
(uint8_t) type = '\0'

Disassembled line of code at frame #7:

fluent-bit`mpack_parse_tag:
    0x530901 <+113>: callq  0x5291b0                  ; mpack_load_u8 at mpack.h:2028
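
For context, the faulting call loads the first byte of the next MessagePack object so that the tag can be classified. The sketch below shows that step under the MessagePack format rules. It is not mpack's code: `tag_kind`, `classify_first_byte` and `SAMPLE_MAP` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    TAG_NONE, TAG_NIL, TAG_BOOL, TAG_INT, TAG_UINT, TAG_FLOAT,
    TAG_DOUBLE, TAG_STR, TAG_BIN, TAG_ARRAY, TAG_MAP, TAG_EXT
} tag_kind;

/* Classify the object starting at `data` from its first byte alone.
 * Returns TAG_NONE if no byte is available or the byte is invalid. */
static tag_kind classify_first_byte(const uint8_t *data, const uint8_t *end)
{
    uint8_t b;

    if (data == NULL || end == NULL || data >= end)
        return TAG_NONE;

    b = *data;  /* single-byte load, analogous to mpack_load_u8() in frame #7 */

    if (b <= 0x7f) return TAG_UINT;    /* positive fixint */
    if (b <= 0x8f) return TAG_MAP;     /* fixmap */
    if (b <= 0x9f) return TAG_ARRAY;   /* fixarray */
    if (b <= 0xbf) return TAG_STR;     /* fixstr */
    if (b == 0xc0) return TAG_NIL;
    if (b == 0xc1) return TAG_NONE;    /* never used by the format */
    if (b <= 0xc3) return TAG_BOOL;    /* false, true */
    if (b <= 0xc6) return TAG_BIN;     /* bin 8/16/32 */
    if (b <= 0xc9) return TAG_EXT;     /* ext 8/16/32 */
    if (b == 0xca) return TAG_FLOAT;   /* float 32 */
    if (b == 0xcb) return TAG_DOUBLE;  /* float 64 */
    if (b <= 0xcf) return TAG_UINT;    /* uint 8/16/32/64 */
    if (b <= 0xd3) return TAG_INT;     /* int 8/16/32/64 */
    if (b <= 0xd8) return TAG_EXT;     /* fixext 1/2/4/8/16 */
    if (b <= 0xdb) return TAG_STR;     /* str 8/16/32 */
    if (b <= 0xdd) return TAG_ARRAY;   /* array 16/32 */
    if (b <= 0xdf) return TAG_MAP;     /* map 16/32 */
    return TAG_INT;                    /* 0xe0..0xff: negative fixint */
}

/* Sample input: the map {"k": true}. */
static const uint8_t SAMPLE_MAP[] = { 0x81, 0xa1, 'k', 0xc3 };
```

Note that the `data >= end` check only guards against running off the end of the buffer. It cannot detect a `data` pointer whose pages are no longer mapped, so a load like this can still fault even when `data` and `end` look valid, as they do in the frame #10 dump.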

I hope this helps to find the root cause of the problem. I can provide additional debugger output if needed.

asdavidov commented 3 years ago

FYI: disabling filesystem buffering (i.e. commenting out the "storage.type filesystem" option in the [INPUT] section) works around this issue.

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 3 years ago

This issue was closed because it has been stalled for 5 days with no activity.

JeffLuoo commented 3 years ago

Bump. I saw this with the Loki plugin:

[2021/05/05 01:35:14] [ info] [output:loki:loki.0] loki.logging.svc:3100, HTTP status=204
[2021/05/05 01:35:15] [ info] [output:loki:loki.0] loki.logging.svc:3100, HTTP status=204
[2021/05/05 01:35:16] [engine] caught signal (SIGSEGV)
#0  0x561e046a1727      in  mpack_load_u8() at lib/mpack-amalgamation-1.0/src/mpack/mpack.h:2029
#1  0x561e046a7574      in  mpack_parse_tag() at lib/mpack-amalgamation-1.0/src/mpack/mpack.c:2630
#2  0x561e046a80a5      in  mpack_read_tag() at lib/mpack-amalgamation-1.0/src/mpack/mpack.c:2979
#3  0x561e046a8166      in  mpack_discard() at lib/mpack-amalgamation-1.0/src/mpack/mpack.c:3028
#4  0x561e045d6f3d      in  flb_mp_count() at src/flb_mp.c:44
#5  0x561e04636632      in  loki_compose_payload() at plugins/out_loki/loki.c:954
#6  0x561e046369d9      in  cb_loki_flush() at plugins/out_loki/loki.c:1062
#7  0x561e045af4da      in  output_pre_cb_flush() at include/fluent-bit/flb_output.h:466
#8  0x561e04a4e526      in  co_init() at lib/monkey/deps/flb_libco/amd64.c:117
#9  0xffffffffffffffff  in  ???() at ???:0

Version v1.7.3

chetanv-oi commented 2 years ago

Another one:

{"log":"[2022/02/17 00:42:50] [engine] caught signal (SIGSEGV)\n","stream":"stderr","time":"2022-02-17T00:42:50.500981706Z"}
{"log":"#0  0x562629754281      in  mpack_load_u8() at lib/mpack-amalgamation-1.0/src/mpack/mpack.h:2029\n","stream":"stderr","time":"2022-02-17T00:42:50.526985167Z"}
{"log":"#1  0x56262975a0ce      in  mpack_parse_tag() at lib/mpack-amalgamation-1.0/src/mpack/mpack.c:2630\n","stream":"stderr","time":"2022-02-17T00:42:50.527018029Z"}
{"log":"#2  0x56262975abff      in  mpack_read_tag() at lib/mpack-amalgamation-1.0/src/mpack/mpack.c:2979\n","stream":"stderr","time":"2022-02-17T00:42:50.527027388Z"}
{"log":"#3  0x56262975acc0      in  mpack_discard() at lib/mpack-amalgamation-1.0/src/mpack/mpack.c:3028\n","stream":"stderr","time":"2022-02-17T00:42:50.527035311Z"}
{"log":"#4  0x562629693615      in  flb_mp_count() at src/flb_mp.c:40\n","stream":"stderr","time":"2022-02-17T00:42:50.527044261Z"}
{"log":"#5  0x5626296e7a62      in  flush_forward_mode() at plugins/out_forward/forward.c:999\n","stream":"stderr","time":"2022-02-17T00:42:50.527249043Z"}
{"log":"#6  0x5626296e840f      in  cb_forward_flush() at plugins/out_forward/forward.c:1216\n","stream":"stderr","time":"2022-02-17T00:42:50.527258158Z"}
{"log":"#7  0x56262967e64c      in  output_pre_cb_flush() at include/fluent-bit/flb_output.h:472\n","stream":"stderr","time":"2022-02-17T00:42:50.527373361Z"}
{"log":"#8  0x562629ad2ec6      in  co_init() at lib/monkey/deps/flb_libco/amd64.c:117\n","stream":"stderr","time":"2022-02-17T00:42:50.527392048Z"}

Version 1.6.6
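
All three traces in this thread share the same chain: flb_mp_count() calls mpack_discard(), which reads a tag and faults on the one-byte load in mpack_load_u8(). The record-counting loop can be sketched as below. This is a self-contained illustration written against the MessagePack format, not the Fluent Bit or mpack source; `skip_bytes`, `read_be`, `skip_object`, `count_records` and `SAMPLE` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Advance *p by n bytes if that many remain before end; return -1 otherwise. */
static int skip_bytes(const uint8_t **p, const uint8_t *end, uint64_t n)
{
    assert(*p <= end);
    if ((uint64_t)(end - *p) < n) return -1;
    *p += n;
    return 0;
}

/* Read an n-byte big-endian unsigned integer (n is 1, 2 or 4). */
static int read_be(const uint8_t **p, const uint8_t *end, size_t n, uint64_t *out)
{
    const uint8_t *start = *p;
    uint64_t v = 0;
    if (skip_bytes(p, end, n) != 0) return -1;
    for (size_t i = 0; i < n; i++) v = (v << 8) | start[i];
    *out = v;
    return 0;
}

/* Skip one complete MessagePack object, including any nested children. */
static int skip_object(const uint8_t **p, const uint8_t *end)
{
    uint64_t pending = 1;                 /* objects still to be skipped */
    while (pending > 0) {
        uint64_t n = 0;
        int rc = 0;
        if (*p >= end) return -1;         /* truncated input */
        uint8_t b = *(*p)++;              /* analogous to the mpack_load_u8() call */
        pending--;
        if (b <= 0x7f || b >= 0xe0 || b == 0xc0 || b == 0xc2 || b == 0xc3) {
            continue;                     /* fixint, nil, bool: no payload */
        } else if (b <= 0x8f) {
            pending += 2u * (b & 0x0f);   /* fixmap: key and value per entry */
        } else if (b <= 0x9f) {
            pending += b & 0x0f;          /* fixarray */
        } else if (b <= 0xbf) {
            rc = skip_bytes(p, end, b & 0x1f);                /* fixstr */
        } else if (b >= 0xc4 && b <= 0xc6) {                  /* bin 8/16/32 */
            rc = read_be(p, end, (size_t)1 << (b - 0xc4), &n) || skip_bytes(p, end, n);
        } else if (b >= 0xc7 && b <= 0xc9) {                  /* ext 8/16/32: type byte + data */
            rc = read_be(p, end, (size_t)1 << (b - 0xc7), &n) || skip_bytes(p, end, n + 1);
        } else if (b == 0xca) {
            rc = skip_bytes(p, end, 4);                       /* float 32 */
        } else if (b == 0xcb) {
            rc = skip_bytes(p, end, 8);                       /* float 64 */
        } else if (b >= 0xcc && b <= 0xcf) {
            rc = skip_bytes(p, end, 1u << (b - 0xcc));        /* uint 8/16/32/64 */
        } else if (b >= 0xd0 && b <= 0xd3) {
            rc = skip_bytes(p, end, 1u << (b - 0xd0));        /* int 8/16/32/64 */
        } else if (b >= 0xd4 && b <= 0xd8) {
            rc = skip_bytes(p, end, (1u << (b - 0xd4)) + 1);  /* fixext: type byte + data */
        } else if (b >= 0xd9 && b <= 0xdb) {                  /* str 8/16/32 */
            rc = read_be(p, end, (size_t)1 << (b - 0xd9), &n) || skip_bytes(p, end, n);
        } else if (b == 0xdc || b == 0xdd) {                  /* array 16/32 */
            rc = read_be(p, end, (size_t)2 << (b - 0xdc), &n);
            pending += n;
        } else if (b == 0xde || b == 0xdf) {                  /* map 16/32 */
            rc = read_be(p, end, (size_t)2 << (b - 0xde), &n);
            pending += 2 * n;
        } else {
            return -1;                    /* 0xc1 is never valid */
        }
        if (rc != 0) return -1;
    }
    return 0;
}

/* Count top-level objects in [data, data + bytes); -1 on malformed input. */
static long count_records(const void *data, size_t bytes)
{
    const uint8_t *p = data;
    const uint8_t *end;
    long count = 0;
    if (p == NULL) return bytes == 0 ? 0 : -1;
    end = p + bytes;
    while (p < end) {
        if (skip_object(&p, end) != 0) return -1;
        count++;
    }
    return count;
}

/* Sample input: [1, {"k": true}], nil, "hi" (three top-level objects). */
static const uint8_t SAMPLE[] = { 0x92, 0x01, 0x81, 0xa1, 'k', 0xc3, 0xc0, 0xa2, 'h', 'i' };
```

With the sample buffer, `count_records(SAMPLE, sizeof SAMPLE)` returns 3, and a truncated prefix such as `count_records(SAMPLE, 2)` returns -1. Every read here is bounds-checked against `end`, yet a loop of this shape would still crash if the pages behind `data` were unmapped while it runs. Since disabling `storage.type filesystem` was reported above to avoid the crash, the lifetime of the chunk memory handed to the output plugin looks like a more promising place to investigate than the parsing logic itself. That remains a hypothesis; nothing in this thread confirms it.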