fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0
5.7k stars 1.55k forks source link

multiline parser doens't play well with the last record #8623

Open pmeier opened 5 months ago

pmeier commented 5 months ago

Bug Report

When using a multiline parser filter, the last record is

Files

fluent-bit.conf

[SERVICE]
    flush         1
    log_level     debug
    parsers_file  parsers.conf

[INPUT]
    Name            tail
    Path            input.log
    Read_from_Head  True
    Exit_On_Eof     True
    Parser          docker_no_time

[FILTER]
    Name                   multiline
    Match                  *
    multiline.parser       multiline_iso8601
    multiline.key_content  log

[OUTPUT]
    Name    stdout
    Match   *
    Format  json_lines

parsers.conf

[PARSER]
    Name         docker_no_time
    Format       json
    Time_Keep    off
    Time_Key     time
    Time_Format  %Y-%m-%dT%H:%M:%S.%LZ

# The multiline parsers below all follow the same matching scheme given a
# timestamp regex denoted as <regex> below:
# 1. We are looking for the timestamp anywhere in the message with
#    "/.*?<regex>.*/"
# 2. We keep appending new records as long as we don't find another timestamp with
#    "/^(?!<regex>).*$/"
#    or the flush_timeout runs out.

[MULTILINE_PARSER]
    name           multiline_iso8601
    type           regex
    flush_timeout  1000
    # regex \d{4}-\d{2}-\d{2}[tT]\d{2}:\d{2}:\d{2}(\.\d+)?
    # rules |   state name  | regex pattern                                              | next state
    # ------|---------------|------------------------------------------------------------|-----------
    rule      "start_state"   "/.*?\d{4}-\d{2}-\d{2}[tT]\d{2}:\d{2}:\d{2}(\.\d+)?.*/"      "cont"
    rule      "cont"          "/^(?!\d{4}-\d{2}-\d{2}[tT]\d{2}:\d{2}:\d{2}(\.\d+)?).*$/"   "cont"

input.log

{"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout","time":"2024-03-22T11:22:25.365469022Z"}
{"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\n","stream":"stdout","time":"2024-03-22T11:22:25.36547306Z"}
{"log":"Traceback (most recent call last):\n","stream":"stdout","time":"2024-03-22T11:22:25.365477148Z"}
{"log":"  File \"/tmp/script.py\", line 12, in \u003cmodule\u003e\n","stream":"stdout","time":"2024-03-22T11:22:25.365481105Z"}
{"log":"    raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\n","stream":"stdout","time":"2024-03-22T11:22:25.365485213Z"}
{"log":"RuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n","stream":"stdout","time":"2024-03-22T11:22:25.36548923Z"}
{"log":"\n","stream":"stdout","time":"2024-03-22T11:22:25.365493378Z"}
{"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout","time":"2024-03-22T11:22:25.365497256Z"}
{"log":"2024-03-22T11:22:24.563603 237758d3-450b-42c8-84d0-f67af080a4ef\n","stream":"stdout","time":"2024-03-22T11:22:25.365501343Z"}

Test Cases

No modification of the files above

{"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"}
{"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n  File \"/tmp/script.py\", line 12, in <module>\n    raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"}
{"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"}
Full debug output

``` [2024/03/22 15:21:26] [ info] Configuration: [2024/03/22 15:21:26] [ info] flush time | 1.000000 seconds [2024/03/22 15:21:26] [ info] grace | 5 seconds [2024/03/22 15:21:26] [ info] daemon | 0 [2024/03/22 15:21:26] [ info] ___________ [2024/03/22 15:21:26] [ info] inputs: [2024/03/22 15:21:26] [ info] tail [2024/03/22 15:21:26] [ info] ___________ [2024/03/22 15:21:26] [ info] filters: [2024/03/22 15:21:26] [ info] multiline.0 [2024/03/22 15:21:26] [ info] ___________ [2024/03/22 15:21:26] [ info] outputs: [2024/03/22 15:21:26] [ info] stdout.0 [2024/03/22 15:21:26] [ info] ___________ [2024/03/22 15:21:26] [ info] collectors: [2024/03/22 15:21:26] [ info] [fluent bit] version=2.2.2, commit=8acce5163a, pid=242693 [2024/03/22 15:21:26] [debug] [engine] coroutine stack size: 24576 bytes (24.0K) [2024/03/22 15:21:26] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128 [2024/03/22 15:21:26] [ info] [cmetrics] version=0.6.6 [2024/03/22 15:21:26] [ info] [ctraces ] version=0.4.0 [2024/03/22 15:21:26] [ info] [input:tail:tail.0] initializing [2024/03/22 15:21:26] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only) [2024/03/22 15:21:26] [debug] [tail:tail.0] created event channels: read=21 write=22 [2024/03/22 15:21:26] [debug] [input:tail:tail.0] flb_tail_fs_inotify_init() initializing inotify tail input [2024/03/22 15:21:26] [debug] [input:tail:tail.0] inotify watch fd=27 [2024/03/22 15:21:26] [debug] [input:tail:tail.0] scanning path input.log [2024/03/22 15:21:26] [debug] [input:tail:tail.0] inode=433868 with offset=0 appended as input.log [2024/03/22 15:21:26] [debug] [input:tail:tail.0] scan_glob add(): input.log, inode 433868 [2024/03/22 15:21:26] [debug] [input:tail:tail.0] 1 new files found on path 'input.log' [2024/03/22 15:21:26] [ info] [filter:multiline:multiline.0] created emitter: emitter_for_multiline.0 [2024/03/22 15:21:26] [ info] [input:emitter:emitter_for_multiline.0] initializing [2024/03/22 15:21:26] [ info] [input:emitter:emitter_for_multiline.0] storage_strategy='memory' (memory only) [2024/03/22 15:21:26] [debug] [emitter:emitter_for_multiline.0] created event channels: read=29 write=30 [2024/03/22 15:21:26] [debug] [stdout:stdout.0] created event channels: read=32 write=33 [2024/03/22 15:21:26] [debug] [router] match rule tail.0:stdout.0 [2024/03/22 15:21:26] [debug] [router] match rule emitter.1:stdout.0 [2024/03/22 15:21:26] [ info] [sp] stream processor started [2024/03/22 15:21:26] [ info] [output:stdout:stdout.0] worker #0 started [2024/03/22 15:21:26] [ info] [filter:multiline:multiline.0] created new multiline stream for tail.0_tail.0 [2024/03/22 15:21:26] [debug] [filter:multiline:multiline.0] Created new ML stream for tail.0_tail.0 [2024/03/22 15:21:26] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:21:26] [debug] [input chunk] update output instances with new chunk size diff=297, records=1, input=emitter_for_multiline.0 [2024/03/22 15:21:26] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:21:26] [debug] [input:tail:tail.0] [static files] processed 1.1K [2024/03/22 15:21:26] [ info] [input:tail:tail.0] inode=433868 file=input.log ended, stop [2024/03/22 15:21:26] [debug] [input:tail:tail.0] inode=433868 file=input.log promote to TAIL_EVENT [2024/03/22 15:21:26] [ info] [input:tail:tail.0] inotify_fs_add(): inode=433868 watch_fd=1 name=input.log [2024/03/22 15:21:26] [debug] [input:tail:tail.0] [static files] processed 0b, done [2024/03/22 15:21:26] [debug] [task] created task=0x7a9aa4047810 id=0 OK [2024/03/22 15:21:26] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0 [2024/03/22 15:21:26] [ warn] [engine] service will shutdown in max 5 seconds [2024/03/22 15:21:26] [ info] [input] pausing tail.0 {"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"} {"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n File \"/tmp/script.py\", line 12, in \n raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"} {"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"} [2024/03/22 15:21:26] [debug] [out flush] cb_destroy coro_id=0 [2024/03/22 15:21:26] [debug] [task] destroy task=0x7a9aa4047810 (task_id=0) [2024/03/22 15:21:27] [ info] [engine] service has stopped (0 pending tasks) [2024/03/22 15:21:27] [ info] [input] pausing tail.0 [2024/03/22 15:21:27] [ info] [output:stdout:stdout.0] thread worker #0 stopping... [2024/03/22 15:21:27] [ info] [output:stdout:stdout.0] thread worker #0 stopped [2024/03/22 15:21:27] [debug] [input:tail:tail.0] inode=433868 removing file name input.log [2024/03/22 15:21:27] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=433868 watch_fd=1 ```

Exit_On_Eof False in fluent-bit.conf

{"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"}
{"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n  File \"/tmp/script.py\", line 12, in <module>\n    raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"}
{"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"}
[2024/03/22 15:22:23] [debug] [out flush] cb_destroy coro_id=0
[2024/03/22 15:22:23] [debug] [task] destroy task=0x73ded4047810 (task_id=0)
[2024/03/22 15:22:26] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0
[2024/03/22 15:22:26] [debug] [task] created task=0x73ded4047a80 id=0 OK
[2024/03/22 15:22:26] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
{"date":1711106545.365501,"log":"2024-03-22T11:22:24.563603 237758d3-450b-42c8-84d0-f67af080a4ef\n","stream":"stdout"}
Full debug output

``` [2024/03/22 15:22:23] [ info] Configuration: [2024/03/22 15:22:23] [ info] flush time | 1.000000 seconds [2024/03/22 15:22:23] [ info] grace | 5 seconds [2024/03/22 15:22:23] [ info] daemon | 0 [2024/03/22 15:22:23] [ info] ___________ [2024/03/22 15:22:23] [ info] inputs: [2024/03/22 15:22:23] [ info] tail [2024/03/22 15:22:23] [ info] ___________ [2024/03/22 15:22:23] [ info] filters: [2024/03/22 15:22:23] [ info] multiline.0 [2024/03/22 15:22:23] [ info] ___________ [2024/03/22 15:22:23] [ info] outputs: [2024/03/22 15:22:23] [ info] stdout.0 [2024/03/22 15:22:23] [ info] ___________ [2024/03/22 15:22:23] [ info] collectors: [2024/03/22 15:22:23] [ info] [fluent bit] version=2.2.2, commit=8acce5163a, pid=243614 [2024/03/22 15:22:23] [debug] [engine] coroutine stack size: 24576 bytes (24.0K) [2024/03/22 15:22:23] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128 [2024/03/22 15:22:23] [ info] [cmetrics] version=0.6.6 [2024/03/22 15:22:23] [ info] [ctraces ] version=0.4.0 [2024/03/22 15:22:23] [ info] [input:tail:tail.0] initializing [2024/03/22 15:22:23] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only) [2024/03/22 15:22:23] [debug] [tail:tail.0] created event channels: read=21 write=22 [2024/03/22 15:22:23] [debug] [input:tail:tail.0] flb_tail_fs_inotify_init() initializing inotify tail input [2024/03/22 15:22:23] [debug] [input:tail:tail.0] inotify watch fd=27 [2024/03/22 15:22:23] [debug] [input:tail:tail.0] scanning path input.log [2024/03/22 15:22:23] [debug] [input:tail:tail.0] inode=433868 with offset=0 appended as input.log [2024/03/22 15:22:23] [debug] [input:tail:tail.0] scan_glob add(): input.log, inode 433868 [2024/03/22 15:22:23] [debug] [input:tail:tail.0] 1 new files found on path 'input.log' [2024/03/22 15:22:23] [ info] [filter:multiline:multiline.0] created emitter: emitter_for_multiline.0 [2024/03/22 15:22:23] [ info] [input:emitter:emitter_for_multiline.0] initializing [2024/03/22 15:22:23] [ info] [input:emitter:emitter_for_multiline.0] storage_strategy='memory' (memory only) [2024/03/22 15:22:23] [debug] [emitter:emitter_for_multiline.0] created event channels: read=29 write=30 [2024/03/22 15:22:23] [debug] [stdout:stdout.0] created event channels: read=32 write=33 [2024/03/22 15:22:23] [debug] [router] match rule tail.0:stdout.0 [2024/03/22 15:22:23] [debug] [router] match rule emitter.1:stdout.0 [2024/03/22 15:22:23] [ info] [sp] stream processor started [2024/03/22 15:22:23] [ info] [output:stdout:stdout.0] worker #0 started [2024/03/22 15:22:23] [ info] [filter:multiline:multiline.0] created new multiline stream for tail.0_tail.0 [2024/03/22 15:22:23] [debug] [filter:multiline:multiline.0] Created new ML stream for tail.0_tail.0 [2024/03/22 15:22:23] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:22:23] [debug] [input chunk] update output instances with new chunk size diff=297, records=1, input=emitter_for_multiline.0 [2024/03/22 15:22:23] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:22:23] [debug] [input:tail:tail.0] [static files] processed 1.1K [2024/03/22 15:22:23] [debug] [input:tail:tail.0] inode=433868 file=input.log promote to TAIL_EVENT [2024/03/22 15:22:23] [ info] [input:tail:tail.0] inotify_fs_add(): inode=433868 watch_fd=1 name=input.log [2024/03/22 15:22:23] [debug] [input:tail:tail.0] [static files] processed 0b, done [2024/03/22 15:22:23] [debug] [task] created task=0x73ded4047810 id=0 OK [2024/03/22 15:22:23] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0 {"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"} {"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n File \"/tmp/script.py\", line 12, in \n raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"} {"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"} [2024/03/22 15:22:23] [debug] [out flush] cb_destroy coro_id=0 [2024/03/22 15:22:23] [debug] [task] destroy task=0x73ded4047810 (task_id=0) [2024/03/22 15:22:26] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:22:26] [debug] [task] created task=0x73ded4047a80 id=0 OK [2024/03/22 15:22:26] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0 {"date":1711106545.365501,"log":"2024-03-22T11:22:24.563603 237758d3-450b-42c8-84d0-f67af080a4ef\n","stream":"stdout"} [2024/03/22 15:22:26] [debug] [out flush] cb_destroy coro_id=1 [2024/03/22 15:22:26] [debug] [task] destroy task=0x73ded4047a80 (task_id=0) ^C[2024/03/22 15:22:36] [engine] caught signal (SIGINT) [2024/03/22 15:22:36] [ warn] [engine] service will shutdown in max 5 seconds [2024/03/22 15:22:36] [ info] [input] pausing tail.0 [2024/03/22 15:22:37] [ info] [engine] service has stopped (0 pending tasks) [2024/03/22 15:22:37] [ info] [input] pausing tail.0 [2024/03/22 15:22:37] [ info] [output:stdout:stdout.0] thread worker #0 stopping... [2024/03/22 15:22:37] [ info] [output:stdout:stdout.0] thread worker #0 stopped [2024/03/22 15:22:37] [debug] [input:tail:tail.0] inode=433868 removing file name input.log [2024/03/22 15:22:37] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=433868 watch_fd=1 ```

Duplicate all lines in input.log

{"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"}
{"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n  File \"/tmp/script.py\", line 12, in <module>\n    raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"}
{"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"}
{"date":1711106545.365501,"log":"2024-03-22T11:22:24.563603 237758d3-450b-42c8-84d0-f67af080a4ef\n","stream":"stdout"}
{"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"}
{"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n  File \"/tmp/script.py\", line 12, in <module>\n    raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"}
{"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"}
Full debug output

``` [2024/03/22 15:25:25] [ info] Configuration: [2024/03/22 15:25:25] [ info] flush time | 1.000000 seconds [2024/03/22 15:25:25] [ info] grace | 5 seconds [2024/03/22 15:25:25] [ info] daemon | 0 [2024/03/22 15:25:25] [ info] ___________ [2024/03/22 15:25:25] [ info] inputs: [2024/03/22 15:25:25] [ info] tail [2024/03/22 15:25:25] [ info] ___________ [2024/03/22 15:25:25] [ info] filters: [2024/03/22 15:25:25] [ info] multiline.0 [2024/03/22 15:25:25] [ info] ___________ [2024/03/22 15:25:25] [ info] outputs: [2024/03/22 15:25:25] [ info] stdout.0 [2024/03/22 15:25:25] [ info] ___________ [2024/03/22 15:25:25] [ info] collectors: [2024/03/22 15:25:25] [ info] [fluent bit] version=2.2.2, commit=8acce5163a, pid=246652 [2024/03/22 15:25:25] [debug] [engine] coroutine stack size: 24576 bytes (24.0K) [2024/03/22 15:25:25] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128 [2024/03/22 15:25:25] [ info] [cmetrics] version=0.6.6 [2024/03/22 15:25:25] [ info] [ctraces ] version=0.4.0 [2024/03/22 15:25:25] [ info] [input:tail:tail.0] initializing [2024/03/22 15:25:25] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only) [2024/03/22 15:25:25] [debug] [tail:tail.0] created event channels: read=21 write=22 [2024/03/22 15:25:25] [debug] [input:tail:tail.0] flb_tail_fs_inotify_init() initializing inotify tail input [2024/03/22 15:25:25] [debug] [input:tail:tail.0] inotify watch fd=27 [2024/03/22 15:25:25] [debug] [input:tail:tail.0] scanning path input.log [2024/03/22 15:25:25] [debug] [input:tail:tail.0] inode=433868 with offset=0 appended as input.log [2024/03/22 15:25:25] [debug] [input:tail:tail.0] scan_glob add(): input.log, inode 433868 [2024/03/22 15:25:25] [debug] [input:tail:tail.0] 1 new files found on path 'input.log' [2024/03/22 15:25:25] [ info] [filter:multiline:multiline.0] created emitter: emitter_for_multiline.0 [2024/03/22 15:25:25] [ info] [input:emitter:emitter_for_multiline.0] initializing [2024/03/22 15:25:25] [ info] [input:emitter:emitter_for_multiline.0] storage_strategy='memory' (memory only) [2024/03/22 15:25:25] [debug] [emitter:emitter_for_multiline.0] created event channels: read=29 write=30 [2024/03/22 15:25:25] [debug] [stdout:stdout.0] created event channels: read=32 write=33 [2024/03/22 15:25:25] [debug] [router] match rule tail.0:stdout.0 [2024/03/22 15:25:25] [debug] [router] match rule emitter.1:stdout.0 [2024/03/22 15:25:25] [ info] [sp] stream processor started [2024/03/22 15:25:25] [ info] [output:stdout:stdout.0] worker #0 started [2024/03/22 15:25:25] [ info] [filter:multiline:multiline.0] created new multiline stream for tail.0_tail.0 [2024/03/22 15:25:25] [debug] [filter:multiline:multiline.0] Created new ML stream for tail.0_tail.0 [2024/03/22 15:25:25] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:25:25] [debug] [input chunk] update output instances with new chunk size diff=297, records=1, input=emitter_for_multiline.0 [2024/03/22 15:25:25] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:25:25] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:25:25] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:25:25] [debug] [input chunk] update output instances with new chunk size diff=297, records=1, input=emitter_for_multiline.0 [2024/03/22 15:25:25] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:25:25] [debug] [input:tail:tail.0] [static files] processed 2.1K [2024/03/22 15:25:25] [ info] [input:tail:tail.0] inode=433868 file=input.log ended, stop [2024/03/22 15:25:25] [debug] [input:tail:tail.0] inode=433868 file=input.log promote to TAIL_EVENT [2024/03/22 15:25:25] [ info] [input:tail:tail.0] inotify_fs_add(): inode=433868 watch_fd=1 name=input.log [2024/03/22 15:25:25] [debug] [input:tail:tail.0] [static files] processed 0b, done [2024/03/22 15:25:25] [debug] [task] created task=0x749f4c047950 id=0 OK [2024/03/22 15:25:25] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0 [2024/03/22 15:25:25] [ warn] [engine] service will shutdown in max 5 seconds [2024/03/22 15:25:25] [ info] [input] pausing tail.0 {"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"} {"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n File \"/tmp/script.py\", line 12, in \n raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"} {"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"} {"date":1711106545.365501,"log":"2024-03-22T11:22:24.563603 237758d3-450b-42c8-84d0-f67af080a4ef\n","stream":"stdout"} {"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"} {"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n File \"/tmp/script.py\", line 12, in \n raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"} {"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"} [2024/03/22 15:25:25] [debug] [out flush] cb_destroy coro_id=0 [2024/03/22 15:25:25] [debug] [task] destroy task=0x749f4c047950 (task_id=0) [2024/03/22 15:25:25] [ info] [engine] service has stopped (0 pending tasks) [2024/03/22 15:25:25] [ info] [input] pausing tail.0 [2024/03/22 15:25:25] [ info] [output:stdout:stdout.0] thread worker #0 stopping... [2024/03/22 15:25:25] [ info] [output:stdout:stdout.0] thread worker #0 stopped [2024/03/22 15:25:25] [debug] [input:tail:tail.0] inode=433868 removing file name input.log [2024/03/22 15:25:25] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=433868 watch_fd=1 ```

Exit_On_Eof False in fluent-bit.conf and duplicate all lines in input.log

{"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"}
{"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n  File \"/tmp/script.py\", line 12, in <module>\n    raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"}
{"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"}
{"date":1711106545.365501,"log":"2024-03-22T11:22:24.563603 237758d3-450b-42c8-84d0-f67af080a4ef\n","stream":"stdout"}
{"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"}
{"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n  File \"/tmp/script.py\", line 12, in <module>\n    raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"}
{"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"}
[2024/03/22 15:28:12] [debug] [out flush] cb_destroy coro_id=0
[2024/03/22 15:28:12] [debug] [task] destroy task=0x7497c4047950 (task_id=0)
[2024/03/22 15:28:15] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0
[2024/03/22 15:28:15] [debug] [task] created task=0x7497c4047b20 id=0 OK
[2024/03/22 15:28:15] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
{"date":1711106545.365501,"log":"2024-03-22T11:22:24.563603 237758d3-450b-42c8-84d0-f67af080a4ef\n","stream":"stdout"}
Full debug output

``` [2024/03/22 15:28:12] [ info] Configuration: [2024/03/22 15:28:12] [ info] flush time | 1.000000 seconds [2024/03/22 15:28:12] [ info] grace | 5 seconds [2024/03/22 15:28:12] [ info] daemon | 0 [2024/03/22 15:28:12] [ info] ___________ [2024/03/22 15:28:12] [ info] inputs: [2024/03/22 15:28:12] [ info] tail [2024/03/22 15:28:12] [ info] ___________ [2024/03/22 15:28:12] [ info] filters: [2024/03/22 15:28:12] [ info] multiline.0 [2024/03/22 15:28:12] [ info] ___________ [2024/03/22 15:28:12] [ info] outputs: [2024/03/22 15:28:12] [ info] stdout.0 [2024/03/22 15:28:12] [ info] ___________ [2024/03/22 15:28:12] [ info] collectors: [2024/03/22 15:28:12] [ info] [fluent bit] version=2.2.2, commit=8acce5163a, pid=249374 [2024/03/22 15:28:12] [debug] [engine] coroutine stack size: 24576 bytes (24.0K) [2024/03/22 15:28:12] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128 [2024/03/22 15:28:12] [ info] [cmetrics] version=0.6.6 [2024/03/22 15:28:12] [ info] [ctraces ] version=0.4.0 [2024/03/22 15:28:12] [ info] [input:tail:tail.0] initializing [2024/03/22 15:28:12] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only) [2024/03/22 15:28:12] [debug] [tail:tail.0] created event channels: read=21 write=22 [2024/03/22 15:28:12] [debug] [input:tail:tail.0] flb_tail_fs_inotify_init() initializing inotify tail input [2024/03/22 15:28:12] [debug] [input:tail:tail.0] inotify watch fd=27 [2024/03/22 15:28:12] [debug] [input:tail:tail.0] scanning path input.log [2024/03/22 15:28:12] [debug] [input:tail:tail.0] inode=433868 with offset=0 appended as input.log [2024/03/22 15:28:12] [debug] [input:tail:tail.0] scan_glob add(): input.log, inode 433868 [2024/03/22 15:28:12] [debug] [input:tail:tail.0] 1 new files found on path 'input.log' [2024/03/22 15:28:12] [ info] [filter:multiline:multiline.0] created emitter: emitter_for_multiline.0 [2024/03/22 15:28:12] [ info] [input:emitter:emitter_for_multiline.0] initializing [2024/03/22 15:28:12] [ info] [input:emitter:emitter_for_multiline.0] storage_strategy='memory' (memory only) [2024/03/22 15:28:12] [debug] [emitter:emitter_for_multiline.0] created event channels: read=29 write=30 [2024/03/22 15:28:12] [debug] [stdout:stdout.0] created event channels: read=32 write=33 [2024/03/22 15:28:12] [debug] [router] match rule tail.0:stdout.0 [2024/03/22 15:28:12] [debug] [router] match rule emitter.1:stdout.0 [2024/03/22 15:28:12] [ info] [sp] stream processor started [2024/03/22 15:28:12] [ info] [output:stdout:stdout.0] worker #0 started [2024/03/22 15:28:12] [ info] [filter:multiline:multiline.0] created new multiline stream for tail.0_tail.0 [2024/03/22 15:28:12] [debug] [filter:multiline:multiline.0] Created new ML stream for tail.0_tail.0 [2024/03/22 15:28:12] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:28:12] [debug] [input chunk] update output instances with new chunk size diff=297, records=1, input=emitter_for_multiline.0 [2024/03/22 15:28:12] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:28:12] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:28:12] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:28:12] [debug] [input chunk] update output instances with new chunk size diff=297, records=1, input=emitter_for_multiline.0 [2024/03/22 15:28:12] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:28:12] [debug] [input:tail:tail.0] [static files] processed 2.1K [2024/03/22 15:28:12] [debug] [input:tail:tail.0] inode=433868 file=input.log promote to TAIL_EVENT [2024/03/22 15:28:12] [ info] [input:tail:tail.0] inotify_fs_add(): inode=433868 watch_fd=1 name=input.log [2024/03/22 15:28:12] [debug] [input:tail:tail.0] [static files] processed 0b, done [2024/03/22 15:28:12] [debug] [task] created task=0x7497c4047950 id=0 OK [2024/03/22 15:28:12] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0 {"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"} {"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n File \"/tmp/script.py\", line 12, in \n raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"} {"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"} {"date":1711106545.365501,"log":"2024-03-22T11:22:24.563603 237758d3-450b-42c8-84d0-f67af080a4ef\n","stream":"stdout"} {"date":1711106545.365469,"log":"2024-03-22T11:22:23.762806 1cd3e764-4fe7-4362-81cf-e25a06bd8e65\n","stream":"stdout"} {"date":1711106545.365473,"log":"2024-03-22T11:22:23.962954 d07b4833-e5ef-4a12-a316-216f3bf40471\nTraceback (most recent call last):\n File \"/tmp/script.py\", line 12, in \n raise RuntimeError(f\"Oops! {uuid.uuid4()}\")\nRuntimeError: Oops! a84e2b0e-fb1b-4483-93bd-29cb62e4c81d\n\n","stream":"stdout"} {"date":1711106545.365497,"log":"2024-03-22T11:22:24.363447 ade91ead-9038-4508-a2d3-a572fae1492a\n","stream":"stdout"} [2024/03/22 15:28:12] [debug] [out flush] cb_destroy coro_id=0 [2024/03/22 15:28:12] [debug] [task] destroy task=0x7497c4047950 (task_id=0) [2024/03/22 15:28:15] [debug] [input chunk] update output instances with new chunk size diff=110, records=1, input=emitter_for_multiline.0 [2024/03/22 15:28:15] [debug] [task] created task=0x7497c4047b20 id=0 OK [2024/03/22 15:28:15] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0 {"date":1711106545.365501,"log":"2024-03-22T11:22:24.563603 237758d3-450b-42c8-84d0-f67af080a4ef\n","stream":"stdout"} [2024/03/22 15:28:15] [debug] [out flush] cb_destroy coro_id=1 [2024/03/22 15:28:15] [debug] [task] destroy task=0x7497c4047b20 (task_id=0) ^C[2024/03/22 15:28:39] [engine] caught signal (SIGINT) [2024/03/22 15:28:39] [ warn] [engine] service will shutdown in max 5 seconds [2024/03/22 15:28:39] [ info] [input] pausing tail.0 [2024/03/22 15:28:40] [ info] [engine] service has stopped (0 pending tasks) [2024/03/22 15:28:40] [ info] [input] pausing tail.0 [2024/03/22 15:28:40] [ info] [output:stdout:stdout.0] thread worker #0 stopping... [2024/03/22 15:28:40] [ info] [output:stdout:stdout.0] thread worker #0 stopped [2024/03/22 15:28:40] [debug] [input:tail:tail.0] inode=433868 removing file name input.log [2024/03/22 15:28:40] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=433868 watch_fd=1 ```

Conclusion

Expected behavior

Your Environment

Version used: 2.2.2 Operating System and version: Arch and Ubuntu

Additional context

This might be related to #3926 and #8353.

github-actions[bot] commented 2 months ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

pmeier commented 2 months ago

Still buggy in 3.0.7.

pmeier commented 1 month ago

Potential duplicate of #4918.