fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)
https://www.fluentd.org
Apache License 2.0

Fluentd in_tail "unreadable" file causes "following tail of <file>" to stop and no logs pushed #3614

Closed. salavessa closed this issue 8 months ago.

salavessa commented 2 years ago

Describe the bug

After a warning that a file was "unreadable" (most likely due to rotation), no more logs were pushed (in_tail + pos_file). Reloading the config or restarting fluentd fixes the issue. All other files being tracked continued to work as expected.

To Reproduce

Not able to reproduce at will.

Expected behavior

Logs continue to be pushed as usual after file rotation, once fluentd recovers from the temporarily "unreadable" file.

Your Environment

- Fluentd version: 1.14.4
- TD Agent version: 4.2.0
- Operating system: Ubuntu 20.04.3 LTS
- Kernel version: 5.11.0-1022-aws

Your Configuration

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/es-containers.log.pos
  tag kubernetes.*
  <parse>
    @type regexp
    expression /^(?<time>[^ ]+) (?<stream>[^ ]+) (?<logtag>[^ ]+) (?<log>.+)$/
    time_key time
    time_format %Y-%m-%dT%H:%M:%S.%N%z
  </parse>
  read_from_head true
</source>

Your Error Log

Relevant log entries with some context. When "detected rotation of ..." isn't followed by a "following tail of ...", the log file's contents are no longer processed/pushed:

2022-02-01 01:26:33 +0000 [info]: #0 detected rotation of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log; waiting 5 seconds
2022-02-01 01:26:33 +0000 [info]: #0 following tail of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log
2022-02-01 01:32:53 +0000 [warn]: #0 /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log unreadable. It is excluded and would be examined next time.
2022-02-01 01:32:54 +0000 [info]: #0 detected rotation of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log; waiting 5 seconds
2022-02-01 01:32:54 +0000 [info]: #0 following tail of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log
2022-02-01 01:38:04 +0000 [info]: #0 detected rotation of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log; waiting 5 seconds
2022-02-01 01:44:44 +0000 [info]: #0 detected rotation of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log; waiting 5 seconds
2022-02-01 01:53:15 +0000 [info]: #0 detected rotation of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log; waiting 5 seconds
[...]
---- after issuing a config reload (kill -SIGUSR2 <pid>) it starts to work fine again, i.e. "following tail of ..." appears
2022-02-01 11:36:19 +0000 [info]: Reloading new config
2022-02-01 11:36:19 +0000 [info]: using configuration file: <ROOT>
[...]
2022-02-01 11:36:19 +0000 [info]: shutting down input plugin type=:tail plugin_id="object:c6c0"
[...]
2022-02-01 11:36:19 +0000 [info]: adding source type="tail"
2022-02-01 11:36:19 +0000 [info]: #0 shutting down fluentd worker worker=0
2022-02-01 11:36:19 +0000 [info]: #0 shutting down input plugin type=:tail plugin_id="object:c6c0"
[...]
2022-02-01 11:36:20 +0000 [info]: #0 restart fluentd worker worker=0
---- the entry below seems to be related to the actual underlying issue... the ruby object which stopped pushing logs has now been terminated as a new one was created
2022-02-01 11:36:20 +0000 [warn]: #0 /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log already exists. use latest one: deleted #<struct Fluent::Plugin::TailInput::Entry path="/var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log", pos=10740032, ino=1797715, seek=1530>
2022-02-01 11:36:20 +0000 [info]: #0 following tail of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log
2022-02-01 11:37:30 +0000 [info]: #0 detected rotation of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log; waiting 5 seconds
2022-02-01 11:37:30 +0000 [info]: #0 following tail of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log
2022-02-01 11:43:20 +0000 [info]: #0 detected rotation of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log; waiting 5 seconds
2022-02-01 11:43:20 +0000 [info]: #0 following tail of /var/log/containers/gateway-mtcw6_atom_pcms-gateway-e54ea9ee115f1127682b0d92e37404e4b1693c14edeea93e485f70b3110eecfc.log

Additional context

This issue seems to be related to #3586, but unfortunately I didn't check the pos file while the issue was happening, so I can't tell whether it contained unexpected values for the failing file.

mizunashi-mana commented 2 years ago

We have encountered the same problem twice. Last month we upgraded fluentd from 1.13.3 (td-agent 4.2.0) to 1.14.3 (td-agent 4.3.0), and we have hit the problem since then.

In our case, we are tracking more than 3 log files with the tail plugin, and only the file with the highest log throughput has this problem. According to the metrics produced by the prometheus plugin, the file position advances by at least 400000/sec and rotation happens at least once every 10 minutes.

The input source is set as follows:

<source>
  @type tail
  format ...
  path /var/log/server.log
  pos_file /var/log/td-agent/server.log.pos
  tag input.server.log
</source>

I will also share our investigation notes from when the problem occurred. First, fluentd suddenly, without warning, stops tracking newly rotated files and only logs the following to stdout:

detected rotation of /var/log/server.log; waiting 5 seconds

Normally, this would have been followed by an announcement that the new file is being tracked, like this:

following tail of /var/log/server.log

However, we have not seen this announcement at all after the problem occurs. Also, when I view the file descriptors that fluentd is tracking, I see the following:

$ sudo ls "/proc/$(pgrep -f 'td-agent.*--under-supervisor')/fd" -l | grep server
lr-x------ 1 td-agent td-agent 64 Feb 5 01:41 49 -> /var/log/server.3.log

This is the rotated file, and we see that fluentd is still tracking the old file, not the new one. Also, looking at the position file, we see the following:

$ sudo cat /var/log/td-agent/server.log.pos 
/var/log/server.log 000000002001f2d0 000000003a000042
/var/log/server.log 0000000000000000 0000000000000000
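
For reference, each pos_file line is the watched path followed by the byte offset and the inode, both in 16-digit hex, so the second entry above decodes to position 0 / inode 0. A small standalone Ruby snippet (not fluentd code, just a decoding aid; the path is the pos file from above) makes such entries easier to inspect:

# Decode Fluentd pos_file lines into decimal offset/inode for easier inspection.
File.foreach("/var/log/td-agent/server.log.pos") do |line|
  path, pos_hex, ino_hex = line.split
  next unless path && pos_hex && ino_hex
  puts "#{path} pos=#{pos_hex.to_i(16)} inode=#{ino_hex.to_i(16)}"
end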

As far as we can tell, something triggers fluentd to stop tracking new files altogether during rotation. We have about 300 servers running 24/7 and have only seen two occurrences in the past month, so the problem seems to be quite rare. However, since we hadn't had any problems with fluentd 1.13.3 for 6 months, it seemed likely that there was some kind of regression between fluentd 1.13.3 and 1.14.3.

We are planning to downgrade fluentd.

ashie commented 2 years ago

Memo: from v1.14.0 to v1.14.3, the following changes were made to in_tail:

mizunashi-mana commented 2 years ago

We downgraded td-agent 4.3.0 (fluentd v1.14.3) to td-agent 4.2.0 (fluentd v1.13.3) and still have the problem. As a result of this issue we broadened our monitoring targets from 200 servers (approx. 400 files) to 500 servers (approx. 1500 files). We then found 2 servers (2 files) with this issue a week after the td-agent downgrade. The upgrade from fluentd 1.13.3 to 1.14.3 does not seem to be the cause, sorry.

It seemed that some of our log files were affected and some were not. The differences are as follows.

We are planning to change our rotation policy and retry the td-agent upgrade, and we will post updates if we have any.

mizunashi-mana commented 2 years ago

We enabled debug logging with fluentd 1.13.3 and encountered this issue again. The state was the same:

Also, fluentd reported Skip update_watcher because watcher has been already updated by other inotify event following detected rotation of /var/log/server.log; waiting 5 seconds. The message seems to be emitted by https://github.com/fluent/fluentd/blob/v1.13.3/lib/fluent/plugin/in_tail.rb#L482. It seems to be strongly related to the position file having two entries.

We searched for other debug logs related to this issue but found none. fluentd reported no other debug logs, only pairs of detected rotation of /var/log/server.log; waiting 5 seconds and following tail of /var/log/server.log.

ashie commented 2 years ago

Thanks for your report!

Also, fluentd reported Skip update_watcher because watcher has been already updated by other inotify event following detected rotation of /var/log/server.log; waiting 5 seconds. The message seems to be emitted by https://github.com/fluent/fluentd/blob/v1.13.3/lib/fluent/plugin/in_tail.rb#L482. It seems to be strongly related to the position file having two entries.

To tell the truth, I suspected it, but I couldn't confirm it because I can't reproduce it yet. Your report is very helpful.

Although I'm not yet sure of the mechanism behind this issue, you might be able to avoid it by disabling the stat watcher (enable_stat_watcher false).
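
For example, a sketch of that workaround applied to the source block from the original report (just the reporter's config with the stat watcher disabled, not an official fix):

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/es-containers.log.pos
  tag kubernetes.*
  read_from_head true
  # disable the inotify-based stat watcher; in_tail then relies on the timer-based watcher
  enable_stat_watcher false
  <parse>
    @type regexp
    expression /^(?<time>[^ ]+) (?<stream>[^ ]+) (?<logtag>[^ ]+) (?<log>.+)$/
    time_key time
    time_format %Y-%m-%dT%H:%M:%S.%N%z
  </parse>
</source>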

ashie commented 2 years ago

Similar reports:

| issue | date | user | fluentd version | read_from_head | follow_inodes | workaround |
| --- | --- | --- | --- | --- | --- | --- |
| #3586 | 2021-12-28 | @jasminehung | v1.14.3 | true | true | pos_file_compaction_interval 3m |
| #3614 | 2022-02-02 | @salavessa | v1.14.4 | true | false | enable_stat_watcher false |
| | 2022-02-14 | @mizunashi-mana | v1.14.3, v1.13.3 | false | false | |
| #3661 | 2022-03-08 | @shenmuxiaosen | v1.13.3 | false | false | v1.12.4 was stable |
| | 2022-07-08 | @ljuaneda | ? | ? | ? | v1.11.5 was stable |
| #3800 | 2022-06-27 | @bjg2 | v1.13.3 | true | false | |
| #3838 | 2022-08-01 | @bertpassek | v1.15.1 | ? | ? | |
ashie commented 2 years ago

From #3586

Your Error Log

There's no error log when this happens, but there were some info logs with special behavior:
2021-12-27 07:37:52 +0000 [info]: #0 detected rotation of /var/log/containers/myapp-765f4884fb-qfzhw_myapp-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log; waiting 5 seconds
2021-12-27 07:37:52 +0000 [info]: #0 following tail of /var/log/containers/myapp-765f4884fb-qfzhw_myapp-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log
---- at this moment myapp logs stop being sent to elasticsearch (no more "following tail" info log for /var/log/containers/myapp) ------
2021-12-27 07:38:43 +0000 [info]: #0 detected rotation of /var/log/containers/simm-natseventhandler-p2-765f4884fb-qfzhw_simm-a_simm-natseventhandler-p2-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log; waiting 5 seconds
2021-12-27 07:39:47 +0000 [info]: #0 detected rotation of /var/log/containers/simm-natseventhandler-p2-765f4884fb-qfzhw_simm-a_simm-natseventhandler-p2-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log; waiting 5 seconds
2021-12-27 07:40:51 +0000 [info]: #0 detected rotation of /var/log/containers/simm-natseventhandler-p2-765f4884fb-qfzhw_simm-a_simm-natseventhandler-p2-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log; waiting 5 seconds

In the pos file we found the records for myapp is strange:
1. There are duplicated rows tracking the myapp container (will fluentd use the last or the first line? or both? or none?). For the other hundreds of normal pods, there's only 1 row per container.
2. Records with position "ffffffffffffffff 0000000000000000" keep appearing. Does this mean the pos_file is not able to track the container?

Results of querying "myapp" from the pos file: 
Line 241: /var/log/containers/myapp-765f4884fb-qfzhw_myapp-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log 0000000000bb1725 00000000000a0142
Line 260: /var/log/containers/myapp-765f4884fb-qfzhw_myapp-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log ffffffffffffffff 0000000000000000
Line 292: /var/log/containers/myapp-765f4884fb-qfzhw_myapp-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log 0000000000bf8c99 00000000000a0027
Line 302: /var/log/containers/myapp-765f4884fb-qfzhw_myapp-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log ffffffffffffffff 0000000000000000
Line 561: /var/log/containers/myapp-765f4884fb-qfzhw_myapp-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log 0000000000afaeea 00000000000a001c
Line 785: /var/log/containers/myapp-765f4884fb-qfzhw_myapp-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log ffffffffffffffff 0000000000000000
Line 804: /var/log/containers/myapp-765f4884fb-qfzhw_myapp-07e1fdd304cd14fa49ad23e034ef806d16c9b5b08c84fa4f269594fc176464a8.log 000000000016e1a8 00000000000a0019

Additional context

Adding pos_file_compaction_interval and setting it to 3m seems to fix the missing-log problem, but it still doesn't explain the root cause.

ashie commented 2 years ago

There are several similar reports, and most of them say:

When tailing is stopped,

202x-xx-xx xx:xx:xx +0000 [info]: #0 detected rotation of /var/log/containersxxx.log; waiting 5 seconds

was observed but

202x-xx-xx xx:xx:xx +0000 [info]: #0 following tail of /var/log/containers/xxx.log

wasn't observed.

In addition, a user reports that the following debug message is observed when the log level is debug:

202x-xx-xx xx:xx:xx +0000 [debug]: #0 Skip update_watcher because watcher has been already updated by other inotify event

So, it seems to be caused by the following code: https://github.com/fluent/fluentd/blob/438a82aead488a86180cd484bbc4e7e344a9032b/lib/fluent/plugin/in_tail.rb#L489-L498

salavessa commented 2 years ago

For what it's worth: after adding enable_stat_watcher false back in March (as mentioned by @ashie here), none of my 120+ servers has experienced this issue again.

wangchaoforever commented 2 years ago

@ashie Is this one fixed?

ashie commented 2 years ago

In v1.15.1 we added a log message like the following to help investigate this issue:

2022-08-01 17:57:19 +0900 [warn]: #0 Skip update_watcher because watcher has been already updated by other inotify event path="logs/hige.log" inode=59511737 inode_in_pos_file=0

If you see such a log message, please report it.

vparfonov commented 2 years ago

Hello, @ashie! We have something similar: "no more logs pushed", and we found duplicate entries in the pos file, like this:

/var/log/pods/pod-l4wrv/service/0.log   ffffffffffffffff    000000008b00ad36

/var/log/pods/pod-5bmg8/service/0.log   000000000333fd2e    000000005a00673b
/var/log/pods/pod-5bmg8/service/0.log   0000000004b2a353    000000005a00673e
/var/log/pods/pod-5bmg8/service/0.log   0000000003666a64    000000005a00673c
/var/log/pods/pod-5bmg8/service/0.log   0000000006a53eab    000000005a00673e
/var/log/pods/pod-5bmg8/service/0.log   0000000004a21b23    000000005a01e248
/var/log/pods/pod-5bmg8/service/0.log   0000000000000000    0000000000000000

Maybe this can help you.

ashie commented 2 years ago

@vparfonov Could you provide your in_tail config? Do you use follow_inodes true?

vparfonov commented 2 years ago

@ashie, looks like not; we use the default setting for follow_inodes

<source>
  @type tail
  @id container-input
  path "/var/log/pods/**/*.log"
  exclude_path ["/var/log/pods/openshift-logging_collector-*/*/*.log", "/var/log/pods/openshift-logging_elasticsearch-*/*/*.log", "/var/log/pods/openshift-logging_kibana-*/*/*.log"]
  pos_file "/var/lib/fluentd/pos/es-containers.log.pos"
  refresh_interval 5
  rotate_wait 5
  tag kubernetes.*
  read_from_head "true"
  skip_refresh_on_startup true
  @label @MEASURE
  <parse>
    @type multi_format
    <pattern>
      format json
      time_format '%Y-%m-%dT%H:%M:%S.%N%Z'
      keep_time_key true
    </pattern>
    <pattern>
      format regexp
      expression /^(?<time>[^\s]+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
      time_format '%Y-%m-%dT%H:%M:%S.%N%:z'
      keep_time_key true
    </pattern>
  </parse>
</source>
artheus commented 2 years ago

I looked into this a bit now. What seems to happen is that in_tail.rb checks File.readable?, which is false in the case of a broken symlink. During log rotation in Kubernetes environments, the symlink to the container log file will be broken for a while. If no log lines are output within the timeout after the detected log rotation, in_tail will assume the file is readable, but no file has actually been created yet. This could easily be fixed by polling the symlink until it's no longer broken (or until some kind of max_retry is hit). I also think it should be documented somewhere that one should install capng to get the capabilities lib working properly, which it was not for us, and we're using the official docker images for running fluentd.
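
A minimal standalone Ruby sketch of the File.readable? behavior described above (the file names are made up; this just simulates a kubelet-style symlink whose target disappears during rotation):

require "tmpdir"

Dir.mktmpdir do |dir|
  target = File.join(dir, "0.log")          # stands in for the real container log file
  link   = File.join(dir, "container.log")  # stands in for the /var/log/containers/*.log symlink

  File.write(target, "hello\n")
  File.symlink(target, link)
  p File.readable?(link)   # => true: the symlink resolves to an existing file

  File.unlink(target)      # simulate rotation removing the target before the new file exists
  p File.readable?(link)   # => false: the symlink dangles, so in_tail treats the path as unreadable
end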

EDIT: This should be reproducible in a local k8s environment by creating a tiny app that outputs logs every 10-30 seconds and running fluentd with in_tail on the log files there.

artheus commented 2 years ago

It is really weird, however, that the re-examination done on the next refresh does not always pick up that the symlinks are no longer broken and that the file should now be readable. I have found that in most of the cases where we get no more logs after a detected log rotation, the "detected rotation of ..." log entry was emitted twice for the same file on the same node; in the cases where FluentD is able to pick up tailing the file again later, the log line appears only once.

artheus commented 2 years ago

I created a test case in test/plugin/test_in_tail.rb for just playing around with this exact event:

sub_test_case "multiple log rotations" do
    data(
      "128" => ["128-0.log", 128, "msg"],
      "256" => ["256-0.log", 256, "msg"],
      "512" => ["512-0.log", 512, "msg"],
      "1024" => ["1024-0.log", 1024, "msg"],
      "2048" => ["2048-0.log", 2048, "msg"],
      "4096" => ["4096-0.log", 4096, "msg"],
      "8192" => ["8192-0.log", 8192, "msg"],
    )

    def test_reproduce_err_after_rotations(data)
      file, num_lines, msg = data

      File.open("#{@tmp_dir}/#{file}", 'wb') do |f|
        num_lines.times do
          f.puts "#{msg}\n"
        end
      end

      conf = config_element("", "", {
        "path" => "#{@tmp_dir}/*.log.link",
        "pos_file" => "#{@tmp_dir}/tail.pos",
        "refresh_interval" => "1s",
        "read_from_head" => "true",
        "format" => "none",
        "rotate_wait" => "1s",
        "follow_inodes" => "true",
        "tag" => "t1",
      })

      link_name="#{@tmp_dir}/#{num_lines}.log.link"

      File.symlink(file, link_name)

      dl_opts = {log_level: Fluent::Log::LEVEL_DEBUG}
      logdev = $stdout
      logger = ServerEngine::DaemonLogger.new(logdev, dl_opts)
      log_instance = Fluent::Log.new(logger)

      rotations = 5
      rot_now = 1

      d = create_driver(conf, false)
      d.instance.log = log_instance
      d.run(timeout: 30) do
        sleep 1

        assert_equal(num_lines, d.record_count)

        # rotate logs
        while rot_now <= rotations do
          sleep 2

          puts "unlink #{link_name}"
          File.unlink(link_name)
          puts "symlink #{num_lines}-#{rot_now}.log #{link_name}"
          File.symlink("#{num_lines}-#{rot_now}.log", link_name)

          sleep 1

          File.open("#{@tmp_dir}/#{num_lines}-#{rot_now}.log", 'wb') do |f|
            num_lines.times do
              f.puts "#{msg}\n"
            end
          end

          assert_equal(num_lines*rot_now, d.record_count)

          rot_now = rot_now + 1
        end
      end
    end
  end

In this case, it seems to be working properly. But maybe we can help each other in reproducing the error?

UPDATE: Changed to multiple files of different sizes, and changed log rotation to match how kubelet actually does it. UPDATE 2: Code updated! Now I've got something that somehow reproduces the error: in its current state the test works fine, but if you comment out the "follow_inodes" => "true", line, the error comes up and the log line Skip update_watcher because watcher has been... mentioned by @ashie above is logged all the time. So I think the follow_inodes option is not only important for preventing duplicate log messages, but also for making tailing of wildcards and symlinks work properly!

SriramDuvvuri commented 2 years ago

Hi all,

We are facing a similar kind of issue where the td-agent.log and .pos files are not updated properly. In some cases it fails after log rotation occurs: the file gets created but nothing is written to it, as shown below. Even if we restart td-agent, the problem is not fixed.

As suggested in some previous comments, I have tried modifying the config with the parameters below, but nothing changes:

enable_stat_watcher false
read_from_head true
follow_inodes true
refresh_interval 5
rotate_wait 5
skip_refresh_on_startup true
pos_file_compaction_interval 30m

As we have systems on RHEL 7/6, we are using fluentd version 1.11.5 (td-agent 3.8.1/3.8.0) in our environment. Here is the config we are using; please give us your input.

[root]:/var/log/td-agent> ls -ltr
-rw-r--r-- 1 td-agent td-agent       68 Aug 29 20:00 s2.pos
-rw-r--r-- 1 td-agent td-agent      384 Aug 29 20:00 b1.pos
-rw-r--r-- 1 td-agent td-agent    37509 Aug 29 20:00 a3.pos
-rw-r--r-- 1 td-agent td-agent    41629 Aug 29 20:00 xx.pos
-rw-r----- 1 td-agent td-agent 18562883 Aug 29 20:38 td-agent.log-20220830
-rw-r----- 1 td-agent td-agent        0 Aug 30 03:50 td-agent.log

Config:

<source>
@type tail
  path /home/server/xx*.log
  exclude_path ["/home/server/xxyy*.log"]
  pos_file /var/log/td-agent/xxlog.pos
  tag xx.log
  enable_stat_watcher false
  read_from_head true
  follow_inodes true
  refresh_interval 5
  rotate_wait 5
  skip_refresh_on_startup true
  pos_file_compaction_interval 30m
  <parse>
    @type multiline
    format_firstline /<20/
    format1 /<(?<logDate>.*?)>-<(?<logThread>.*?)>-<(?<level>.*?)>-<(?<logClass>.*?)>-(?<logMessage>.*)/
    time_key logDate
    time_format %Y/%m/%d-%H:%M:%S.%N
  </parse>
</source>

PS: the same config is working fine in a few other instances. I am not sure what the problem is here.

Appreciate your help in advance.

artheus commented 2 years ago

We just ran into the same problem again, without the log line Skip update_watcher because watcher has been... showing up. Now we only get these:

2022-09-13 06:36:37 +0000 [warn]: #0 [in_tail_container_logs] got incomplete line at shutdown from /var/log/containers/redacted-service-name-6b6754b75c-swn6z_backend_redacted-service-name-13f36bd8f73334e603ee6fe80642d971f6ea0b25ce966c6ec494c5b226e4d93c.log: ""

Is it wrong that we are running this in a k8s DaemonSet? Should we install FluentD/FluentBit directly on the VMs instead? Are there any special capabilities needed for the Pods created by the DaemonSet? I'm starting to feel a bit lost in this problem...

ashie commented 2 years ago

We just ran into the same problem again, without the log line Skip update_watcher because watcher has been... showing up. Now we only get these:

Are you using fluentd v1.15.1 or later?

masterashu-motorq commented 1 year ago

Hi, I came looking for a solution to https://github.com/fluent/fluentd/issues/3661. I tried disabling enable_stat_watcher and reducing pos_file_compaction_interval, but I'm still seeing fluentd fail to read logs during rotation.

I am seeing this in the other order (see the attached image), as well as duplicate entries in some pos files:

/var/log/containers/pod1-1-6b755c6694-flbxr_namespace1_pod1-1-f904346d3f6c060b78a594cef2e3161abd1eb10751b13b548ebe28e85c58fd51.log          ffffffffffffffff        000000000062ad18
/var/log/containers/pod1-1-6b755c6694-flbxr_namespace1_pod1-1-f904346d3f6c060b78a594cef2e3161abd1eb10751b13b548ebe28e85c58fd51.log          ffffffffffffffff        000000000062ac7f
/var/log/containers/pod1-1-6b755c6694-flbxr_namespace1_pod1-1-f904346d3f6c060b78a594cef2e3161abd1eb10751b13b548ebe28e85c58fd51.log          ffffffffffffffff        000000000062ad18
/var/log/containers/pod1-1-6b755c6694-flbxr_namespace1_pod1-1-f904346d3f6c060b78a594cef2e3161abd1eb10751b13b548ebe28e85c58fd51.log          ffffffffffffffff        000000000062ac7f
/var/log/containers/pod1-1-6b755c6694-flbxr_namespace1_pod1-1-f904346d3f6c060b78a594cef2e3161abd1eb10751b13b548ebe28e85c58fd51.log          ffffffffffffffff        000000000062ad18
/var/log/containers/pod1-1-6b755c6694-flbxr_namespace1_pod1-1-f904346d3f6c060b78a594cef2e3161abd1eb10751b13b548ebe28e85c58fd51.log          ffffffffffffffff        000000000062ac7f

Do we have any updates on this, @ashie?

artheus commented 1 year ago

Are there any workarounds for this, that anyone has been able to detect?

My current thoughts are:

* Run FluentD on the VM directly, instead of running it as a DaemonSet in K8S (w. mounted /var/log)
* Create our own tail plugin that might be able to do a better job with tailing and log rotations.
  * E.g. not only watch the symlinks, but also watch the files the symlinks point to, to get an event-driven async way of detecting log rotation and when new log files are created.
* Alternatively, create some middle-man app that will tail and detect log rotations properly, and send logs to FluentD via e.g. http

Please contribute whatever solutions you've set up to get this working as expected. In our company we must comply with government regulations on audit logging, so we cannot afford to lose logs due to this bug.

artheus commented 1 year ago

We just ran into the same problem again, without the log line Skip update_watcher because watcher has been... showing up. Now we only get these:

Are you using fluentd v1.15.1 or later?

Yes. We are currently using the Docker image fluent/fluentd:v1.15.1-debian-1.1

zmx commented 1 year ago

Fluentd v1.15.1 with enable_stat_watcher false, pos_file_compaction_interval 3m, and follow_inodes true still drops 50-70% of logs for us. The load is around 20k logs/s.

We see that the pos file has duplicate entries and all-zero inode numbers.

Our workaround: use the fluent-bit in_tail plugin, then out_forward to fluentd using the existing rules.

zmx commented 1 year ago

This bug happens in our environment on version 1.15.1 with all of the workarounds above. We drop almost 70% of logs at 20k logs/s.

We see exactly the same pos file corruption mentioned in this ticket.

Our workaround: switch to the fluent-bit tail plugin and forward to fluentd with out_forward to keep the existing rules.


masterashu-motorq commented 1 year ago

In v1.15.1 we added a log message like the following to help investigate this issue:

2022-08-01 17:57:19 +0900 [warn]: #0 Skip update_watcher because watcher has been already updated by other inotify event path="logs/hige.log" inode=59511737 inode_in_pos_file=0

If you see such a log message, please report it.

I'm getting a similar log:

2022-11-14 23:14:57 +0000 [info]: #0 detected rotation of /var/log/pods/processor-0-65dc565cc8-hq469_0f06669b-4521-4297-80db-ead8aaecde4e/core-ford-processor-0/1.log; waiting 5 seconds
2022-11-14 23:14:57 +0000 [warn]: #0 Skip update_watcher because watcher has been already updated by other inotify event path="/var/log/pods/processor-0-65dc565cc8-hq469_0f06669b-4521-4297-80db-ead8aaecde4e/core-ford-processor-0/1.log" inode=4397239 inode_in_pos_file=0
hawk-bernhard-altendorfer commented 1 year ago

fluentd v1.15-debian-opensearch-1 -> gem 'fluentd' version '1.15.3'

<source>
              @type tail
              @id in_tail_container_logs
              path /var/log/containers/*.log
              follow_inodes true
              refresh_interval 10
              rotate_wait 1
              enable_stat_watcher false
              pos_file /var/log/fluentd-containers.log.pos
              pos_file_compaction_interval 12h
              tag raw.kubernetes.*
              read_from_head true
              read_bytes_limit_per_second 1048576000
              max_line_size 2MB
              <parse>
                @type multi_format
                <pattern>
                  format json
                  time_key time
                  time_type string
                  time_format "%Y-%m-%dT%H:%M:%S.%NZ"
                  keep_time_key false
                </pattern>
                <pattern>
                  format regexp
                  expression /^(?<time>.+) (?<stream>stdout|stderr)( (.))? (?<log>.*)$/
                  time_format '%Y-%m-%dT%H:%M:%S.%NZ'
                  keep_time_key false
                </pattern>
              </parse>
              emit_unmatched_lines true
</source>

Last logs of fluentd before pushing stops:

2023-01-25 17:55:53 +0000 [info]: #0 [in_tail_container_logs] detected rotation of /var/log/containers/pod-68f6cb4bb4-vlfcq_default_gateway-rest-23c1ac09792e85465483b2d641e58fa6d5d8181924bc7b08f739779b2bf1d02f.log; waiting 1.0 seconds

2023-01-25 17:55:53 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/pod-68f6cb4bb4-vlfcq_default_gateway-rest-23c1ac09792e85465483b2d641e58fa6d5d8181924bc7b08f739779b2bf1d02f.log

When I restart fluentd, it starts to push all the logs that were missing, if they are still on the node.

We configured docker to produce log files of 500gb and we keep 1 rotated file.

kevinsookocheff-wf commented 1 year ago

We encountered the same error. I've been trying to reproduce without success so far. In our case we did a large upgrade from version 1.6.3 to 1.15.3.

Our failing deployment is td-agent with file tailing running as a Kubernetes daemonset.

Skip update_watcher because watcher has been already updated by other inotify event path="/var/log/containers/k8s-...-controller.log" inode=2883733 inode_in_pos_file=2883972
jcantrill commented 1 year ago

@ashie we have a customer who claims they applied the following patch and the issue goes away. I do not have a reliable way to reproduce the problem. We are using v1.14.6

--- in_tail.rb.org      2023-02-13 17:04:42.353997075 +0900
+++ in_tail.rb  2023-02-13 17:04:39.030016491 +0900
@@ -354,13 +354,15 @@

     def existence_path
       hash = {}
-      @tails.each_key {|target_info|
-        if @follow_inodes
-          hash[target_info.ino] = target_info
-        else
-          hash[target_info.path] = target_info
-        end
-      }
+      if @follow_inodes
+        @tails.each {|ino, tw|
+          hash[tw.ino] = TargetInfo.new(tw.path, tw.ino)
+        }
+      else
+        @tails.each {|path, tw|
+          hash[tw.path] = TargetInfo.new(tw.path, tw.ino)
+        }
+      end
       hash
     end

@@ -441,8 +443,11 @@

       begin
         target_info = TargetInfo.new(target_info.path, Fluent::FileWrapper.stat(target_info.path).ino)
-        @tails.delete(target_info)
-        @tails[target_info] = tw
+        if @follow_inodes
+          @tails[target_info.ino] = tw
+        else
+          @tails[target_info.path] = tw
+        end
         tw.on_notify
       rescue Errno::ENOENT, Errno::EACCES => e
         $log.warn "stat() for #{target_info.path} failed with #{e.class.name}. Drop tail watcher for now."
@@ -462,9 +467,17 @@
     def stop_watchers(targets_info, immediate: false, unwatched: false, remove_watcher: true)
       targets_info.each_value { |target_info|
         if remove_watcher
-          tw = @tails.delete(target_info)
+          if @follow_inodes
+            tw = @tails.delete(target_info.ino)
+          else
+            tw = @tails.delete(target_info.path)
+          end
         else
-          tw = @tails[target_info]
+          if @follow_inodes
+            tw = @tails[target_info.ino]
+          else
+            tw = @tails[target_info.path]
+          end
         end
         if tw
           tw.unwatched = unwatched
@@ -478,10 +491,19 @@
     end

     def close_watcher_handles
-      @tails.keys.each do |target_info|
-        tw = @tails.delete(target_info)
-        if tw
-          tw.close
+      if @follow_inodes
+        @tails.keys.each do |ino|
+          tw = @tails.delete(ino)
+          if tw
+            tw.close
+          end
+        end
+      else
+        @tails.keys.each do |path|
+          tw = @tails.delete(path)
+          if tw
+            tw.close
+          end
         end
       end
     end
@@ -500,26 +522,25 @@
       end

       rotated_target_info = TargetInfo.new(target_info.path, pe.read_inode)
-      rotated_tw = @tails[rotated_target_info]
-      new_target_info = target_info.dup

       if @follow_inodes
+        rotated_tw = @tails[rotated_target_info.ino]
+        new_target_info = target_info.dup
         new_position_entry = @pf[target_info]

         if new_position_entry.read_inode == 0
           # When follow_inodes is true, it's not cleaned up by refresh_watcher.
           # So it should be unwatched here explicitly.
           rotated_tw.unwatched = true
-          # Make sure to delete old key, it has a different ino while the hash key is same.
-          @tails.delete(rotated_target_info)
-          @tails[new_target_info] = setup_watcher(new_target_info, new_position_entry)
-          @tails[new_target_info].on_notify
+          @tails[new_target_info.ino] = setup_watcher(new_target_info, new_position_entry)
+          @tails[new_target_info.ino].on_notify
         end
       else
-        # Make sure to delete old key, it has a different ino while the hash key is same.
-        @tails.delete(rotated_target_info)
-        @tails[new_target_info] = setup_watcher(new_target_info, pe)
-        @tails[new_target_info].on_notify
+        rotated_tw = @tails[rotated_target_info.path]
+        new_target_info = target_info.dup
+
+        @tails[new_target_info.path] = setup_watcher(new_target_info, pe)
+        @tails[new_target_info.path].on_notify
       end
       detach_watcher_after_rotate_wait(rotated_tw, pe.read_inode) if rotated_tw
     end

--- position_file.rb.org        2023-02-13 15:02:14.671608849 +0900
+++ position_file.rb    2023-02-13 15:06:09.098223852 +0900
@@ -250,20 +250,6 @@
       end
     end

-    TargetInfo = Struct.new(:path, :ino) do
-      def ==(other)
-        return false unless other.is_a?(TargetInfo)
-        self.path == other.path
-      end
-
-      def hash
-        self.path.hash
-      end
-
-      def eql?(other)
-        return false unless other.is_a?(TargetInfo)
-        self.path == other.path
-      end
-    end
+    TargetInfo = Struct.new(:path, :ino)
   end
 end

our typical config is:

<source>
  @type tail
  @id container-input
  path "/var/log/pods/*/*/*.log"
  exclude_path ["/var/log/pods/openshift-logging_collector-*/*/*.log", "/var/log/pods/openshift-logging_elasticsearch-*/*/*.log", "/var/log/pods/openshift-logging_kibana-*/*/*.log", "/var/log/pods/openshift-logging_*/loki*/*.log", "/var/log/pods/*/*/*.gz", "/var/log/pods/*/*/*.tmp"]
  pos_file "/var/lib/fluentd/pos/es-containers.log.pos"
  follow_inodes true
  refresh_interval 5
  rotate_wait 5
  tag kubernetes.*
  read_from_head "true"
  skip_refresh_on_startup true
  @label @CONCAT
  <parse>
    @type regexp
    expression /^(?<@timestamp>[^\s]+) (?<stream>stdout|stderr) (?<logtag>[F|P]) (?<message>.*)$/
    time_key '@timestamp'
    keep_time_key true
  </parse>
</source>

One significant difference I do see between the original report of the issue and how we collect logs is the symlink: collecting from /var/log/pods, which AFAIK is the preferred kubernetes location, does not use symlinks the way /var/log/containers does.

jcantrill commented 1 year ago

@ashie I believe the issue is caused by the filesystem reusing inodes while fluentd uses the inode as the "key" for maintaining watches. Our customer provided the following, where the position entry is converted to decimal:

-- Pos file
/var/log/pods/openshift-image-registry_node-ca-k6thz_78bc195f-591c-4af5-afb1-a31c601850e9/node-ca/0.log 18446744073709551616    195064732
/var/log/pods/openshift-image-registry_node-ca-k6thz_78bc195f-591c-4af5-afb1-a31c601850e9/node-ca/0.log 18446744073709551616    195064733
/var/log/pods/openshift-image-registry_node-ca-k6thz_78bc195f-591c-4af5-afb1-a31c601850e9/node-ca/0.log 58444023        195064732
/var/log/pods/openshift-image-registry_node-ca-k6thz_78bc195f-591c-4af5-afb1-a31c601850e9/node-ca/0.log 35228450        195064763

Note that there are multiple entries with the same inode, where one is identified as "unwatched" and the other is still open. The logic then closes the watch on the inode even though it is still in use.

ashie commented 1 year ago

@jcantrill Thanks for your information! We'll check it.

parandiary commented 1 year ago

We are using td-agent 4.4.2 / fluentd 1.15.3.

td-agent --version
td-agent 4.4.2 fluentd 1.15.3 (e89092ce1132a933c12bb23fe8c9323c07ca81f5)

I'm tailing logs by mounting EFS on AWS EC2, and I can't track the tail after the log file is rotated. Here's the log when tracking fails:

--- success watch
2023-03-17 00:00:02 +0900 [info]: #0 [input_tail] following tail of /EFS/STG/app_log/app1/ims.log
--- fail watch
2023-03-18 00:00:02 +0900 [info]: #0 [input_tail] following tail of /EFS/STG/app_log/app1/ims.log
2023-03-18 00:00:02 +0900 [info]: #0 [input_tail] detected rotation of /EFS/STG/app_log/app1/ims.log; waiting 5 seconds
--- success watch
2023-03-19 00:00:02 +0900 [info]: #0 [input_tail] following tail of /EFS/STG/app_log/app1/ims.log
--- fail watch
2023-03-20 00:00:02 +0900 [info]: #0 [input_tail] following tail of /EFS/STG/app_log/app1/ims.log
2023-03-20 00:00:02 +0900 [info]: #0 [input_tail] detected rotation of /EFS/STG/app_log/app1/ims.log; waiting 5 seconds
-- success watch
2023-03-21 00:00:02 +0900 [info]: #0 [input_tail] following tail of /EFS/STG/app_log/app1/ims.log

I tried with the settings below, but it failed and I got an "unwatched" log entry 7 minutes after the rotation was detected.

  enable_stat_watcher false
  refresh_interval 3s
  rotate_wait 1s
  read_from_head true
  follow_inodes true
  pos_file_compaction_interval 10m

2023-03-22 00:00:01 +0900 [debug]: #0 [input_tail] tailing paths: target = /EFS/STG/app_log/app1/ims.log | existing = /EFS/STG/app_log/app1/ims.log
2023-03-22 00:00:01 +0900 [info]: #0 [input_tail] following tail of /EFS/STG/app_log/app1/ims.log
2023-03-22 00:00:01 +0900 [info]: #0 [input_tail] detected rotation of /EFS/STG/app_log/app1/ims.log; waiting 1 seconds
2023-03-22 00:00:16 +0900 [debug]: #0 [input_tail] tailing paths: target = /EFS/STG/app_log/app1/ims.log | existing = /EFS/STG/app_log/app1/ims.log
2023-03-22 00:07:16 +0900 [debug]: #0 [input_tail] Remove unwatched line from pos_file: /EFS/STG/app_log/app1/ims.log   ffffffffffffffff        9a58588af4b55d9f

With the settings below, the log is tracked normally, but I have concerns about performance. Please refer to the log and advise on performance.

open_on_every_update true

  enable_stat_watcher false
  refresh_interval 3s
  rotate_wait 1s
  read_from_head true
  follow_inodes true
  pos_file_compaction_interval 3m
  open_on_every_update true

log

2023-03-21 23:58:07 +0900 [info]: #0 [input_tail] Clean up the pos file
2023-03-22 00:00:41 +0900 [info]: #0 [input_tail] detected rotation of /EFS/PRD/app_log/app1/ims.log
2023-03-22 00:00:41 +0900 [info]: #0 [input_tail] following tail of /EFS/PRD/app_log/app1/ims.log
2023-03-22 00:00:44 +0900 [info]: #0 [input_tail] following tail of /EFS/PRD/app_log/app1/ims.log
2023-03-22 00:01:07 +0900 [info]: #0 [input_tail] Clean up the pos file
jcantrill commented 1 year ago

@ashie I have done some rudimentary investigation using the following:

I set up 4 separate terminals:

The basic testing I have done keeps the writer size at 512 and varies the number of writers from 1-250. My summary of what I observed is in line with what I would expect: as you increase the number of files being tailed, fluentd becomes CPU bound and is not able to predictably keep up with the load. In a containerized environment like kubernetes, it will miss full log rotations and segments of logs. Changing the position file compaction interval from the default of 72h to 45s helps, but it in no way accounts for "missing watches".

Our customer claims fluentd is closing these watches because of multiple entries in the position file, but I do not believe that to be the case. It seems to me that we are simply running up against the limitations of ruby and fluentd and the fact that it is not able to efficiently process tailing a large number of files that are constantly being rotated. I do see evidence of times where there are "missing watches"; the number is less than the number of files producing logs, but these are few and it eventually recovers.

I'm not convinced there is an issue other than hitting the performance limitations of fluentd.

jcantrill commented 1 year ago

I'm not convinced there is an issue other than hitting the performance limitations of fluentd.

@ashie I'm going to walk back my statement somewhat after applying the proposed patch and doing similar rudimentary testing. The patch demonstrates an improvement with "missed" file rotations. I cannot speak, however, to whether:

kattz-kawa commented 1 year ago

we have a customer who claims they applied the following patch and the issue goes away. I do not have a reliable way to reproduce the problem. We are using v1.14.6

This is my proposed patch for the customer.

@jcantrill Thank you for validating this patch.

I'm going to walk back my statement somewhat after applying the proposed patch and doing similar rudimentary testing. The patch demonstrates an improvement with "missed" file rotations. I cannot speak, however, to whether:

@ashie

This issue is observed with Jeff's primitive test code. Additionally, this issue is also observed in popular environments that use the 'logrotate' program (such as k8s).

I've attached a simple reproducer program using 'logrotate'.

The reproduction steps are below:

Test environment:
  RHEL8.7 on KVM(2 vCPUs, 4GB memory)

Kernel version:
  Linux 4.18.0-425.3.1.el8.x86_64

I tested it on RHEL 8.7 (kernel-4.18.0-425.3.1.el8.x86_64) on KVM.
However, this procedure should run on other Linux distributions as well, since it runs as a container.
If your Linux has no podman command, please replace every "podman" string with "docker" in test_run.sh (included in reproducer.tar.gz).

Steps to reproduce:

1. Untar 'reproducer.tar.gz' on your machine

# tar zxf reproducer.tar.gz

2. Run 'test_run.sh' script

# nohup bash test_run.sh &

3. Wait until the "The issue has happened." message is printed in 'lrt.log'.
  'lrt.log' will be generated in the current directory.
  Run the 'tail -f' command to monitor 'lrt.log' for the "The issue has happened" message.

    Ex) The issue has happened. Mon Mar 13 07:23:16 JST 2023

4. Confirm that fluentd has already closed '/var/log/pods/test/0.log' although the file still exists.

     1) Check INODE of '/var/log/pods/test/0.log'

        # podman exec -it reproducer /bin/bash
        # ls -li /var/log/pods/test/0.log

     2) Check Fluentd never track the '/var/log/pods/test/0.log'

        # lsof -p  | grep 
          => It's closed although Fluentd has to collect the '/var/log/pods/test/0.log'

     3) Check the position file if the '/var/log/pods/test/0.log' should be tracked

        # awk "{printf(\"%s\t%s\t%s\n\", \$1, strtonum(\"0x\"\$2), strtonum(\"0x\"\$3))}"  /var/lib/fluentd/pos/es-containers.log.pos | grep 
          => If the inode exists, it means the log file should be collected by Fluentd

        ex)
           /var/log/pods/test/0.log        20165   486593998

     4) Check if the '/var/log/pods/test/0.log' is not tracked any more

        # kill -s SIGCONT {logger.sh pid}
        => Resumes sending test logs to '/var/log/pods/test/0.log'

        However, Fluentd won't reopen/reread '/var/log/pods/test/0.log'.
        As far as I can see, this issue happens if 'follow_inodes' is set to 'true'.

I will try to create a pull request for this issue later. Could you review it? I hope the attached program (reproducer.tar.gz) is helpful for validating my patch.

ashie commented 1 year ago

Thanks for your comment, and sorry for my late response. Although I think this issue should have high priority, I'm not yet able to check the details because I'm tied up with dozens of trivial tasks...

I will try to create a pull request on this issue later. Could you review it?

Of course, pull requests are welcome! In addition, discussing the code here is hard, so using a pull request is recommended.

we have a customer who claims they applied the following patch and the issue goes away. I do not have a reliable way to reproduce the problem. We are using v1.14.6

This is my proposed patch for the customer.

Interesting, but I have some concerns about it.

ashie commented 1 year ago
* Using inodes as keys might have side effects; we may have experienced such a case in the past.
  * I don't remember the details yet. Probably one of the side effects is log duplication.

On the other hand, I certainly also think that it is strange to manage watchers by path in the follow_inodes true case. In addition, the case we experienced before probably involved other conditions as well. I'm digging into it.

garyzjq commented 1 year ago
* Using inodes as keys might have side effects; we may have experienced such a case in the past.
  * I don't remember the details yet. Probably one of the side effects is log duplication.

On the other hand, I certainly also think that it is strange to manage watchers by path in the follow_inodes true case. In addition, the case we experienced before probably involved other conditions as well. I'm digging into it.

Strongly +1. Besides, regarding code structure, the in_tail code has lots of "if follow_inodes ... else ..." logic, which makes the code hard to read. My feeling is that the follow_inodes parameter is a much bigger behavior switch than the other parameters (enable_xxx_watcher, timeout settings, etc.): it changes tracking in the pos file, in memory, and even the watcher lifecycle. It sounds like the code needs a better abstraction to make it cleaner and more readable.

ashie commented 1 year ago

#4190 describes a cause of in_tail's stall issue very clearly.

It's indeed one of the major causes of this issue. I'm not sure whether other causes remain or not, though.

kattz-kawa commented 1 year ago

#4190 describes a cause of in_tail's stall issue very clearly.

It's indeed one of the major causes of this issue. I'm not sure whether other causes remain or not, though.

Thank you for the information! It looks like we no longer face the issue with my reproduction script after applying the patch from #4190.

  • Using inodes as keys might have side effects; we may have experienced such a case in the past.
    • I don't remember the details yet. Probably one of the side effects is log duplication.

Our approach is based on the following:

But we can also agree that some unexpected side effects might be caused by it, since our patch is a bigger change compared with #4190.

Our only concern with #4190 is whether the patch might cause a new file handle leak. We don't really know whether there is any case other than the #4190 case in which tw.ino != ino can be true. (Although I didn't hit such a leak during testing in my local environment.)

k-keiichi-rh commented 1 year ago

On the other hand, I certainly also think that it is strange to manage watchers by path in the follow_inodes true case. In addition, the case we experienced before probably involved other conditions as well. I'm digging into it.

Strongly +1. Besides, regarding code structure, the in_tail code has lots of "if follow_inodes ... else ..." logic, which makes the code hard to read.

I have no objection to fixing the issue as described in https://github.com/fluent/fluentd/issues/4190.

But mixing inodes and paths to manage the hashes is confusing when follow_inodes is true, as indicated by the long discussion in this ticket going back to early 2022. So I think this situation should be resolved. It may need a refactoring of the "if follow_inodes ... else ..." logic, but that is another topic...

What do you think about this?

garyzjq commented 1 year ago

Reply to @kattz-kawa and @k-keiichi-rh

Ideally, when follow_inodes is on, everything should be tracked by inode, instead of partially by inode and partially by path. So I vote for your PR (I actually wanted to do a similar PR :) )

In recent days I have been heavily testing your PR. It runs well with a high log throughput and a high log file rotation rate (running in k8s; log rotation pattern: the old log file is renamed and a new log file is created with the same name). My testing environment is 20K rows/sec, 2KB/row, ~10 sec log rotation interval. No log duplication, no missing logs.

ashie commented 1 year ago

The reason why I didn't apply the patch in https://github.com/fluent/fluentd/issues/3614#issuecomment-1439046528 right away is that:

But when I read #4190, the question was resolved. Yes, #4185 certainly resolves the issue because rotated inodes aren't included in @tails.

Now I'm basically positive about merging #4185, though I'll recheck it from several perspectives. The main remaining concern is that I want to add tests for it (but that might be too hard).

On the other hand, I'm also considering merging #4191, mainly for the follow_inodes false case. It seems that both of them can coexist.

Our only concern with https://github.com/fluent/fluentd/issues/4190 is whether the patch might cause a new file handle leak.

In my understanding, it doesn't leak handles; they will be scheduled to be freed by rotate_watcher in this case.

Anyway, thanks to you guys, we've found the road to a solution. Thank you so much!

daipom commented 1 year ago

Thanks to the fix in #4185 and the description in #4190, I understand this problem as follows (follow_inodes true):

Before rotate

After rotate

https://github.com/fluent/fluentd/blob/dcc05e59e038ffef16ee2b510d3d9e34b23bbdfe/lib/fluent/plugin/in_tail.rb#L370-L386

The added_hash will be {inodeB => TargetInfo("container.log", inodeB)}. So, construct_watcher will be called to construct TargetInfo("container.log", inodeB).

https://github.com/fluent/fluentd/blob/dcc05e59e038ffef16ee2b510d3d9e34b23bbdfe/lib/fluent/plugin/in_tail.rb#L422-L448

This will overwrite @tails["container.log"] with the entry for container.log (inodeB). However, @tails["container.log"] was originally used for container.log (inodeA). The old TailWatcher (for container.log (inodeA)) can call update_watcher to detach @tails["container.log"]. This results in the newly rotated file being wrongly detached.

https://github.com/fluent/fluentd/blob/dcc05e59e038ffef16ee2b510d3d9e34b23bbdfe/lib/fluent/plugin/in_tail.rb#L502-L519
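
The key collision itself can be sketched in a few lines of standalone Ruby (mirroring the path-based equality of the old TargetInfo struct that the patch proposed earlier removes; the inode numbers and watcher values are placeholders):

# Pre-fix TargetInfo: equality and hashing consider only the path, so entries for the
# same path but different inodes collide as hash keys.
TargetInfo = Struct.new(:path, :ino) do
  def ==(other)
    other.is_a?(TargetInfo) && path == other.path
  end
  alias_method :eql?, :==
  def hash
    path.hash
  end
end

tails = {}
old_info = TargetInfo.new("container.log", 111)  # inodeA, before rotation
new_info = TargetInfo.new("container.log", 222)  # inodeB, after rotation

tails[old_info] = :old_watcher
tails[new_info] = :new_watcher   # same path => same hash key: overwrites the value, keeps old_info as the key

p tails.size                                        # => 1
tails.delete(TargetInfo.new("container.log", 111))  # "clean up the rotated inode"...
p tails                                             # => {} -- ...but this just removed the new file's watcher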

daipom commented 1 year ago

The problem with this scenario is that the keys of @tails are paths even when follow_inodes is enabled. So I too think the fix in #4185 will solve the problem in this scenario.

I would like to see more details.

daipom commented 1 year ago

Regarding what we are discussing in #4185: I think the problem with the above scenario (https://github.com/fluent/fluentd/issues/3614#issuecomment-1578977365) is that refresh_watcher tries to update the paths of the existing TailWatchers. This causes conflicts.

The existing TailWatchers will be updated by StatWatcher or TimerTrigger, so refresh_watcher shouldn't update them. (Is this correct?)

However, when follow_inodes is enabled, refresh_watcher determines the differences only by inode. It needs to also look at the path, because the existing TailWatchers are updated based on their paths even when follow_inodes is enabled.

For example, I think the following modifications may provide a partial solution.

diff --git a/lib/fluent/plugin/in_tail.rb b/lib/fluent/plugin/in_tail.rb
index fb29dd24..20b6cd4f 100644
--- a/lib/fluent/plugin/in_tail.rb
+++ b/lib/fluent/plugin/in_tail.rb
@@ -380,6 +380,11 @@ module Fluent::Plugin
       unwatched_hash = existence_paths_hash.reject {|key, value| target_paths_hash.key?(key)}
       added_hash = target_paths_hash.reject {|key, value| existence_paths_hash.key?(key)}

+      if @follow_inodes
+        unwatched_hash.reject! {|ino, target_info| existence_paths_hash.value?(target_info.path)}
+        added_hash.reject! {|ino, target_info| existence_paths_hash.value?(target_info.path)}
+      end
+
       stop_watchers(unwatched_hash, immediate: false, unwatched: true) unless unwatched_hash.empty?
       start_watchers(added_hash) unless added_hash.empty?
       @startup = false if @startup
kattz-kawa commented 1 year ago

@daipom

Thank you for suggesting another solution!

Just a quick report on my reproducer with your suggested patch applied: the result is not good. The issue still happened. I'm digging into why the issue happened with your patch.

masaki-hatada commented 1 year ago

Hi @daipom san,

Thank you for the update. I get your idea. The idea of #4185 and #4191 is to prevent wrongly closing the watcher from the update_watcher() side, while your idea is to prevent closing the watcher from the refresh_watchers() side.

+        added_hash.reject! {|ino, target_info| existence_paths_hash.value?(target_info.path)}

The inode is different after log rotation, which means that the new one wouldn't be removed from the added_hash list. So start_watchers() would be called for the new one, and start_watchers() will replace @tails[path] with the new one in place of the old one. That means update_watcher() would wrongly close the new one.

I believe this is the reason why @kattz-kawa faced the issue even after applying your patch.

daipom commented 1 year ago

@kattz-kawa @masaki-hatada Thanks so much for confirming the behavior! I'm sorry, I made a silly mistake. The patch I was thinking of is as follows. However, the tests don't pass with this patch. I will reconsider this direction and this patch.

diff --git a/lib/fluent/plugin/in_tail.rb b/lib/fluent/plugin/in_tail.rb
index fb29dd24..d13407f0 100644
--- a/lib/fluent/plugin/in_tail.rb
+++ b/lib/fluent/plugin/in_tail.rb
@@ -380,6 +380,12 @@ module Fluent::Plugin
       unwatched_hash = existence_paths_hash.reject {|key, value| target_paths_hash.key?(key)}
       added_hash = target_paths_hash.reject {|key, value| existence_paths_hash.key?(key)}

+      if @follow_inodes
+        paths = existence_paths_hash.collect {|ino, target_info| target_info.path}
+        unwatched_hash.reject! {|ino, target_info| paths.any?(target_info.path)}
+        added_hash.reject! {|ino, target_info| paths.any?(target_info.path)}
+      end
+
       stop_watchers(unwatched_hash, immediate: false, unwatched: true) unless unwatched_hash.empty?
       start_watchers(added_hash) unless added_hash.empty?
       @startup = false if @startup
kattz-kawa commented 1 year ago

Hi @daipom

Thank you for discussing this issue in depth. I have been testing the patch in #3614 (comment) since last week.

It looks good for the unexpected file close issue, but the 'tw' leak issue you mentioned probably occurred.

pid 413 = fluentd process

# lsof -p 413 -nP | grep "/var/log/pods/" | grep deleted
ruby-mri 413 root   10r      REG   0,64    24198 167792839 /var/log/pods/test/0.log.8 (deleted)
ruby-mri 413 root   22r      REG   0,64    20660 360740488 /var/log/pods/dummy102/0.log.8 (deleted)

You may have already noticed, but I'm leaving this information in the comments just in case.