fluent / fluent-package-builder

td-agent (Fluentd) Building and Packaging System
Apache License 2.0

Windows: td-agent 4.5.2 Too many open files #621

Closed tkavak closed 4 months ago

tkavak commented 4 months ago

Describe the bug

Hello there! We use td-agent v4.5.2 (installed on Windows Server 2019/2022) to forward logs to a Fluentd (v1.16) container running in Docker on a Linux VM. Here is the td-agent configuration:

<source>
  @type tail
  path "#{%q[
    C:/TestApp/application*.log
    ]}"
  pos_file C:\opt\td-agent\TestApp\pos\TestApp_logs.pos
  tag *
  read_from_head true
  follow_inodes true
  <parse>
    @type none
  </parse>
</source>

<label @FLUENT_LOG>
  <match fluent.*>
    @type stdout
  </match>
</label>

<filter *.TestApp.**.*.log>
  @type record_transformer
  <record>
    message ${tag}, ${record["message"]}
  </record>
</filter>

<match **>
   @type forward
   <server>
     host {{ REMOTE_SERVER_HOST }}
     port 24284
   </server>
   tag TestApp

  <buffer tag,time>
    @type file
    path C:/opt/td-agent/TestApp/logs
    timekey 60m            # Flush the accumulated chunks every n minutes
    timekey_wait 10m       # Wait for n minutes before flushing
    timekey_use_utc true
    chunk_limit_size 256m
  </buffer>
</match>

For some time, the target Fluentd container was unreachable, so td-agent could not flush its buffered chunks. We noticed the following errors in td-agent.log:

2024-02-15 10:18:44 +0300 [warn]: #0 emit transaction failed: error_class=Fluent::Plugin::Buffer::BufferOverflowError error="can't create buffer file for C:/opt/td-agent/TestApp/logs/buffer.*.log. Stop creating buffer files: error = Too many open files @ rb_sysopen - C:/opt/td-agent/TestApp/logs/buffer.b611667008980029b3596afadbefbdada.log" location="C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.3/lib/fluent/plugin/buffer/file_chunk.rb:289:in `rescue in create_new_chunk'" tag="d:.dev.Host.bin.Debug.Logs.app20240213.log"
2024-02-15 10:18:44 +0300 [error]: #0 unexpected error error_class=Errno::EMFILE error="Too many open files @ rb_sysopen - d:/dev/Host/bin/Debug/Logs/app20240213.log"

The application host whose logs td-agent is tailing is still running and keeps writing logs.

To Reproduce

  1. Stop the target Fluentd container that should receive logs from td-agent.
  2. Keep td-agent (installed on Windows Server) and the test app running.
  3. Around 2k buffer files were accumulated.

Expected behavior

  1. Existing buffered chunks should be delivered to target fluentd container without loss once the connection can be established
  2. Newly generated logs by test app should be buffered by td-agent

Your Environment

- Fluentd version: v1.16
- TD Agent version: v4.5.2
- fluent-plugin-s3 version: v1.7.2
- Operating system: Windows 10/11; Windows Server 2019/2022

Additional context

Please confirm whether td-agent is behaving as expected here, or suggest a fix or workaround so that the logs can be delivered successfully.
Cleaning up td-agent's pos and log folders is not an option for us, for two reasons:

1. Some logs could be duplicated at the target end.
2. The application host may clean up some of the original log files, so those log entries would be lost.

daipom commented 4 months ago

From https://github.com/fluent/fluentd/issues/4404

Around 2k buffer files were accumulated

It seems that the embedded Ruby can only open up to 2046 files on Windows.

On Linux, the limit can be raised with ulimit or LimitNOFILE (the package sets it to 65536 by default). On Windows, however, it does not seem possible to change it.

So, please reduce the number of buffer files by adjusting the buffer settings.
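
As a rough illustration of what such an adjustment could look like (a sketch under assumptions, not a confirmed fix): with `<buffer tag,time>`, a separate chunk is created for each combination of tag and time window, so a longer `timekey` should leave fewer chunk files on disk while the destination is down. The `6h` value below is an arbitrary example chosen for illustration. Everything else is copied from the reported configuration.

```
<buffer tag,time>
  @type file
  path C:/opt/td-agent/TestApp/logs
  # Assumption: a 6-hour window is acceptable for this workload.
  # Fewer time windows per tag means fewer chunk files during an outage.
  timekey 6h
  timekey_wait 10m
  timekey_use_utc true
  chunk_limit_size 256m
</buffer>
```

The trade-off is latency: a time-keyed chunk is normally flushed only after its window closes plus `timekey_wait`, so logs could arrive several hours late even when the destination is healthy. Also note that because the source uses `tag *`, every tailed file gets its own tag (as in `d:.dev.Host.bin.Debug.Logs.app20240213.log` in the error log), so the number of tailed files multiplies the number of chunks.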