sophieyfang opened this issue 3 years ago
This seems to occur with an Out of Memory condition:
ERROR_OUTOFMEMORY
14 (0xE)
Not enough storage is available to complete this operation.
ref: https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499-
rate_limit might be useful to prevent a flood of events being consumed from the Security channel.

The README.md describes rate_limit as follows:

> NOTE: If you encountered CPU spike due to massively huge EventLog channel, rate_limit parameter may help you. Currently, this paramter can handle the multiples of 10 or -1(Winevt::EventLog::Subscribe::RATE_INFINITE).

That note is about CPU spikes rather than memory spikes, but I will ask the client to give it a try.
The customer is generating > 2000 entries per second on the Security channel. With rate_limit 200, it still crashed with the exception below. I can confirm that log.warn e.inspect was added inside C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/buffer.rb, but the logged exception doesn't give more useful info...

Also, since the customer uses rate_limit 200, the > 2000 entries per second will probably cause a lot of backlog in the file buffers. But how that could cause the exception below is something I don't understand. I assume the exception is due to running out of memory.
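To make the backlog concern concrete, here is a rough back-of-the-envelope calculation, assuming the rates stay constant (the 2000/s and 200/s figures come from this report; the arithmetic itself is only illustrative):

```ruby
# Rough backlog estimate, assuming constant rates (illustrative only).
ingest_rate  = 2000   # entries/sec generated on the Security channel
consume_rate = 200    # entries/sec allowed through by rate_limit 200
backlog_per_hour = (ingest_rate - consume_rate) * 3600
puts backlog_per_hour # roughly 6.5 million unconsumed entries per hour
```

At that rate the file buffers would grow by millions of entries per hour unless events are dropped upstream.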
2021-02-08 01:41:47 -0800 [warn]: #0 emit transaction failed: error_class=IOError error="closed stream" location="C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/buffer/file_chunk.rb:82:in `pos'" tag="winevt.raw"
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/buffer/file_chunk.rb:82:in `pos'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/buffer/file_chunk.rb:82:in `rollback'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/buffer.rb:356:in `rescue in block in write'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/buffer.rb:348:in `block in write'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/buffer.rb:347:in `each'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/buffer.rb:347:in `write'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/compat/output.rb:374:in `block in handle_stream_simple'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/output.rb:888:in `write_guard'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/compat/output.rb:373:in `handle_stream_simple'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/output.rb:878:in `execute_chunking'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin/output.rb:808:in `emit_buffered'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/event_router.rb:160:in `emit_events'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/event_router.rb:97:in `emit_stream'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluent-plugin-windows-eventlog-0.8.0/lib/fluent/plugin/in_windows_eventlog2.rb:341:in `on_notify_hash'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluent-plugin-windows-eventlog-0.8.0/lib/fluent/plugin/in_windows_eventlog2.rb:249:in `block (2 levels) in subscribe_channels'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/cool.io-1.7.0/lib/cool.io/loop.rb:88:in `run_once'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/cool.io-1.7.0/lib/cool.io/loop.rb:88:in `run'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2021-02-08 01:41:47 -0800 [warn]: #0 C:/Program Files (x86)/Stackdriver/LoggingAgent/Main/lib/ruby/gems/2.6.0/gems/fluentd-1.11.2-x86-mingw32/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2021-02-08 01:41:47 -0800 [error]: #0 Unexpected error raised. Stopping the timer. title=:in_windows_eventlog_security error_class=IOError error="closed stream"
2021-02-08 01:41:47 -0800 [error]: #0 suppressed same stacktrace
2021-02-08 01:41:47 -0800 [error]: #0 Timer detached. title=:in_windows_eventlog_security
Here is the conf:
2021-02-05 14:52:42 -0800 [info]: using configuration file: <ROOT>
<filter winevt.raw>
  @type record_transformer
  enable_ruby
  <record>
    message ${record['Description']}
    severity info
    timestamp ${t = Time.parse(record['TimeCreated']); {'seconds' => t.tv_sec, 'nanos' => t.tv_nsec}}
  </record>
</filter>
<filter **>
  @type add_insert_ids
</filter>
<match **>
  @type google_cloud
  buffer_type "file"
  buffer_path /var/log/google-fluentd/buffers
  buffer_chunk_limit 512KB
  flush_interval 5s
  disable_retry_limit false
  retry_limit 3
  retry_wait 10
  max_retry_wait 300
  num_threads 8
  use_grpc false
  partial_success true
  <buffer>
    flush_mode interval
    retry_type exponential_backoff
    @type file
    path /var/log/google-fluentd/buffers
    flush_thread_count 8
    flush_interval 5s
    retry_forever false
    retry_max_times 3
    retry_max_interval 300
    chunk_limit_size 512KB
  </buffer>
</match>
<source>
  @type windows_eventlog2
  channels application,system,security
  read_interval 2
  rate_limit 200
  tag "winevt.raw"
  <storage>
    @type "local"
    persistent true
    path ".\\Main\\pos\\winevtlog.pos"
  </storage>
</source>
</ROOT>
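As a standalone illustration of the timestamp rule in the record_transformer filter above: the embedded Ruby parses the event's TimeCreated string and splits it into the seconds/nanos pair the google_cloud output expects. The record value below is a made-up example, not taken from the customer's logs:

```ruby
require 'time'

# Hypothetical event record; the real one comes from windows_eventlog2.
record = { 'TimeCreated' => '2021-02-08 01:41:47 -0800' }

# Same expression as in the filter's timestamp field.
t = Time.parse(record['TimeCreated'])
timestamp = { 'seconds' => t.tv_sec, 'nanos' => t.tv_nsec }
# timestamp['nanos'] is 0 here because the source string has no sub-second part.
```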
Can you help us understand how rate_limit controls the pace of ingesting event logs? Under a heavy load of event log generation, how would rate_limit help? What side effects might it have on memory or CPU usage?
> What side effects might it have on memory or CPU usage?

rate_limit can pause event consumption, which causes CPU and memory usage to decline. With the rate_limit parameter, EvtNext is not called once the rate limit is exceeded. On the user's side, it simply looks as if no events are being consumed.
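A minimal sketch of that pause-on-limit idea, assuming a one-second window (all names here are invented for illustration; the real logic lives in the winevt_c C extension, and this is not its actual implementation):

```ruby
# Illustrative rate limiter: once more than `rate_limit` events have been
# consumed in the current one-second window, stop fetching (i.e. skip the
# EvtNext-style call) until the window rolls over. -1 means unlimited.
class RateLimitedReader
  def initialize(rate_limit)
    @rate_limit = rate_limit  # e.g. 200 events per second
    @window = Time.now.to_i   # current one-second window
    @count = 0                # events consumed in this window
  end

  # Returns true when it is OK to fetch another event right now.
  def consume?
    now = Time.now.to_i
    if now != @window         # new window: reset the counter
      @window = now
      @count = 0
    end
    return true if @rate_limit == -1
    if @count < @rate_limit
      @count += 1
      true
    else
      false                   # limit hit: pause consumption this second
    end
  end
end
```

Under this scheme the plugin never buffers the excess events itself; they simply stay in the Windows EventLog channel until a later window (or until the channel rotates them away).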
> Also, since the customer uses rate_limit 200, the > 2000 entries per second will probably cause a lot of backlog in the file buffers.

This shouldn't happen. I guess the Windows EventLog entries that are not consumed by the Fluentd Windows EventLog plugin are simply discarded: Windows EventLog channels are always capped and rotated, so unread events get overwritten.
One of our clients encountered this error. (Sorry that I can't reproduce it on my side.) The client has increased the memory from 8 GB to 16 GB and now 32 GB, but still gets errorCode: 14, which is the Out of Memory error code on Windows. It also affects only the Security channel; all other channels' logs flow in without trouble.

I looked through the code, and below is the call chain that I think happens during the crash (only the Security channel crashed). I wonder how a single event log entry could consume that much memory, considering the client has increased it from 8 to 16 to 32 GB and still sees the error. There was also always a memory spike when the crash happened.
(Note: ordering is in invoking sequence)
1. Starting from subscribe.each
2. rb_winevt_subscribe_each is invoked by subscribe.each
3. rb_winevt_subscribe_each_yield is invoked
4. rb_winevt_subscribe_message is invoked
5. get_description is invoked
6. get_message is invoked inside get_description
7. EvtFormatMessage is invoked but returns ERROR_OUTOFMEMORY (ErrorCode: 14)
8. Finally an error is raised here
Error Message:
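A purely illustrative Ruby simulation of how a failure at the bottom of that chain surfaces (all names below are invented; the real path runs through the winevt_c C extension, and the real EvtFormatMessage is a Windows API call):

```ruby
# Windows error code reported in this issue.
ERROR_OUTOFMEMORY = 14

class WinevtError < StandardError; end

# Stand-in for the EvtFormatMessage call: pretend it fails with code 14.
def evt_format_message(_event)
  ERROR_OUTOFMEMORY
end

# Stand-in for get_description: a non-zero code becomes a Ruby exception,
# which then bubbles up through the subscribe loop and, in Fluentd, ends
# with "Unexpected error raised. Stopping the timer."
def get_description(event)
  code = evt_format_message(event)
  raise WinevtError, "ErrorCode: #{code}" if code != 0
end

begin
  get_description(:security_event)
rescue WinevtError => e
  puts e.message  # => ErrorCode: 14
end
```

This only shows the propagation shape; it says nothing about why EvtFormatMessage itself runs out of memory for particular Security-channel events.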