fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)
https://www.fluentd.org
Apache License 2.0

td-agent becomes deadlocked or hung #1588

Closed ghost closed 7 years ago

ghost commented 7 years ago

Check CONTRIBUTING guideline first and here is the list to help us investigate the problem.

Treasure Data (http://www.treasure-data.com/) provides cloud based data analytics platform, which easily stores and processes data from td-agent. FREE plan is also provided.

@see http://docs.fluentd.org/articles/http-to-td


This section matches events whose tag is td.DATABASE.TABLE

<match td..>
type tdlog
apikey
auto_create_table
buffer_type file
buffer_path /var/log/td-agent/buffer/td

match tag=debug.** and dump to console

<match debug.**>
type stdout

Source descriptions:

built-in TCP input

@see http://docs.fluentd.org/articles/in_forward

type forward port 24224 bind 0.0.0.0

built-in UNIX socket input

type unix

live debugging agent

type debug_agent bind 127.0.0.1 port 24230

elasticsearch_match.conf

<match *.>
type copy

type elasticsearch
hosts 10.1.197.31,10.1.197.14,10.1.197.114,10.1.197.224
port 9200
user elastic
password changeme
scheme https
logstash_dateformat %Y.%m.%d.%H
disable_retry_limit true
max_retry_wait 300s
ca_file /etc/td-agent/ca.crt
ssl_verify false
tag_key @log_name
logstash_format true
reload_connections true
reload_on_failure true
flush_interval 10s
num_threads 2
buffer_type memory
buffer_chunk_limit 256m
buffer_queue_limit 128
retry_wait 15s
reconnect_on_error true
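As a rough check on the memory buffer settings in this output, the worst-case buffer size can be worked out from buffer_chunk_limit and buffer_queue_limit. The sketch below assumes the `m` suffix means mebibytes and that every queue slot can hold one full chunk; `parse_size` and `max_buffer_bytes` are hypothetical helper names for illustration, not part of Fluentd.

```python
import re

# Size suffixes read as binary multiples; this interpretation is an assumption.
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text):
    """Convert a size string such as '256m' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmgt]?)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, suffix = match.groups()
    return int(number) * UNITS.get(suffix, 1)


def max_buffer_bytes(chunk_limit, queue_limit):
    """Upper bound on buffered data: one full chunk per queue slot."""
    return parse_size(chunk_limit) * queue_limit


# Values taken from the elasticsearch output in this report.
total = max_buffer_bytes("256m", 128)
print(f"{total / 1024**3:.0f} GiB")  # prints "32 GiB"
```

With buffer_type memory, that bound is held in the td-agent process itself, so it may be worth comparing against the RAM available on the host. Whether this relates to the hang is not established here.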

Example source file (we have about 137 source files):

type tail
tag tail.httpd_access_log
path /volr//httpd/access.log
exclude_path /volr/ptl11navres10/httpd/access.log
format /\s(?[^\s]+)\s(?[^\s]+)\s(?[^\s]+)\s[(?

- Your problem explanation. If you have error logs, include them.
When I start td-agent, it starts up fine and begins sending logs to Elasticsearch. After a few minutes, however, the log stops producing output. The process appears to go to sleep, nothing further happens, and td-agent becomes unresponsive. A few times I have also seen it restart on its own. Unfortunately, I don't see any errors. I turned up the log level and trace output appeared, but after a restart, or once the problem occurs, essentially nothing is logged.

strace output

ps -ef | grep td-agent
td-agent 22733 1 0 15:06 ? 00:00:00 /opt/td-agent/embedded/bin/ruby /usr/sbin/td-agent --log /var/log/td-agent/td-agent.log --use-v1-config --group td-agent --daemon /var/run/td-agent/td-agent.pid
root 27989 27959 0 15:33 pts/2 00:00:00 tail -f /var/log/td-agent/td-agent.log
td-agent 30746 22733 52 15:42 ? 00:02:32 /opt/td-agent/embedded/bin/ruby /usr/sbin/td-agent --log /var/log/td-agent/td-agent.log --use-v1-config --group td-agent --daemon /var/run/td-agent/td-agent.pid
root 31581 26937 0 15:46 pts/1 00:00:00 grep td-agent

strace -f -p 30746 Process 30746 attached with 40 threads [pid 30786] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49393, NULL <unfinished ...> [pid 30785] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49418, NULL <unfinished ...> [pid 30784] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30783] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49415, NULL <unfinished ...> [pid 30782] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49417, NULL <unfinished ...> [pid 30781] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49416, NULL <unfinished ...> [pid 30780] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49412, NULL <unfinished ...> [pid 30779] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49414, NULL <unfinished ...> [pid 30778] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49413, NULL <unfinished ...> [pid 30777] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49395, NULL <unfinished ...> [pid 30776] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49397, NULL <unfinished ...> [pid 30775] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49396, NULL <unfinished ...> [pid 30774] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49390, NULL <unfinished ...> [pid 30773] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49388, NULL <unfinished ...> [pid 30772] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49389, NULL <unfinished ...> [pid 30771] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49391, NULL <unfinished ...> [pid 30770] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49384, NULL <unfinished ...> [pid 30769] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49405, NULL <unfinished ...> [pid 30768] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49410, NULL <unfinished ...> [pid 30767] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49411, NULL <unfinished ...> [pid 30766] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49383, NULL <unfinished ...> [pid 30765] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49387, NULL <unfinished ...> [pid 30764] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49409, NULL <unfinished ...> [pid 30763] 
futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49408, NULL <unfinished ...> [pid 30762] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49407, NULL <unfinished ...> [pid 30761] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49406, NULL <unfinished ...> [pid 30760] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49394, NULL <unfinished ...> [pid 30759] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49401, NULL <unfinished ...> [pid 30758] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49399, NULL <unfinished ...> [pid 30757] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49400, NULL <unfinished ...> [pid 30756] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49403, NULL <unfinished ...> [pid 30755] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49402, NULL <unfinished ...> [pid 30754] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49404, NULL <unfinished ...> [pid 30753] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49392, NULL <unfinished ...> [pid 30752] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49398, NULL <unfinished ...> [pid 30751] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49385, NULL <unfinished ...> [pid 30750] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49386, NULL <unfinished ...> [pid 30749] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49382, NULL <unfinished ...> [pid 30748] restart_syscall(<... resuming interrupted call ...> <unfinished ...> [pid 30746] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49380, NULL <unfinished ...> [pid 30786] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30785] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30783] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30782] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30781] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30780] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30779] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30778] <... 
futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30777] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30776] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30775] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30774] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30773] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30772] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30771] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30769] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30768] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30766] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30764] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30763] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30761] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30760] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30759] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30758] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30757] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30756] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30762] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30765] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30767] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30755] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30754] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30753] <... 
futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30752] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30751] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30750] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30749] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30746] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30786] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30785] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30783] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30782] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30781] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30780] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30779] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30778] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30777] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30776] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30775] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30774] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30773] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30772] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30771] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30770] <... 
futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 30769] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30768] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30767] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30766] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30765] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30764] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30763] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30762] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30761] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30760] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30759] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30758] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30757] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30756] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30755] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30754] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30753] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30752] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30751] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30750] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30749] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30746] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30770] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49419, NULL <unfinished ...> [pid 30748] <... 
restart_syscall resumed> ) = 0 [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout) [pid 30748] poll([{fd=6, events=POLLIN}], 1, 
100^CProcess 30746 detached Process 30748 detached <detached ...> Process 30749 detached Process 30750 detached Process 30751 detached Process 30752 detached Process 30753 detached Process 30754 detached Process 30755 detached Process 30756 detached Process 30757 detached Process 30758 detached Process 30759 detached Process 30760 detached Process 30761 detached Process 30762 detached Process 30763 detached Process 30764 detached Process 30765 detached Process 30766 detached Process 30767 detached Process 30768 detached Process 30769 detached Process 30770 detached Process 30771 detached Process 30772 detached Process 30773 detached Process 30774 detached Process 30775 detached Process 30776 detached Process 30777 detached Process 30778 detached Process 30779 detached Process 30780 detached Process 30781 detached Process 30782 detached Process 30783 detached Process 30784 detached Process 30785 detached Process 30786 detached
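The trace above is easier to read once it is summarized: nearly every thread is waiting on the same futex address while one thread polls a pipe with a 100 ms timeout. The following is a minimal sketch for tallying that from a saved trace; the function name `summarize` and the short sample are illustrative assumptions, and the regular expressions only cover the line shapes that appear in this report.

```python
import re
from collections import Counter, defaultdict

# Start of a syscall as strace -f prints it: "[pid N] name(".
# "<... futex resumed>" continuation records do not match, so calls are counted once.
CALL_START = re.compile(r"\[pid\s+(\d+)\]\s+(\w+)\(")
# A thread entering a private futex wait, with the futex address captured.
FUTEX_WAIT = re.compile(r"\[pid\s+(\d+)\]\s+futex\((0x[0-9a-f]+),\s*FUTEX_WAIT_PRIVATE")


def summarize(trace_text):
    """Return (syscall counts by name, futex address -> set of waiting thread ids)."""
    calls = Counter(name for _pid, name in CALL_START.findall(trace_text))
    waiters = defaultdict(set)
    for pid, addr in FUTEX_WAIT.findall(trace_text):
        waiters[addr].add(pid)
    return calls, waiters


# A few records copied from the trace above, used here only as sample input.
sample = (
    "[pid 30786] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49393, NULL <unfinished ...>\n"
    "[pid 30785] futex(0x7fda21849044, FUTEX_WAIT_PRIVATE, 49418, NULL <unfinished ...>\n"
    "[pid 30786] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)\n"
    "[pid 30748] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout)\n"
)

calls, waiters = summarize(sample)
for addr, pids in waiters.items():
    print(addr, "waited on by", len(pids), "threads")
print(dict(calls))
```

Because the patterns use `\s+` between tokens, the same function should also work on text where records have been wrapped across lines. Many threads waiting on one address is consistent with contention on a single lock, but the trace alone does not show which lock it is.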

lsof output for the process

ruby 30746 td-agent cwd DIR 8,2 4096 138301 /home/wpaul ruby 30746 td-agent rtd DIR 8,2 4096 2 / ruby 30746 td-agent txt REG 8,2 11351 132085 /opt/td-agent/embedded/bin/ruby ruby 30746 td-agent mem REG 8,2 43928 268290 /lib64/libcrypt-2.12.so ruby 30746 td-agent mem REG 8,2 12776 268289 /lib64/libfreebl3.so ruby 30746 td-agent mem REG 8,2 161704 262153 /lib64/ld-2.12.so ruby 30746 td-agent mem REG 8,2 1930416 262160 /lib64/libc-2.12.so ruby 30746 td-agent mem REG 8,2 23088 262150 /lib64/libdl-2.12.so ruby 30746 td-agent mem REG 8,2 146592 262236 /lib64/libpthread-2.12.so ruby 30746 td-agent mem REG 8,2 47760 262271 /lib64/librt-2.12.so ruby 30746 td-agent mem REG 8,2 600048 262312 /lib64/libm-2.12.so ruby 30746 td-agent mem REG 8,2 706107 264832 /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/oj-2.18.1/lib/oj/oj.so ruby 30746 td-agent mem REG 8,2 297519 135296 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/bigdecimal.so ruby 30746 td-agent mem REG 8,2 199174 135399 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/zlib.so ruby 30746 td-agent mem REG 8,2 78737 141282 /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/http_parser.rb-0.6.0/lib/ruby_http_parser.so ruby 30746 td-agent mem REG 8,2 14029 135305 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/digest/sha1.so ruby 30746 td-agent mem REG 8,2 12853 135303 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/digest/md5.so ruby 30746 td-agent mem REG 8,2 66432 262366 /lib64/libnss_files-2.12.so ruby 30746 td-agent mem REG 8,2 40070 135301 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/digest.so ruby 30746 td-agent mem REG 8,2 93667 134331 /opt/td-agent/embedded/lib/libz.so.1.2.8 ruby 30746 td-agent mem REG 8,2 2516011 134207 /opt/td-agent/embedded/lib/libcrypto.so.1.0.0 ruby 30746 td-agent mem REG 8,2 496894 134303 /opt/td-agent/embedded/lib/libssl.so.1.0.0 ruby 30746 td-agent mem REG 8,2 1211749 135384 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/openssl.so ruby 30746 td-agent mem REG 8,2 235710 
139077 /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/cool.io-1.4.6/lib/cool.io_ext.so ruby 30746 td-agent mem REG 8,2 39628 139079 /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/cool.io-1.4.6/lib/iobuffer_ext.so ruby 30746 td-agent mem REG 8,2 367695 264269 /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/msgpack-1.0.3/lib/msgpack/msgpack.so ruby 30746 td-agent mem REG 8,2 518997 135394 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/socket.so ruby 30746 td-agent mem REG 8,2 880358 135299 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/date_core.so ruby 30746 td-agent mem REG 8,2 66184 135396 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/strscan.so ruby 30746 td-agent mem REG 8,2 150883 398686 /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/yajl-ruby-1.3.0/lib/yajl/yajl.so ruby 30746 td-agent mem REG 8,2 160267 135377 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/json/ext/generator.so ruby 30746 td-agent mem REG 8,2 15332 135362 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/enc/utf_32le.so ruby 30746 td-agent mem REG 8,2 15308 135361 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/enc/utf_32be.so ruby 30746 td-agent mem REG 8,2 16788 135360 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/enc/utf_16le.so ruby 30746 td-agent mem REG 8,2 16660 135359 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/enc/utf_16be.so ruby 30746 td-agent mem REG 8,2 71053 135378 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/json/ext/parser.so ruby 30746 td-agent mem REG 8,2 95801 135395 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/stringio.so ruby 30746 td-agent mem REG 8,2 10369 135368 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/fcntl.so ruby 30746 td-agent mem REG 8,2 33832 135367 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/etc.so ruby 30746 td-agent mem REG 8,2 56640 135398 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/thread.so ruby 30746 td-agent mem REG 8,2 13986 135356 
/opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/enc/trans/transdb.so ruby 30746 td-agent mem REG 8,2 12576 135314 /opt/td-agent/embedded/lib/ruby/2.1.0/x86_64-linux/enc/encdb.so ruby 30746 td-agent mem REG 8,2 99174448 672927 /usr/lib/locale/locale-archive ruby 30746 td-agent mem REG 8,2 9458032 134300 /opt/td-agent/embedded/lib/libruby.so.2.1.0 ruby 30746 td-agent mem REG 8,2 266219 134251 /opt/td-agent/embedded/lib/libjemalloc.so.2 ruby 30746 td-agent 0r CHR 1,3 0t0 3857 /dev/null ruby 30746 td-agent 1w CHR 1,3 0t0 3857 /dev/null ruby 30746 td-agent 2w CHR 1,3 0t0 3857 /dev/null ruby 30746 td-agent 3r REG 0,3 0 19040510 /proc/sys/vm/overcommit_memory ruby 30746 td-agent 4r REG 0,3 0 19040510 /proc/sys/vm/overcommit_memory ruby 30746 td-agent 5r REG 0,3 0 19040510 /proc/sys/vm/overcommit_memory ruby 30746 td-agent 6r FIFO 0,8 0t0 21655354 pipe ruby 30746 td-agent 7w FIFO 0,8 0t0 21655354 pipe ruby 30746 td-agent 8r FIFO 0,8 0t0 21655355 pipe ruby 30746 td-agent 9w FIFO 0,8 0t0 21655355 pipe ruby 30746 td-agent 10w REG 8,2 12588864 922172 /var/log/td-agent/td-agent.log ruby 30746 td-agent 11u REG 8,2 142 922212 /var/log/td-agent/af_batch_server_log.pos ruby 30746 td-agent 12u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 13r DIR 0,10 0 1 inotify ruby 30746 td-agent 14r REG 8,16 3062691 10092880 /volr/removed_for_security/jboss/server.log ruby 30746 td-agent 15r REG 8,16 3062761 10354701 /volr/removed_for_security/jboss/server.log ruby 30746 td-agent 16u REG 8,2 570 929034 /var/log/td-agent/afpweb_test_bestbuy.com_access_log.pos ruby 30746 td-agent 17u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 18r DIR 0,10 0 1 inotify ruby 30746 td-agent 19r REG 8,16 1868990 3014679 /volr/removed_for_security/httpd/mint.test.bestbuy.com-access.log ruby 30746 td-agent 20r REG 8,16 1869776 3014678 /volr/removed_for_security/httpd/preview.test.bestbuy.com-access.log ruby 30746 td-agent 21r REG 8,16 5894322 3014680 
/volr/removed_for_security/httpd/pcms.test.bestbuy.com-access.log ruby 30746 td-agent 22r REG 8,16 1868043 1835030 /volr/removed_for_security/httpd/mint.test.bestbuy.com-access.log ruby 30746 td-agent 23r REG 8,16 1868526 1835029 /volr/removed_for_security/httpd/preview.test.bestbuy.com-access.log ruby 30746 td-agent 24r REG 8,16 6137113 1835031 /volr/removed_for_security/httpd/pcms.test.bestbuy.com-access.log ruby 30746 td-agent 25u REG 8,2 564 929035 /var/log/td-agent/afpweb_test_bestbuy.com_error_log.pos ruby 30746 td-agent 26u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 27r DIR 0,10 0 1 inotify ruby 30746 td-agent 28r REG 8,16 494 3014664 /volr/removed_for_security/httpd/preview.test.bestbuy.com-error.log ruby 30746 td-agent 29r REG 8,16 190 3014666 /volr/removed_for_security/httpd/mint.test.bestbuy.com-error.log ruby 30746 td-agent 30r REG 8,16 810 3014667 /volr/removed_for_security/httpd/pcms.test.bestbuy.com-error.log ruby 30746 td-agent 31r REG 8,16 190 1835010 /volr/removed_for_security/httpd/preview.test.bestbuy.com-error.log ruby 30746 td-agent 32r REG 8,16 190 1835012 /volr/removed_for_security/httpd/mint.test.bestbuy.com-error.log ruby 30746 td-agent 33r REG 8,16 1261 1835013 /volr/removed_for_security/httpd/pcms.test.bestbuy.com-error.log ruby 30746 td-agent 34u REG 8,2 192 922233 /var/log/td-agent/ap_webapp_application.pos ruby 30746 td-agent 35u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 36r DIR 0,10 0 1 inotify ruby 30746 td-agent 37r REG 8,16 73594 9175068 /volr/removed_for_security/bbscant-pub01/ap-webapp-application.log ruby 30746 td-agent 38r REG 8,16 74391 2490386 /volr/removed_for_security/bbscant-pub01/ap-webapp-application.log ruby 30746 td-agent 39u REG 8,2 0 922298 /var/log/td-agent/apid_application.pos ruby 30746 td-agent 40u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 41u REG 8,2 176 922303 /var/log/td-agent/sts_application.pos ruby 30746 td-agent 42u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 43r DIR 0,10 0 1 inotify ruby 
30746 td-agent 44r REG 8,16 103723 13238284 /volr/removed_for_security/bbstst-app01/sts_application.log ruby 30746 td-agent 45r REG 8,16 86664 11141125 /volr/removed_for_security/bbstst-app01/sts_application.log ruby 30746 td-agent 46u REG 8,2 72 922586 /var/log/td-agent/atgpreview_server_log.pos ruby 30746 td-agent 47u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 48r DIR 0,10 0 1 inotify ruby 30746 td-agent 49r REG 8,16 2485122 12058652 /volr/removed_for_security/jboss/server.log ruby 30746 td-agent 50u REG 8,2 182 922750 /var/log/td-agent/basket_application.pos ruby 30746 td-agent 51u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 52r DIR 0,10 0 1 inotify ruby 30746 td-agent 53r REG 8,16 1294909 1966094 /volr/removed_for_security/bbbskt-app01/basket-application.log ruby 30746 td-agent 54r REG 8,16 1438508 2097172 /volr/removed_for_security/bbbskt-app01/basket-application.log ruby 30746 td-agent 55u REG 8,2 178 922875 /var/log/td-agent/bbact_dep01_beagle_event_log.pos ruby 30746 td-agent 56u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 57r DIR 0,10 0 1 inotify ruby 30746 td-agent 58r REG 8,16 5605233 14286870 /volr/removed_for_security/bbactt-dep01/beagle-event-log.log ruby 30746 td-agent 59r REG 8,16 5603891 12582939 /volr/removed_for_security/bbactt-dep01/beagle-event-log.log ruby 30746 td-agent 60u REG 8,2 178 922876 /var/log/td-agent/bbactt-app01_beagle_event.pos ruby 30746 td-agent 61u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 62r DIR 0,10 0 1 inotify ruby 30746 td-agent 63r REG 8,16 562370 1179657 /volr/removed_for_security/bbactt-app01/beagle-event-log.log ruby 30746 td-agent 64u REG 8,2 188 922890 /var/log/td-agent/bbactt-app01_beagle_internal_debug.pos ruby 30746 td-agent 65u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 66r DIR 0,10 0 1 inotify ruby 30746 td-agent 67r REG 8,16 791406 1179658 /volr/removed_for_security/bbactt-app01/beagle-internal-debug.log ruby 30746 td-agent 68u REG 8,2 178 922891 
/var/log/td-agent/bbapdt-app01_apid_application.pos ruby 30746 td-agent 69u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 70r DIR 0,10 0 1 inotify ruby 30746 td-agent 71r REG 8,16 67032 5505029 /volr/removed_for_security/bbapdt-app01/apid_application.log ruby 30746 td-agent 72u REG 8,2 150 922892 /var/log/td-agent/bbatlt_app01_gc_log.pos ruby 30746 td-agent 73u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 74r DIR 0,10 0 1 inotify ruby 30746 td-agent 75r REG 8,16 498 11927581 /volr/removed_for_security/bbatlt-app01/gc.log ruby 30746 td-agent 76u REG 8,2 174 922896 /var/log/td-agent/bbcgrt-app01_customer_graph.pos ruby 30746 td-agent 77u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 78r DIR 0,10 0 1 inotify ruby 30746 td-agent 79r REG 8,16 12881059 4063263 /volr/removed_for_security/bbcgrt-app01/customer-graph.log ruby 30746 td-agent 80r REG 8,16 12878433 14417947 /volr/removed_for_security/bbcgrt-app01/customer-graph.log ruby 30746 td-agent 81u REG 8,2 170 922903 /var/log/td-agent/bbcgrt-bch01_cgraph_batch.pos ruby 30746 td-agent 82u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 83r DIR 0,10 0 1 inotify ruby 30746 td-agent 84r REG 8,16 11230542 11796499 /volr/removed_for_security/bbcgrt-bch01/cgraph-batch.log ruby 30746 td-agent 85r REG 8,16 4987011 11272203 /volr/removed_for_security/bbcgrt-bch01/cgraph-batch.log ruby 30746 td-agent 86u REG 8,2 152 922906 /var/log/td-agent/bbcgwyt-app01_gc.pos ruby 30746 td-agent 87u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 88r DIR 0,10 0 1 inotify ruby 30746 td-agent 89r REG 8,16 1000 393227 /volr/removed_for_security/bbcgwyt-app01/gc.log ruby 30746 td-agent 90r REG 8,16 1004 15597573 /volr/removed_for_security/bbcgwyt-app01/gc.log ruby 30746 td-agent 91u REG 8,2 550 922922 /var/log/td-agent/+.pos ruby 30746 td-agent 92u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 93u REG 8,2 550 922922 /var/log/td-agent/+.pos ruby 30746 td-agent 94u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 95r DIR 0,10 0 1 inotify ruby 30746 
td-agent 96r REG 8,16 408747 4063265 /volr/removed_for_security/bbcrlt-app01/coral-request.log ruby 30746 td-agent 97r REG 8,16 1373952 524328 /volr/removed_for_security/bbcrlt-app01/coral-request.log ruby 30746 td-agent 98u REG 8,2 182 922933 /var/log/td-agent/button_application.pos ruby 30746 td-agent 99u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 100r DIR 0,10 0 1 inotify ruby 30746 td-agent 101r REG 8,16 2932533 15597586 /volr/removed_for_security/bbbtnt-app01/button-application.log ruby 30746 td-agent 102r REG 8,16 82754846 5767195 /volr/removed_for_security/bbbtnt-app01/button-application.log ruby 30746 td-agent 103u REG 8,2 92 922940 /var/log/td-agent/calligator_ipt1_int.pos ruby 30746 td-agent 104u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 105r DIR 0,10 0 1 inotify ruby 30746 td-agent 106u REG 8,2 194 923048 /var/log/td-agent/cargo_agg_shipping_agg_application_log.pos ruby 30746 td-agent 107u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 108r DIR 0,10 0 1 inotify ruby 30746 td-agent 109r REG 8,16 5979816616 15073283 /volr/removed_for_security/bbsdwt-agg01/shipping-agg-application.log ruby 30746 td-agent 110r REG 8,16 5979802592 1310725 /volr/removed_for_security/bbsdwt-agg01/shipping-agg-application.log ruby 30746 td-agent 111u REG 8,2 198 923174 /var/log/td-agent/cargo_shipping_application.pos ruby 30746 td-agent 112u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 113r DIR 0,10 0 1 inotify ruby 30746 td-agent 114r REG 8,16 2965283345 14680079 /volr/removed_for_security/bbsdwt-app01/cargo-shipping-application.log ruby 30746 td-agent 115r REG 8,16 2955292887 14417934 /volr/removed_for_security/bbsdwt-app01/cargo-shipping-application.log ruby 30746 td-agent 116u REG 8,2 164 923197 /var/log/td-agent/cart_cst_log.pos ruby 30746 td-agent 117u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 118r DIR 0,10 0 1 inotify ruby 30746 td-agent 119r REG 8,16 3501550 11534349 /volr/removed_for_security/bbcartt-app01/cst_cart.log ruby 30746 td-agent 120r REG 8,16 
3678045 3145738 /volr/removed_for_security/bbcartt-app01/cst_cart.log ruby 30746 td-agent 121u REG 8,2 182 923198 /var/log/td-agent/catalog_aggregator.pos ruby 30746 td-agent 122u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 123r DIR 0,10 0 1 inotify ruby 30746 td-agent 124r REG 8,16 2745067 6029329 /volr/removed_for_security/bbcagt-app01/catalog-aggregator.log ruby 30746 td-agent 125r REG 8,16 2729487 7995430 /volr/removed_for_security/bbcagt-app01/catalog-aggregator.log ruby 30746 td-agent 126u REG 8,2 142 923441 /var/log/td-agent/cf_batch_server_log.pos ruby 30746 td-agent 127u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 128r DIR 0,10 0 1 inotify ruby 30746 td-agent 129r REG 8,16 17699593 4849681 /volr/removed_for_security/jboss/server.log ruby 30746 td-agent 130r REG 8,16 17194053 16252950 /volr/removed_for_security/jboss/server.log ruby 30746 td-agent 131u REG 8,2 386 929036 /var/log/td-agent/cfpweb_test_bestbuy.com_access.pos ruby 30746 td-agent 132u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 133r DIR 0,10 0 1 inotify ruby 30746 td-agent 134r REG 8,16 2362437 131096 /volr/removed_for_security/httpd/olsapp.test.bestbuy.com-access.log ruby 30746 td-agent 135r REG 8,16 2232729 131097 /volr/removed_for_security/httpd/reprice.test.bestbuy.com-access.log ruby 30746 td-agent 136r REG 8,16 2362244 6160397 /volr/removed_for_security/httpd/olsapp.test.bestbuy.com-access.log ruby 30746 td-agent 137r REG 8,16 2231483 6160398 /volr/removed_for_security/httpd/reprice.test.bestbuy.com-access.log ruby 30746 td-agent 138u REG 8,2 382 929037 /var/log/td-agent/cfpweb_test_bestbuy.com_error.pos ruby 30746 td-agent 139u REG 0,9 0 3853 [eventpoll] ruby 30746 td-agent 140r DIR 0,10 0 1 inotify ruby 30746 td-agent 141r REG 8,16 190 131082 /volr/removed_for_security/httpd/reprice.test.bestbuy.com-error.log ruby 30746 td-agent 142r REG 8,16 190 131076 /volr/removed_for_security/httpd/olsapp.test.bestbuy.com-error.log ruby 30746 td-agent 143r REG 8,16 190 6160391 
/volr/removed_for_security/httpd/reprice.test.bestbuy.com-error.log
ruby 30746 td-agent 144r REG 8,16 190 6160390 /volr/removed_for_security/httpd/olsapp.test.bestbuy.com-error.log
ruby 30746 td-agent 145u REG 8,2 219 923465 /var/log/td-agent/cgraph_source.pos
ruby 30746 td-agent 146u REG 0,9 0 3853 [eventpoll]
ruby 30746 td-agent 147r DIR 0,10 0 1 inotify
ruby 30746 td-agent 148r REG 8,16 264036 8651918 /volr/removed_for_security/cgraph/system.log
ruby 30746 td-agent 149r REG 8,16 239476 8257537 /volr/removed_for_security/cgraph/system.log
ruby 30746 td-agent 150r REG 8,16 177096 7864327 /volr/removed_for_security/cgraph/system.log
ruby 30746 td-agent 151u REG 8,2 365 923476 /var/log/td-agent/cgraph_system.pos
ruby 30746 td-agent 152u REG 0,9 0 3853 [eventpoll]
ruby 30746 td-agent 153r DIR 0,10 0 1 inotify
ruby 30746 td-agent 154r REG 8,16 1114404 4980738 /volr/removed_for_security/cgraph/system.log
ruby 30746 td-agent 155r REG 8,16 309535 14942237 /volr/removed_for_security/cgraph/system.log
ruby 30746 td-agent 156r REG 8,16 435708 9569628 /volr/removed_for_security/cgraph/system.log
ruby 30746 td-agent 157r REG 8,16 655186 2097173 /volr/removed_for_security/cgraph/system.log
ruby 30746 td-agent 158r REG 8,16 1224150 12713996 /volr/removed_for_security/cgraph/system.log
ruby 30746 td-agent 159u REG 8,2 184 923478 /var/log/td-agent/checkout_application_log.pos
ruby 30746 td-agent 160u REG 0,9 0 3853 [eventpoll]
ruby 30746 td-agent 161r DIR 0,10 0 1 inotify
ruby 30746 td-agent 162r REG 8,16 2101072 6684684 /volr/removed_for_security/bbchkt-app01/cto-agg-application.log
ruby 30746 td-agent 163r REG 8,16 3287423 3932168 /volr/removed_for_security/bbchkt-app01/cto-agg-application.log
ruby 30746 td-agent 164u REG 8,2 550 922922 /var/log/td-agent/+.pos
ruby 30746 td-agent 165u REG 0,9 0 3853 [eventpoll]
ruby 30746 td-agent 166r DIR 0,10 0 1 inotify
ruby 30746 td-agent 167r REG 8,16 5285922 10879012 /volr/removed_for_security/bbciat-app01/cia-webapp-application.log
ruby 30746 td-agent 168r REG 8,16 5069249 14155825 /volr/removed_for_security/bbciat-app01/cia-webapp-application.log
ruby 30746 td-agent 169u REG 8,2 75 923481 /var/log/td-agent/cia_webapp_gc.pos
ruby 30746 td-agent 170u REG 0,9 0 3853 [eventpoll]
ruby 30746 td-agent 171r DIR 0,10 0 1 inotify
ruby 30746 td-agent 172r REG 8,16 1984 14155778 /volr/removed_for_security/bbciat-app01/gc.log
ruby 30746 td-agent 173u REG 8,2 550 922922 /var/log/td-agent/+.pos
ruby 30746 td-agent 174u REG 0,9 0 3853 [eventpoll]
ruby 30746 td-agent 175r DIR 0,10 0 1 inotify
ruby 30746 td-agent 176r REG 8,16 686677 10879013 /volr/removed_for_security/bbciat-app01/cia-webapp-statistics.log
ruby 30746 td-agent 177r REG 8,16 654624 14155826 /volr/removed_for_security/bbciat-app01/cia-webapp-statistics.log
ruby 30746 td-agent 178u REG 8,2 210 923487 /var/log/td-agent/clipper_availability_application.pos
ruby 30746 td-agent 179u REG 0,9 0 3853 [eventpoll]
ruby 30746 td-agent 180r DIR 0,10 0 1 inotify
ruby 30746 td-agent 181r REG 8,16 2575891 5111830 /volr/removed_for_security/bbavlt-app01/clipper-availability-application.log
ruby 30746 td-agent 182r REG 8,16 2065243 5111834 /volr/removed_for_security/bbavlt-app01/clipper-availability-application.log
ruby 30746 td-agent 183u REG 8,2 232 923498 /var/log/td-agent/clipper_availability_calculator_application.pos
ruby 30746 td-agent 184u REG 0,9 0 3853 [eventpoll]
ruby 30746 td-agent 185r DIR 0,10 0 1 inotify
ruby 30746 td-agent 186r REG 8,16 317765017 11141143 /volr/removed_for_security/bbcalt-app01/clipper-availability-calculator-application.log
ruby 30746 td-agent 187r REG 8,16 548338972 13238295 /volr/removed_for_security/bbcalt-app01/clipper-availability-calculator-application.log
ruby 30746 td-agent 188u REG 8,2 232 923502 /var/log/td-agent/clipper_availability_calculator_performance.pos
ruby 30746 td-agent 189u REG 0,9 0 3853 [eventpoll]
ruby 30746 td-agent 190r DIR 0,10 0 1 inotify
ruby 30746 td-agent 191r REG 8,16 3033699375 11141144 /volr/removed_for_security/bbcalt-app01/clipper-availability-calculator-performance.log
ruby 30746 td-agent 192r REG 8,16 7499664212 13238296 /volr/removed_for_security/bbcalt-app01/clipper-availability-calculator-performance.log
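
For anyone reading a listing like this, here is a minimal Python sketch of one way to summarise it: group the open regular files by path and report paths that are open under more than one inode. It assumes the usual lsof column order (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME); the function names are made up for illustration. Because the directory names here were replaced with removed_for_security, identical paths in this listing may really be different directories matched by a wildcard `path`, so a hit is only a hint that a rotated file might still be held open, not proof.

```python
from collections import defaultdict


def inodes_by_path(lsof_lines):
    """Group inode numbers by path for regular-file rows of lsof output."""
    grouped = defaultdict(set)
    for line in lsof_lines:
        fields = line.split()
        # Assumed columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        if len(fields) < 9 or fields[4] != "REG":
            continue
        node, name = fields[7], " ".join(fields[8:])
        # Skip pseudo entries such as [eventpoll] that are not real paths.
        if not name.startswith("/"):
            continue
        grouped[name].add(node)
    return grouped


def multiply_opened(lsof_lines):
    """Return {path: sorted inode strings} for paths open under several inodes."""
    return {
        path: sorted(nodes)
        for path, nodes in inodes_by_path(lsof_lines).items()
        if len(nodes) > 1
    }


# A few rows copied from the listing above.
sample = [
    "ruby 30746 td-agent 172r REG 8,16 1984 14155778 /volr/removed_for_security/bbciat-app01/gc.log",
    "ruby 30746 td-agent 176r REG 8,16 686677 10879013 /volr/removed_for_security/bbciat-app01/cia-webapp-statistics.log",
    "ruby 30746 td-agent 177r REG 8,16 654624 14155826 /volr/removed_for_security/bbciat-app01/cia-webapp-statistics.log",
    "ruby 30746 td-agent 179u REG 0,9 0 3853 [eventpoll]",
]

for path, nodes in multiply_opened(sample).items():
    print(path, nodes)
```

On the unredacted output, the same check could be run against the full listing saved to a file and read with `open(...).read().splitlines()`.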

gems installed

td-agent-gem list

LOCAL GEMS

actionmailer (4.2.8) actionpack (4.2.8) actionview (4.2.8) activejob (4.2.8) activemodel (4.2.8) activerecord (4.2.8) activesupport (4.2.8) addressable (2.5.1) arel (6.0.4) aws-sdk (2.9.9) aws-sdk-core (2.9.9) aws-sdk-resources (2.9.9) aws-sigv4 (1.0.0) bigdecimal (1.2.4) bson (4.1.1) builder (3.2.3) bundler (1.14.5) bzip2-ffi (1.0.0) celluloid (0.15.2) cool.io (1.4.6) diff-lcs (1.3) draper (1.4.0) elasticsearch (5.0.4) elasticsearch-api (5.0.4) elasticsearch-transport (5.0.4) erubis (2.7.0) excon (0.55.0) faraday (0.12.1) ffi (1.9.18) fluent-logger (0.7.1) fluent-mixin-config-placeholders (0.4.0) fluent-mixin-plaintextformatter (0.2.6) fluent-plugin-elasticsearch (1.9.5) fluent-plugin-kafka (0.5.5) fluent-plugin-mongo (0.8.0) fluent-plugin-rewrite-tag-filter (1.5.5) fluent-plugin-s3 (0.8.2) fluent-plugin-scribe (0.10.14) fluent-plugin-td (0.10.29) fluent-plugin-td-monitoring (0.2.2) fluent-plugin-webhdfs (0.4.2) fluentd (0.12.35) fluentd-ui (0.4.4) font-awesome-rails (4.7.0.1) globalid (0.4.0) haml (4.0.7) haml-rails (0.5.3) hike (1.2.3) hirb (0.7.3) http_parser.rb (0.6.0) httpclient (2.8.2.4) i18n (0.8.1) io-console (0.4.3) ipaddress (0.8.3) jbuilder (2.6.3) jmespath (1.3.1) jquery-rails (3.1.4) json (1.8.1) kramdown (1.13.2) kramdown-haml (0.0.3) loofah (2.0.3) ltsv (0.1.0) mail (2.6.4) mime-types (3.1) mime-types-data (3.2016.0521) mini_portile2 (2.1.0) minitest (5.10.1, 4.7.5) mixlib-cli (1.7.0) mixlib-config (2.2.4) mixlib-log (1.7.1) mixlib-shellout (2.2.7) mongo (2.2.7) msgpack (1.0.3) multi_json (1.12.1) multipart-post (2.0.0) nokogiri (1.7.1) ohai (6.20.0) oj (2.18.1) parallel (1.8.0) psych (2.0.5) public_suffix (2.0.5) puma (3.8.2) rack (1.6.5) rack-test (0.6.3) rails (4.2.8) rails-deprecated_sanitizer (1.0.3) rails-dom-testing (1.0.8) rails-html-sanitizer (1.0.3) railties (4.2.8) rake (10.1.0) rdoc (4.1.0) request_store (1.3.2) ruby-kafka (0.3.17) ruby-progressbar (1.8.1) rubyzip (1.2.1, 1.1.7) sass (3.2.19) sass-rails (4.0.5) settingslogic (2.0.9) 
sigdump (0.2.4) sprockets (2.12.4) sprockets-rails (2.3.3) string-scrub (0.0.5) sucker_punch (1.0.5) systemu (2.5.2) td (0.15.2) td-client (0.8.85) td-logger (0.3.26) test-unit (2.1.10.0) thor (0.19.4) thread_safe (0.3.6) thrift (0.8.0) tilt (1.4.1) timers (1.1.0) tzinfo (1.2.3) tzinfo-data (1.2017.2) uuidtools (2.1.5) webhdfs (0.8.0) yajl-ruby (1.3.0) zip-zip (0.3)



The only additional gems I have added are:
'elasticsearch-api-5.0.4', 'multipart-post-2.0.0', 'faraday-0.12.1', 'elasticsearch-transport-5.0.4', 'elasticsearch-5.0.4', 'excon-0.55.0', 'fluent-plugin-elasticsearch-1.9.5'

I have spent close to a week trying to figure this out and just cannot find what is causing the issue. I don't know what the timeout is from, since the PID mentioned in the strace doesn't exist in /proc. A lower PID does exist, and looking at its file descriptors I can see they link to the log files. I'm really stumped; any guidance is appreciated.
repeatedly commented 7 years ago

Could you paste sigdump logs?

http://docs.fluentd.org/v0.12/articles/trouble-shooting#dump-fluentd-internal-information

ghost commented 7 years ago

Ugh, facepalm. I think I found the culprit. Something I noticed was that td-agent does not seem to like log rotation, or at least our log rotation: it ends up following deleted or renamed files. To deal with that I would restart td-agent, which usually wouldn't stop cleanly because a thread was hung on something. Once that offending thread was killed, td-agent would finally shut down. However, I believe the pos files are what caused my issues. I deleted them all today and, bam, it is monitoring all the logs again. I will continue to troubleshoot why it hangs onto the deleted files, and its stability. Once I have more, I'll post it here.
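
One way to check the pos file theory without deleting the files, as a minimal Python sketch: compare the inode recorded in each pos file entry with the inode the path has on disk right now. This assumes each pos file line is tab-separated as path, hex offset, hex inode, which is my understanding of the in_tail format and worth confirming against your own files. The function name is made up for illustration.

```python
import os


def stale_pos_entries(pos_path):
    """Return (path, recorded_inode, current_inode) for entries that differ.

    current_inode is None when the recorded path no longer exists.
    """
    stale = []
    with open(pos_path) as fh:
        for line in fh:
            parts = line.rstrip("\n").split("\t")
            # Assumed layout: path <TAB> hex offset <TAB> hex inode
            if len(parts) != 3:
                continue
            path, _offset_hex, inode_hex = parts
            try:
                recorded = int(inode_hex, 16)
            except ValueError:
                continue  # not a line in the assumed format
            try:
                current = os.stat(path).st_ino
            except FileNotFoundError:
                current = None
            if current != recorded:
                stale.append((path, recorded, current))
    return stale


# Example call, using a pos file name that appears in the lsof output:
#   for entry in stale_pos_entries("/var/log/td-agent/cgraph_system.pos"):
#       print(entry)
```

An entry whose inode no longer matches is only a hint that the position belongs to a rotated or deleted file. It does not by itself explain the hang.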

ghost commented 7 years ago

Well, I spoke too soon. It still hangs and ends up hitting the timeouts again in the strace. Running sigdump on the main process produces the following:

Sigdump at 2017-06-01 08:39:47 -0500 process 14043 (/usr/sbin/td-agent)
  Thread #<Thread:0x007fc56d81a7b0> status=run priority=0
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:52:in `backtrace'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:52:in `dump_backtrace'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:34:in `block in dump_all_thread_backtrace'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:33:in `each'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:33:in `dump_all_thread_backtrace'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:16:in `block in dump'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:136:in `open'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:136:in `_open_dump_path'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:14:in `dump'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:7:in `block in setup'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/supervisor.rb:350:in `call'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/supervisor.rb:350:in `waitpid'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/supervisor.rb:350:in `supervise'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/supervisor.rb:156:in `start'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/command/fluentd.rb:173:in `<top (required)>'
      /opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:54:in `require'
      /opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:54:in `require'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/bin/fluentd:5:in `<top (required)>'
      /opt/td-agent/embedded/bin/fluentd:23:in `load'
      /opt/td-agent/embedded/bin/fluentd:23:in `<top (required)>'
      /usr/sbin/td-agent:7:in `load'
      /usr/sbin/td-agent:7:in `<main>'
  GC stat:
      count: 21
      heap_used: 261
      heap_length: 261
      heap_increment: 0
      heap_live_slot: 106022
      heap_free_slot: 362
      heap_final_slot: 0
      heap_swept_slot: 9681
      heap_eden_page_length: 261
      heap_tomb_page_length: 0
      total_allocated_object: 483133
      total_freed_object: 377111
      malloc_increase: 68096
      malloc_limit: 16777216
      minor_gc_count: 16
      major_gc_count: 5
      remembered_shady_object: 704
      remembered_shady_object_limit: 798
      old_object: 45523
      old_object_limit: 69238
      oldmalloc_increase: 6519640
      oldmalloc_limit: 16777216
  Built-in objects:
   106,384: TOTAL
    75,000: T_STRING
    10,388: T_ARRAY
     8,160: T_NODE
     4,987: T_DATA
     2,180: T_OBJECT
     2,178: T_REGEXP
     2,023: T_CLASS
       741: T_HASH
       272: FREE
       138: T_MODULE
       130: T_ICLASS
        77: T_FILE
        59: T_RATIONAL
        36: T_STRUCT
         9: T_FLOAT
         4: T_BIGNUM
         1: T_COMPLEX
         1: T_MATCH
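
A note on reading the dump above: it lists a single thread, and below the sigdump gem's own frames that thread is sitting in `waitpid` in supervisor.rb. That suggests this is the supervisor process waiting on its child, so the input and output threads would only appear in a dump taken from the child. The Python sketch below pulls the first few non-sigdump frames per thread out of a dump so that is easier to see. The regular expression and function name are my own and assume the backtrace line format shown above.

```python
import re

# Matches backtrace lines such as "   /path/file.rb:350:in `waitpid'".
FRAME = re.compile(r"^\s+(\S+?):(\d+):in `([^']+)'")


def app_frames(dump_text, limit=3):
    """Map each thread header to its first `limit` frames outside sigdump."""
    threads = {}
    current = None
    for line in dump_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Thread #<"):
            current = stripped
            threads[current] = []
            continue
        match = FRAME.match(line)
        if current is None or match is None:
            continue
        path, lineno, method = match.groups()
        # Ignore the dumper's own frames and stop collecting after `limit`.
        if "/sigdump-" in path or len(threads[current]) >= limit:
            continue
        filename = path.rsplit("/", 1)[-1]
        threads[current].append("%s:%s in %s" % (filename, lineno, method))
    return threads


# Shortened excerpt of the dump above.
sample = """\
Sigdump at 2017-06-01 08:39:47 -0500 process 14043 (/usr/sbin/td-agent)
  Thread #<Thread:0x007fc56d81a7b0> status=run priority=0
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/sigdump-0.2.4/lib/sigdump.rb:52:in `backtrace'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/supervisor.rb:350:in `call'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/supervisor.rb:350:in `waitpid'
      /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/supervisor.rb:350:in `supervise'
  GC stat:
      count: 21
"""

for thread, frames in app_frames(sample).items():
    print(thread, frames)
```

A dump from the child process, if one can be obtained, could be run through the same function to see where each worker thread is blocked.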

Sending SIGCONT to the child process seems to do nothing. However, re-running strace on the child, it looks like it's trying to restart now?

[pid 16309] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 16307] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 16303] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 16302] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 16300] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 16297] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14193] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14192] ppoll([{fd=900, events=POLLIN}], 1, NULL, NULL, 8 <unfinished ...>
[pid 14191] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14190] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14189] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14188] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14187] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14186] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14185] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14184] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14183] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14182] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14181] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14180] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14179] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14178] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14177] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14176] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14175] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14174] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14173] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14172] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14171] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14170] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14169] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14168] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14167] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14166] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14165] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14164] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14163] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14162] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14161] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14160] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14159] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14158] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14157] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14156] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14155] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14154] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14153] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14152] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14151] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14150] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14149] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14148] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14147] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14146] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14145] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14144] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14143] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14142] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14141] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14140] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14139] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14138] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14137] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14136] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14135] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14134] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14133] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14132] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14131] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14130] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14129] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14128] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14127] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14126] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14125] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14124] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14123] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14122] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14121] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14120] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14119] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14118] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14117] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14116] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14115] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14114] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14113] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14112] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14111] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14110] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14109] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14108] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14107] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14106] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14105] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14104] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14103] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14102] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14101] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14100] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14099] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14098] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14097] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14096] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14095] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14094] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14093] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14092] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14091] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14090] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14089] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14088] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14087] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14086] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14085] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14084] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14083] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14082] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14081] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14080] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14079] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14078] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14077] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14076] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14075] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14074] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14073] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14072] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14071] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14070] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14069] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14068] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14067] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14066] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14065] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14064] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14063] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14062] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14061] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14060] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14059] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14058] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14051] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14050] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14049] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... restart_syscall resumed> ) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] <... tgkill resumed> )      = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] rt_sigreturn( <unfinished ...>
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... rt_sigreturn resumed> ) = -1 EINTR (Interrupted system call)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] rt_sigreturn( <unfinished ...>
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... rt_sigreturn resumed> ) = -1 EINTR (Interrupted system call)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] rt_sigreturn( <unfinished ...>
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... rt_sigreturn resumed> ) = -1 EINTR (Interrupted system call)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn( <unfinished ...>
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] <... rt_sigreturn resumed> ) = -1 EINTR (Interrupted system call)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] <... tgkill resumed> )      = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] rt_sigreturn( <unfinished ...>
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... rt_sigreturn resumed> ) = -1 EINTR (Interrupted system call)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14048] tgkill(14046, 14046, SIGVTALRM <unfinished ...>
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] <... tgkill resumed> )      = 0
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL <unfinished ...>
[pid 14048] <... poll resumed> )        = 0 (Timeout)
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] tgkill(14046, 14046, SIGVTALRM) = 0
[pid 14048] poll([{fd=6, events=POLLIN}], 1, 100 <unfinished ...>
[pid 14046] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 14046] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=14046, si_uid=496} ---
[pid 14046] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid 14046] futex(0x7fc573a49044, FUTEX_WAIT_PRIVATE, 464391, NULL^CProcess 14046 detached
 <detached ...>
Process 14048 detached
Process 14049 detached
Process 14050 detached
Process 14051 detached
Process 14058 detached
Process 14059 detached
Process 14060 detached
Process 14061 detached
Process 14062 detached
Process 14063 detached
Process 14064 detached
Process 14065 detached
Process 14066 detached
Process 14067 detached
Process 14068 detached
Process 14069 detached
Process 14070 detached
Process 14071 detached
Process 14072 detached
Process 14073 detached
Process 14074 detached
Process 14075 detached
Process 14076 detached
Process 14077 detached
Process 14078 detached
Process 14079 detached
Process 14080 detached
Process 14081 detached
Process 14082 detached
Process 14083 detached
Process 14084 detached
Process 14085 detached
Process 14086 detached
Process 14087 detached
Process 14088 detached
Process 14089 detached
Process 14090 detached
Process 14091 detached
Process 14092 detached
Process 14093 detached
Process 14094 detached
Process 14095 detached
Process 14096 detached
Process 14097 detached
Process 14098 detached
Process 14099 detached
Process 14100 detached
Process 14101 detached
Process 14102 detached
Process 14103 detached
Process 14104 detached
Process 14105 detached
Process 14106 detached
Process 14107 detached
Process 14108 detached
Process 14109 detached
Process 14110 detached
Process 14111 detached
Process 14112 detached
Process 14113 detached
Process 14114 detached
Process 14115 detached
Process 14116 detached
Process 14117 detached
Process 14118 detached
Process 14119 detached
Process 14120 detached
Process 14121 detached
Process 14122 detached
Process 14123 detached
Process 14124 detached
Process 14125 detached
Process 14126 detached
Process 14127 detached
Process 14128 detached
Process 14129 detached
Process 14130 detached
Process 14131 detached
Process 14132 detached
Process 14133 detached
Process 14134 detached
Process 14135 detached
Process 14136 detached
Process 14137 detached
Process 14138 detached
Process 14139 detached
Process 14140 detached
Process 14141 detached
Process 14142 detached
Process 14143 detached
Process 14144 detached
Process 14145 detached
Process 14146 detached
Process 14147 detached
Process 14148 detached
Process 14149 detached
Process 14150 detached
Process 14151 detached
Process 14152 detached
Process 14153 detached
Process 14154 detached
Process 14155 detached
Process 14156 detached
Process 14157 detached
Process 14158 detached
Process 14159 detached
Process 14160 detached
Process 14161 detached
Process 14162 detached
Process 14163 detached
Process 14164 detached
Process 14165 detached
Process 14166 detached
Process 14167 detached
Process 14168 detached
Process 14169 detached
Process 14170 detached
Process 14171 detached
Process 14172 detached
Process 14173 detached
Process 14174 detached
Process 14175 detached
Process 14176 detached
Process 14177 detached
Process 14178 detached
Process 14179 detached
Process 14180 detached
Process 14181 detached
Process 14182 detached
Process 14183 detached
Process 14184 detached
Process 14185 detached
Process 14186 detached
Process 14187 detached
Process 14188 detached
Process 14189 detached
Process 14190 detached
Process 14191 detached
Process 14192 detached
Process 14193 detached
Process 16297 detached
Process 16300 detached
Process 16302 detached
Process 16303 detached
Process 16307 detached
Process 16309 detached
ghost commented 7 years ago

Interestingly, following that troubleshooting page, I tried running without the daemon. That actually seems to fix the deadlocks and allows td-agent to restart; however, it restarts a lot, almost every minute.

2017-06-01 09:04:30 -0500 [info]: process finished code=11
2017-06-01 09:05:58 -0500 [info]: process finished code=11
2017-06-01 09:06:26 -0500 [info]: process finished code=11
2017-06-01 09:07:05 -0500 [info]: process finished code=0
2017-06-01 09:08:12 -0500 [info]: process finished code=0
2017-06-01 09:09:20 -0500 [info]: process finished code=0
2017-06-01 09:10:29 -0500 [info]: process finished code=0
2017-06-01 09:11:19 -0500 [info]: process finished code=11
2017-06-01 09:11:38 -0500 [info]: process finished code=0
2017-06-01 09:12:47 -0500 [info]: process finished code=0
2017-06-01 09:13:54 -0500 [info]: process finished code=0
2017-06-01 09:15:01 -0500 [info]: process finished code=0
2017-06-01 09:16:31 -0500 [info]: process finished code=256
2017-06-01 09:17:18 -0500 [info]: process finished code=11
2017-06-01 09:17:43 -0500 [info]: process finished code=0
2017-06-01 09:18:50 -0500 [info]: process finished code=0
2017-06-01 09:19:58 -0500 [info]: process finished code=0
2017-06-01 09:21:05 -0500 [info]: process finished code=0
2017-06-01 09:22:12 -0500 [info]: process finished code=0
2017-06-01 09:23:18 -0500 [info]: process finished code=0
2017-06-01 09:24:26 -0500 [info]: process finished code=0
2017-06-01 09:25:36 -0500 [info]: process finished code=0
2017-06-01 09:26:47 -0500 [info]: process finished code=0
2017-06-01 09:28:50 -0500 [info]: process finished code=0
2017-06-01 09:31:03 -0500 [info]: process finished code=0
2017-06-01 09:33:17 -0500 [info]: process finished code=0
2017-06-01 09:35:06 -0500 [info]: process finished code=0
2017-06-01 09:37:01 -0500 [info]: process finished code=0
2017-06-01 09:38:53 -0500 [info]: process finished code=6
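
The `code=` values in the log above are easier to compare once decoded. This is a minimal sketch that assumes the logged number is the raw POSIX wait status of the worker process; the thread does not confirm that. Under this assumption the low 7 bits hold the terminating signal and the next byte holds the exit status, so `code=11` would mean "killed by signal 11" (SIGSEGV on Linux), `code=6` would mean signal 6 (SIGABRT on Linux), and `code=256` would mean a normal exit with status 1. The names `decode_status` and `tally` are hypothetical helpers, not part of fluentd.

```python
import re
from collections import Counter

# Hypothetical helpers; the decoding assumes `code` is a raw wait status.
LINE_RE = re.compile(r"process finished code=(\d+)")

def decode_status(code):
    """Split a raw wait status into ("signal", n) or ("exit", n)."""
    sig = code & 0x7F  # low 7 bits: terminating signal, 0 if none
    if sig:
        return ("signal", sig)
    return ("exit", (code >> 8) & 0xFF)  # next byte: exit status

def tally(lines):
    """Count decoded outcomes across supervisor log lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            counts[decode_status(int(m.group(1)))] += 1
    return counts

# A few lines copied from the log above.
sample = [
    "2017-06-01 09:04:30 -0500 [info]: process finished code=11",
    "2017-06-01 09:07:05 -0500 [info]: process finished code=0",
    "2017-06-01 09:16:31 -0500 [info]: process finished code=256",
    "2017-06-01 09:38:53 -0500 [info]: process finished code=6",
]
for outcome, n in sorted(tally(sample).items()):
    print(outcome, n)
```

Feeding the whole td-agent log through `tally` would show how many restarts were clean exits and how many were signal deaths, if the assumption about the status encoding holds.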
ghost commented 7 years ago

Running it non-daemonized produces the output below. I removed duplicate entries to save space.

LD_PRELOAD=/opt/td-agent/embedded/lib/libjemalloc.so /usr/sbin/td-agent -c /etc/td-agent/td-agent.conf --user td-agent --group td-agent --log /var/log/td-agent/td-agent.log
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/cool.io-1.4.6/lib/cool.io/loop.rb:88: [BUG] Segmentation fault at 0x00000000000068
ruby 2.1.10p492 (2016-04-01 revision 54464) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0034 p:---- s:0154 e:000140 CFUNC  :initialize
c:0033 p:0401 s:0151 e:000150 METHOD /opt/td-agent/embedded/lib/ruby/2.1.0/time.rb:264
c:0032 p:0242 s:0136 e:000135 METHOD /opt/td-agent/embedded/lib/ruby/2.1.0/time.rb:410
c:0031 p:---- s:0122 e:000121 CFUNC  :divmod
c:0030 p:---- s:0120 e:000119 CFUNC  :to_i
c:0029 p:0128 s:0117 e:000116 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/parser.rb:99
c:0028 p:0014 s:0111 e:000110 BLOCK  /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/parser.rb:218 [FINISH]
c:0027 p:---- s:0112 e:000111 CFUNC  :write
c:0026 p:---- s:0105 e:000104 CFUNC  :current
c:0025 p:---- s:0101 e:000100 CFUNC  :unlock
c:0024 p:0048 s:0098 e:000097 METHOD /opt/td-agent/embedded/lib/ruby/2.1.0/monitor.rb:199
c:0023 p:0010 s:0095 e:000094 RESCUE
c:0022 p:0020 s:0092 e:000090 METHOD /opt/td-agent/embedded/lib/ruby/2.1.0/monitor.rb:213
c:0021 p:0015 s:0088 e:000087 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/buffer.rb:193
c:0020 p:0053 s:0082 e:000081 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:468
c:0019 p:0070 s:0074 e:000073 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:43
c:0018 p:0098 s:0070 e:000069 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/out_copy.rb:78
c:0017 p:0022 s:0063 e:000062 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/event_router.rb:90
c:0016 p:0082 s:0057 e:000056 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb:313 [FINISH]
c:0015 p:---- s:0050 e:000049 IFUNC
c:0014 p:---- s:0048 e:000047 CFUNC  :call
c:0013 p:0015 s:0043 e:000042 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb:422 [FINISH]
c:0012 p:---- s:0041 e:000040 CFUNC  :write
c:0011 p:---- s:0036 e:000035 CFUNC  :clear
c:0010 p:0214 s:0033 e:000032 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb:621
c:0009 p:0052 s:0028 e:000027 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb:448 [FINISH]
c:0008 p:---- s:0025 e:000024 IFUNC
c:0007 p:---- s:0023 e:000022 CFUNC  :call
c:0006 p:0013 s:0020 e:000019 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb:549 [FINISH]
c:0005 p:---- s:0015 e:000014 CFUNC  :run_once
c:0004 p:0049 s:0011 e:000010 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/cool.io-1.4.6/lib/cool.io/loop.rb:88
c:0003 p:0009 s:0007 e:000006 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb:288 [FINISH]
c:0002 p:---- s:0004 e:000003 IFUNC
c:0001 p:---- s:0002 e:000001 TOP    [FINISH]

/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:149:in `run'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:342:in `try_flush'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/buffer.rb:333:in `pop'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/buffer.rb:354:in `write_chunk'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:490:in `write'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-elasticsearch-1.9.5/lib/fluent/plugin/out_elasticsearch.rb:341:in `write_objects'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-elasticsearch-1.9.5/lib/fluent/plugin/out_elasticsearch.rb:355:in `send_bulk'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-api-5.0.4/lib/elasticsearch/api/actions/bulk.rb:95:in `bulk'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/client.rb:131:in `perform_request'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:252:in `perform_request'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:289:in `rescue in perform_request'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:80:in `reload_connections!'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/sniffer.rb:30:in `hosts'
/opt/td-agent/embedded/lib/ruby/2.1.0/timeout.rb:100:in `timeout'
/opt/td-agent/embedded/lib/ruby/2.1.0/timeout.rb:100:in `call'
/opt/td-agent/embedded/lib/ruby/2.1.0/timeout.rb:90:in `block in timeout'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/sniffer.rb:31:in `block in hosts'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:252:in `perform_request'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:289:in `rescue in perform_request'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:80:in `reload_connections!'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/sniffer.rb:30:in `hosts'
/opt/td-agent/embedded/lib/ruby/2.1.0/timeout.rb:100:in `timeout'
/opt/td-agent/embedded/lib/ruby/2.1.0/timeout.rb:100:in `call'
/opt/td-agent/embedded/lib/ruby/2.1.0/timeout.rb:90:in `block in timeout'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/sniffer.rb:31:in `block in hosts'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:252:in `perform_request'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:289:in `rescue in perform_request'

-- C level backtrace information -------------------------------------------
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb: [BUG] vm_call0_cfunc_with_frame: cfp consistency error
ruby 2.1.10p492 (2016-04-01 revision 54464) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0004 p:0038 s:0010 e:000009 RESCUE
c:0003 p:0010 s:0007 e:000006 METHOD /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb:288 [FINISH]
c:0002 p:---- s:0004 e:000003 IFUNC
c:0001 p:---- s:0002 e:000001 TOP    [FINISH]

/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb:288:in `run'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/in_tail.rb:0:in `rescue in run'

-- C level backtrace information -------------------------------------------
ghost commented 7 years ago

So at this point, what I have found is that the previous version of td-agent does appear to work better and is a little more stable. The other thing I found is that when td-agent gets into a hung, deadlocked state and I see all those timeouts, I can simply kill that process, which seems to free up td-agent and allow it to restart. It seems some error is causing td-agent to want to restart, and then it hangs because of the thread that is stuck. At least that's my assumption.

ghost commented 7 years ago

Still debugging this issue. What I'm finding is that each time td-agent gets into a hung state, it's due to this pipe:

[pid 10436] poll([{fd=6, events=POLLIN}], 1, 100) = 0 (Timeout)

ls -la /proc/10434/task/10436/fd/ | grep pipe
lr-x------ 1 td-agent td-agent 64 Jun 2 13:08 6 -> pipe:[551046]
l-wx------ 1 td-agent td-agent 64 Jun 2 13:08 7 -> pipe:[551046]

ruby 10434 td-agent 6r FIFO 0,8 0t0 551046 pipe
ruby 10434 td-agent 7w FIFO 0,8 0t0 551046 pipe

It looks like some sort of pipe, maybe stdout? But it's waiting for an event of some kind. My guess is that it may be the restart that was sent from the app, which is still a separate issue.
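
The manual check above (listing the `fd` directory under `/proc` and grepping for `pipe`) can also be scripted. This is a minimal sketch that assumes Linux with `/proc` mounted and permission to read the target process's descriptors; `pipe_fds` is a hypothetical helper name. It groups descriptors by pipe inode so that both ends of the same pipe appear together. It does not reveal what the pipe is used for.

```python
import os
import re
from collections import defaultdict

PIPE_RE = re.compile(r"^pipe:\[(\d+)\]$")

def pipe_fds(pid):
    """Map pipe inode -> sorted list of fd numbers open in process `pid`."""
    fd_dir = f"/proc/{pid}/fd"
    by_inode = defaultdict(list)
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        m = PIPE_RE.match(target)
        if m:
            by_inode[int(m.group(1))].append(int(name))
    return {inode: sorted(fds) for inode, fds in by_inode.items()}

if __name__ == "__main__":
    r, w = os.pipe()  # demo: inspect a pipe opened by this script itself
    for inode, fds in pipe_fds(os.getpid()).items():
        print(f"pipe:[{inode}] fds={fds}")
    os.close(r)
    os.close(w)
```

For the process shown above, calling `pipe_fds(10434)` as a user allowed to read its descriptors would be expected to report descriptors 6 and 7 under inode 551046, matching the `ls` output.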

ghost commented 7 years ago

This can be closed. I think I found my issue, or at least what was causing the problem. Why it causes an issue is not really a concern for me, but it may be something that needs investigation. In a nutshell, the reason it was restarting to begin with was that it wanted to communicate with Elasticsearch, needed to access an address in memory, and was denied access. That caused a segfault, which caused td-agent to want to restart. This caused all the threads to get a restart; however, the piped thread (I'm assuming that's stdout) didn't get the event, so it was still asleep, waiting. It looks like possibly an uncaught exception was not allowing it to break out of the loop. Ultimately I removed the buffering settings from the Elasticsearch config, and everything is just groovy.