elastic / logstash-forwarder

An experiment to cut logs in preparation for processing elsewhere. Replaced by Filebeat: https://github.com/elastic/beats/tree/master/filebeat
Other
1.79k stars 413 forks source link

FreeBSD 10 crashes after several minutes and returns no error version 0.4.0.20150507 #526

Closed markballew closed 9 years ago

markballew commented 9 years ago

I'm having an issue where logstash-forwarder is exiting after running successfully for 10-15 minutes. The machine is under some load (8 on a 8 core box) and 600MB into swap with 16GiB total / 2500MB free RAM. FreeBSD does pre-emptive page swapping, so I don't think it is the process killer.

I turned on logging for logstash itself, and I didn't find any smoking guns there. I grepped for the host name and IP of this machine in the log, but nothing was returned. If there is a better query for me to use, please let me know.

logstash-forwarder-0.4.0.20150507 Name : logstash-forwarder Version : 0.4.0.20150507

FreeBSD stuff 10.1-STABLE FreeBSD 10.1-STABLE #1 r275869: Wed Dec 17 13:23:39 PST 2014

logstash-forwarder -config logstash-forwarder.conf
2015/09/15 21:04:24.535945      --- options -------
2015/09/15 21:04:24.536020      config-arg:          logstash-forwarder.conf
2015/09/15 21:04:24.536034      idle-timeout:        5s
2015/09/15 21:04:24.536040      spool-size:          1024
2015/09/15 21:04:24.536046      harvester-buff-size: 16384
2015/09/15 21:04:24.536050      --- flags ---------
2015/09/15 21:04:24.536055      tail (on-rotation):  false
2015/09/15 21:04:24.536059      log-to-syslog:          false
2015/09/15 21:04:24.536064      quiet:             false
2015/09/15 21:04:24.536123 {
    "network": {
        "servers": [
            "log:6872"
        ],
        "ssl ca": "/usr/local/etc/logstash-forwarder.crt",
        "ssl key": "/usr/local/etc/logstash-forwarder.key"
    },
    "files": [
        {
            "paths": [
                "/var/log/*.log",
                "/var/log/messages"
            ],
            "fields": {
                "type": "stuff",
                "host": "stuff01"
            }
        }
    ]
}
2015/09/15 22:29:13.278881 Loading registrar data from /usr/local/etc/.logstash-forwarder
2015/09/15 22:29:13.279033 Waiting for 1 prospectors to initialise
2015/09/15 22:29:13.279267 Resuming harvester on a previously harvested file: /var/log/all.log
2015/09/15 22:29:13.279349 harvest: "/var/log/all.log" position:63 (offset snapshot:63)
2015/09/15 22:29:13.279437 Registrar will re-save state for /var/log/all.log
2015/09/15 22:29:13.279461 Registrar will re-save state for /var/log/auth.log
2015/09/15 22:29:13.279484 Resuming harvester on a previously harvested file: /var/log/auth.log
2015/09/15 22:29:13.279599 Skipping file (older than dead time of 24h0m0s): /var/log/console.log
2015/09/15 22:29:13.279626 Skipping file (older than dead time of 24h0m0s): /var/log/debug.log
2015/09/15 22:29:13.279652 Skipping file (older than dead time of 24h0m0s): /var/log/devd.log
2015/09/15 22:29:13.279673 Resuming harvester on a previously harvested file: /var/log/jenkins.log
2015/09/15 22:29:13.279695 Skipping file (older than dead time of 24h0m0s): /var/log/nginx-access.log
2015/09/15 22:29:13.279714 Skipping file (older than dead time of 24h0m0s): /var/log/nginx-error.log
2015/09/15 22:29:13.279735 Skipping file (older than dead time of 24h0m0s): /var/log/ppp.log
2015/09/15 22:29:13.279753 Registrar will re-save state for /var/log/jenkins.log
2015/09/15 22:29:13.279781 Resuming harvester on a previously harvested file: /var/log/snmpd.log
2015/09/15 22:29:13.279845 harvest: "/var/log/auth.log" position:73146 (offset snapshot:73146)
2015/09/15 22:29:13.279888 harvest: "/var/log/snmpd.log" position:7270402 (offset snapshot:7270402)
2015/09/15 22:29:13.279967 Registrar will re-save state for /var/log/snmpd.log
2015/09/15 22:29:13.279983 Registrar will re-save state for /var/log/utx.log
2015/09/15 22:29:13.279992 Resuming harvester on a previously harvested file: /var/log/utx.log
2015/09/15 22:29:13.280040 harvest: "/var/log/jenkins.log" position:3246942 (offset snapshot:3246942)
2015/09/15 22:29:13.280049 Registrar will re-save state for /var/log/messages
2015/09/15 22:29:13.280044 Resuming harvester on a previously harvested file: /var/log/messages
2015/09/15 22:29:13.280097 harvest: "/var/log/utx.log" position:19675 (offset snapshot:19675)
2015/09/15 22:29:13.280292 harvest: "/var/log/messages" position:63650 (offset snapshot:63650)
2015/09/15 22:29:13.280371 All prospectors initialised with 6 states to persist
2015/09/15 22:29:13.280446 Setting trusted CA from file: /usr/local/etc/logstash-forwarder.crt
2015/09/15 22:29:13.281959 Connecting to [10.1.1.1]:6872 (log)
2015/09/15 22:29:13.419100 Connected to 10.1.1.1
2015/09/15 22:29:15.277079 Registrar: processing 1024 events
2015/09/15 22:29:17.008552 Registrar: processing 1024 events
2015/09/15 22:29:17.709635 Registrar: processing 1024 events
2015/09/15 22:29:18.182662 Registrar: processing 1024 events
2015/09/15 22:29:18.562691 Registrar: processing 1024 events
2015/09/15 22:29:19.055485 Registrar: processing 1024 events
2015/09/15 22:29:20.405845 Registrar: processing 1024 events
2015/09/15 22:29:20.937931 Registrar: processing 1024 events
2015/09/15 22:29:21.430444 Registrar: processing 1024 events
2015/09/15 22:29:22.420010 Registrar: processing 1024 events
2015/09/15 22:29:23.029002 Registrar: processing 1024 events
2015/09/15 22:29:23.519409 Registrar: processing 1024 events
2015/09/15 22:29:24.267957 Registrar: processing 1024 events
2015/09/15 22:29:24.735137 Registrar: processing 1024 events
2015/09/15 22:29:25.196391 Registrar: processing 1024 events
2015/09/15 22:29:25.551850 Registrar: processing 1024 events
2015/09/15 22:29:25.803412 Registrar: processing 1024 events
2015/09/15 22:29:26.757106 Registrar: processing 1024 events
2015/09/15 22:29:27.228561 Registrar: processing 1024 events
2015/09/15 22:29:27.688852 Registrar: processing 1024 events
2015/09/15 22:29:28.006901 Registrar: processing 1024 events
2015/09/15 22:29:29.070111 Registrar: processing 1024 events
2015/09/15 22:29:29.387580 Registrar: processing 1024 events
2015/09/15 22:29:29.745962 Registrar: processing 1024 events
2015/09/15 22:29:35.788769 Registrar: processing 20 events
2015/09/15 22:29:43.285691 Registrar: processing 7 events
2015/09/15 22:29:48.286190 Registrar: processing 2 events
2015/09/15 22:30:03.285278 Registrar: processing 3 events
2015/09/15 22:30:38.284812 Registrar: processing 1 events
2015/09/15 22:30:45.784513 Registrar: processing 1 events
2015/09/15 22:30:50.784897 Registrar: processing 1 events
2015/09/15 22:31:18.284663 Registrar: processing 1 events
2015/09/15 22:31:25.784732 Registrar: processing 1 events
2015/09/15 22:31:33.285173 Registrar: processing 6 events
2015/09/15 22:31:43.284790 Registrar: processing 1 events
2015/09/15 22:31:58.284707 Registrar: processing 1 events
2015/09/15 22:32:05.784774 Registrar: processing 1 events
2015/09/15 22:32:13.284520 Registrar: processing 1 events
2015/09/15 22:32:18.284730 Registrar: processing 1 events
2015/09/15 22:32:25.784264 Registrar: processing 1 events
2015/09/15 22:32:33.284658 Registrar: processing 1 events
2015/09/15 22:32:36.765663 Registrar: processing 1024 events
2015/09/15 22:32:37.738269 Registrar: processing 1024 events
2015/09/15 22:32:43.523254 Registrar: processing 880 events
2015/09/15 22:33:00.784394 Registrar: processing 1 events
2015/09/15 22:34:08.284547 Registrar: processing 1 events
2015/09/15 22:34:15.787690 Registrar: processing 17 events
2015/09/15 22:34:23.284708 Registrar: processing 3 events
2015/09/15 22:34:28.284951 Registrar: processing 2 events
2015/09/15 22:34:35.784582 Registrar: processing 2 events
2015/09/15 22:35:00.784988 Registrar: processing 3 events
2015/09/15 22:35:05.784945 Registrar: processing 1 events
2015/09/15 22:35:33.284828 Registrar: processing 1 events
2015/09/15 22:35:40.785273 Registrar: processing 1 events
2015/09/15 22:35:45.786115 Registrar: processing 1 events
2015/09/15 22:35:50.785527 Registrar: processing 1 events
2015/09/15 22:35:58.285178 Registrar: processing 3 events
2015/09/15 22:36:28.284228 Registrar: processing 1 events
2015/09/15 22:36:35.786454 Registrar: processing 7 events
2015/09/15 22:36:40.784312 Registrar: processing 1 events
2015/09/15 22:36:48.284701 Registrar: processing 1 events
2015/09/15 22:36:55.267243 Registrar: processing 1024 events
2015/09/15 22:36:56.262592 Registrar: processing 1024 events
2015/09/15 22:37:03.547103 Registrar: processing 879 events
2015/09/15 22:38:03.286321 Registrar: processing 9 events
2015/09/15 22:39:05.785554 Registrar: processing 1 events
2015/09/15 22:39:13.288249 Registrar: processing 16 events
2015/09/15 22:39:20.784818 Registrar: processing 4 events
2015/09/15 22:39:25.784778 Registrar: processing 3 events
2015/09/15 22:39:33.283424 Registrar: processing 2 events
2015/09/15 22:40:00.785015 Registrar: processing 3 events
2015/09/15 22:40:40.784817 Registrar: processing 1 events
2015/09/15 22:40:48.284721 Registrar: processing 2 events
2015/09/15 22:40:53.284212 Registrar: processing 1 events
2015/09/15 22:41:00.784725 Registrar: processing 3 events
2015/09/15 22:41:05.784137 Registrar: processing 1 events
2015/09/15 22:41:25.785298 Registrar: processing 1 events
2015/09/15 22:41:30.787768 Registrar: processing 7 events
2015/09/15 22:41:38.290597 Registrar: processing 2 events
2015/09/15 22:41:43.831485 Registrar: processing 1024 events
2015/09/15 22:41:44.840970 Registrar: processing 1024 events
2015/09/15 22:41:51.033875 Registrar: processing 878 events
2015/09/15 22:43:25.784273 Registrar: processing 1 events
2015/09/15 22:44:00.784446 Registrar: processing 1 events
2015/09/15 22:44:08.312120 Registrar: processing 2 events
2015/09/15 22:44:13.286832 Registrar: processing 15 events
2015/09/15 22:44:20.785378 Registrar: processing 7 events
2015/09/15 22:44:28.284394 Registrar: processing 3 events
2015/09/15 22:44:33.284336 Registrar: processing 1 events
2015/09/15 22:45:00.784644 Registrar: processing 3 events
2015/09/15 22:45:28.285300 Registrar: processing 1 events
2015/09/15 22:45:38.284415 Registrar: processing 1 events
2015/09/15 22:45:45.784689 Registrar: processing 1 events
2015/09/15 22:45:50.784425 Registrar: processing 2 events
2015/09/15 22:45:58.285997 Registrar: processing 1 events
2015/09/15 22:46:08.286730 Registrar: processing 1 events
2015/09/15 22:46:13.284607 Registrar: processing 1 events
2015/09/15 22:46:18.284414 Registrar: processing 2 events
2015/09/15 22:46:30.784478 Registrar: processing 1 events
2015/09/15 22:46:35.785276 Registrar: processing 6 events
2015/09/15 22:46:45.454443 Registrar: processing 1024 events
2015/09/15 22:46:46.453377 Registrar: processing 1024 events
2015/09/15 22:46:53.523913 Registrar: processing 878 events
2015/09/15 22:47:25.784274 Registrar: processing 1 events
2015/09/15 22:49:08.284789 Registrar: processing 3 events
2015/09/15 22:49:15.786878 Registrar: processing 16 events
2015/09/15 22:49:23.284750 Registrar: processing 6 events
2015/09/15 22:49:30.784618 Registrar: processing 3 events
2015/09/15 22:50:00.784155 Registrar: processing 3 events
2015/09/15 22:50:18.285156 Registrar: processing 4 events
2015/09/15 22:50:25.784159 Registrar: processing 1 events
2015/09/15 22:50:33.285253 Registrar: processing 4 events
2015/09/15 22:50:38.284091 Registrar: processing 1 events
2015/09/15 22:50:45.807147 Registrar: processing 3 events
2015/09/15 22:50:53.284049 Registrar: processing 1 events
2015/09/15 22:51:03.284371 Registrar: processing 2 events
2015/09/15 22:51:15.784103 Registrar: processing 1 events
2015/09/15 22:51:23.284625 Registrar: processing 1 events
2015/09/15 22:51:30.784734 Registrar: processing 6 events
2015/09/15 22:51:45.804388 Registrar: processing 84 events
2015/09/15 22:51:47.840366 Registrar: processing 1024 events
2015/09/15 22:51:48.114400 Registrar: processing 1024 events
2015/09/15 22:51:53.530913 Registrar: processing 792 events
2015/09/15 22:52:15.784230 Registrar: processing 1 events
2015/09/15 22:54:08.284389 Registrar: processing 3 events
2015/09/15 22:54:15.787171 Registrar: processing 16 events
2015/09/15 22:54:23.284763 Registrar: processing 7 events
2015/09/15 22:54:28.284578 Registrar: processing 2 events
2015/09/15 22:54:35.783819 Registrar: processing 1 events
2015/09/15 22:55:03.286119 Registrar: processing 4 events
2015/09/15 22:55:10.783813 Registrar: processing 1 events
2015/09/15 22:55:18.284224 Registrar: processing 1 events
2015/09/15 22:55:28.283974 Registrar: processing 1 events
2015/09/15 22:55:38.284347 Registrar: processing 1 events
2015/09/15 22:55:48.291560 Registrar: processing 1 events
2015/09/15 22:55:55.784293 Registrar: processing 1 events
2015/09/15 22:56:15.783813 Registrar: processing 1 events
2015/09/15 22:56:28.284272 Registrar: processing 1 events
2015/09/15 22:56:35.784731 Registrar: processing 6 events
2015/09/15 22:56:43.288970 Registrar: processing 30 events
2015/09/15 22:56:43.631472 Registrar: processing 1024 events
2015/09/15 22:56:44.646340 Registrar: processing 1024 events
2015/09/15 22:56:51.015901 Registrar: processing 847 events
2015/09/15 22:59:05.792744 Registrar: processing 1 events
2015/09/15 22:59:13.286126 Registrar: processing 17 events
2015/09/15 22:59:18.284794 Registrar: processing 5 events
2015/09/15 22:59:23.284072 Registrar: processing 2 events
2015/09/15 22:59:28.284632 Registrar: processing 2 events
2015/09/15 22:59:35.783992 Registrar: processing 2 events
ruflin commented 9 years ago

@markballew Based on the log files it looks like logstash-forwarder gets killed without any notice as there is nothing in the log file. Does this always happen after 10-15 minutes or just from time-to-time?

markballew commented 9 years ago

I've tested this a few times. It exits out between 10 minutes and an hour consistently. I have a few more boxes in a similar configuration I'll check out today to see if they have the same issue.

jordansissel commented 9 years ago

I don't know what would be normal in lsf's execution that could cause it to terminate silently. Are you capturing stdout/stderr from the process also? Maybe that has some hints.

markballew commented 9 years ago

I'm just running it as:

logstash-forwarder -config logstash-forwarder.conf

If there are additional flags I can pass to produce more useful output, let me know.

jordansissel commented 9 years ago

At this time I have no hypothesis as to what is causing these silent exits. Can you truss/strace it and capture the output when it dies?

markballew commented 9 years ago

Sorry to waste your time jordan, I did a ktrace and saw the process get a SIGTERM. It looks like a dev installed a cronjob that is killing this on the hour by accident. Thanks for taking the time and thanks for this awesome project!