Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

ERROR [IndexerSetupService] Could not connect to Elasticsearch: Connection refused #1926

Closed mshar039 closed 8 years ago

mshar039 commented 8 years ago

Problem description

Graylog could not connect to Elasticsearch after two weeks of running successfully.

I was able to run the Graylog and Elasticsearch services properly. After I fed log files to Elasticsearch through Logstash, the cluster health turned yellow, and soon after that Graylog lost its connection to the Elasticsearch cluster.

elasticsearch/elasticsearch.yml

cluster.name: graylog2
node.master: true
node.data: true
bootstrap.mlockall: true
network.host: 127.0.0.1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300"]

graylog/server/server.conf

rest_listen_uri = http://127.0.0.1:12900/
elasticsearch_discovery_zen_ping_multicast_enabled = false
elasticsearch_discovery_zen_ping_unicast_hosts = 127.0.0.1:9300

Environment

2016-03-14T04:27:05.417+05:30 INFO  [CmdLineTool] Loaded plugins: [Anonymous Usage Statistics 1.2.1 [org.graylog.plugins.usagestatistics.UsageStatsPlugin]]
2016-03-14T04:27:05.498+05:30 INFO  [CmdLineTool] Running with JVM arguments: -Xms1g -Xmx1g -XX:NewRatio=1 -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow -Xms2G -Xmx4G -Dlog4j.configuration=file:///etc/graylog/server/log4j.xml -Djava.library.path=/usr/share/graylog-server/lib/sigar -Dgraylog2.installation_source=rpm
2016-03-14T04:27:07.485+05:30 INFO  [InputBufferImpl] Message journal is enabled.
2016-03-14T04:27:07.667+05:30 INFO  [LogManager] Loading logs.
2016-03-14T04:27:07.727+05:30 INFO  [LogManager] Logs loading complete.
2016-03-14T04:27:07.728+05:30 INFO  [KafkaJournal] Initialized Kafka based journal at /var/lib/graylog-server/journal
2016-03-14T04:27:07.739+05:30 INFO  [InputBufferImpl] Initialized InputBufferImpl with ring size <65536> and wait strategy <BlockingWaitStrategy>, running 2 parallel message handlers.
2016-03-14T04:27:08.006+05:30 INFO  [NodeId] Node ID: dbc313e5-3eba-4cda-b8b1-6707b04a93d8
2016-03-14T04:27:08.234+05:30 INFO  [node] [graylog2-server] version[1.7.3], pid[333], build[05d4530/2015-10-15T09:14:17Z]
2016-03-14T04:27:08.234+05:30 INFO  [node] [graylog2-server] initializing ...
2016-03-14T04:27:08.304+05:30 INFO  [plugins] [graylog2-server] loaded [graylog-monitor], sites []
2016-03-14T04:27:10.127+05:30 INFO  [node] [graylog2-server] initialized
2016-03-14T04:27:10.174+05:30 INFO  [Version] HV000001: Hibernate Validator 5.2.2.Final
2016-03-14T04:27:10.302+05:30 INFO  [ProcessBuffer] Initialized ProcessBuffer with ring size <65536> and wait strategy <BlockingWaitStrategy>.
2016-03-14T04:27:11.682+05:30 INFO  [RulesEngineProvider] No static rules file loaded.
2016-03-14T04:27:11.730+05:30 INFO  [OutputBuffer] Initialized OutputBuffer with ring size <65536> and wait strategy <BlockingWaitStrategy>.
2016-03-14T04:27:12.441+05:30 INFO  [ServerBootstrap] Graylog server 1.3.3 (0fda9dc) starting up
2016-03-14T04:27:12.441+05:30 INFO  [ServerBootstrap] JRE: Oracle Corporation 1.8.0_71 on Linux 3.10.0-327.10.1.el7.x86_64
2016-03-14T04:27:12.441+05:30 INFO  [ServerBootstrap] Deployment: rpm
2016-03-14T04:27:12.441+05:30 INFO  [ServerBootstrap] OS: CentOS Linux release 7.2.1511 (Core)
2016-03-14T04:27:12.441+05:30 INFO  [ServerBootstrap] Arch: amd64
2016-03-14T04:27:12.560+05:30 INFO  [PeriodicalsService] Starting 24 periodicals ...
2016-03-14T04:27:12.564+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.ThroughputCalculator] periodical in [0s], polling every [1s].
2016-03-14T04:27:12.572+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.AlertScannerThread] periodical in [10s], polling every [60s].
2016-03-14T04:27:12.579+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.BatchedElasticSearchOutputFlushThread] periodical in [0s], polling every [1s].
2016-03-14T04:27:12.579+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.ClusterHealthCheckThread] periodical in [0s], polling every [20s].
2016-03-14T04:27:12.581+05:30 INFO  [node] [graylog2-server] starting ...
2016-03-14T04:27:12.584+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.ContentPackLoaderPeriodical] periodical, running forever.
2016-03-14T04:27:12.585+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.DeadLetterThread] periodical, running forever.
2016-03-14T04:27:12.586+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.GarbageCollectionWarningThread] periodical, running forever.
2016-03-14T04:27:12.586+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.IndexerClusterCheckerThread] periodical in [0s], polling every [30s].
2016-03-14T04:27:12.588+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.IndexRetentionThread] periodical in [0s], polling every [300s].
2016-03-14T04:27:12.589+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.IndexRotationThread] periodical in [0s], polling every [10s].
2016-03-14T04:27:12.589+05:30 INFO  [IndexRetentionThread] Elasticsearch cluster not available, skipping index retention checks.
2016-03-14T04:27:12.589+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.NodePingThread] periodical in [0s], polling every [1s].
2016-03-14T04:27:12.590+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.StreamThroughputCounterManagerThread] periodical in [0s], polling every [1s].
2016-03-14T04:27:12.598+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.VersionCheckThread] periodical in [300s], polling every [1800s].
2016-03-14T04:27:12.598+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.ThrottleStateUpdaterThread] periodical in [1s], polling every [1s].
2016-03-14T04:27:12.599+05:30 INFO  [Periodicals] Starting [org.graylog2.events.ClusterEventPeriodical] periodical in [0s], polling every [1s].
2016-03-14T04:27:12.599+05:30 INFO  [Periodicals] Starting [org.graylog2.events.ClusterEventCleanupPeriodical] periodical in [0s], polling every [300s].
2016-03-14T04:27:12.599+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.ClusterIdGeneratorPeriodical] periodical, running forever.
2016-03-14T04:27:12.600+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.PurgeExpiredCollectorsThread] periodical in [0s], polling every [3600s].
2016-03-14T04:27:12.600+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.IndexRangesMigrationPeriodical] periodical, running forever.
2016-03-14T04:27:12.601+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.IndexRangesCleanupPeriodical] periodical in [15s], polling every [3600s].
2016-03-14T04:27:12.907+05:30 INFO  [IndexerClusterCheckerThread] Indexer not fully initialized yet. Skipping periodic cluster check.
2016-03-14T04:27:12.942+05:30 INFO  [transport] [graylog2-server] bound_address {inet[/127.0.0.1:9350]}, publish_address {inet[/127.0.0.1:9350]}
2016-03-14T04:27:12.993+05:30 INFO  [discovery] [graylog2-server] graylog2/fUMzabR0Rk-rlztj1tDFTg
2016-03-14T04:27:13.033+05:30 INFO  [PeriodicalsService] Not starting [org.graylog2.periodical.UserPermissionMigrationPeriodical] periodical. Not configured to run on this node.
2016-03-14T04:27:13.033+05:30 INFO  [Periodicals] Starting [org.graylog2.periodical.AlarmCallbacksMigrationPeriodical] periodical, running forever.
2016-03-14T04:27:13.036+05:30 INFO  [Periodicals] Starting [org.graylog.plugins.usagestatistics.UsageStatsNodePeriodical] periodical in [300s], polling every [21600s].
2016-03-14T04:27:13.036+05:30 INFO  [Periodicals] Starting [org.graylog.plugins.usagestatistics.UsageStatsClusterPeriodical] periodical in [300s], polling every [21600s].
2016-03-14T04:27:15.995+05:30 WARN  [discovery] [graylog2-server] waited for 3s and no initial state was set by the discovery
2016-03-14T04:27:15.995+05:30 INFO  [node] [graylog2-server] started
2016-03-14T04:27:16.860+05:30 INFO  [RestApiService] Adding security context factory: <org.graylog2.security.ShiroSecurityContextFactory@1c59b0b>
2016-03-14T04:27:16.879+05:30 INFO  [RestApiService] Started REST API at <http://127.0.0.1:12900/>
2016-03-14T04:27:20.998+05:30 INFO  [IndexerSetupService] Checking Elasticsearch HTTP API at http://127.0.0.1:9200/
2016-03-14T04:27:21.208+05:30 ERROR [IndexerSetupService] Could not connect to Elasticsearch: Connection refused
2016-03-14T04:27:21.210+05:30 WARN  [IndexerSetupService] Could not connect to Elasticsearch
2016-03-14T04:27:21.210+05:30 INFO  [IndexerSetupService] If you're using multicast, check that it is working in your network and that Elasticsearch is accessible. Also check that the cluster name setting is correct.
2016-03-14T04:27:21.210+05:30 INFO  [IndexerSetupService] See http://docs.graylog.org/en/1.3/pages/configuring_es.html for details.
2016-03-14T04:27:21.211+05:30 INFO  [ServiceManagerListener] Services are healthy
2016-03-14T04:27:21.211+05:30 INFO  [InputSetupService] Triggering launching persisted inputs, node transitioned from Uninitialized [LB:DEAD] to Running [LB:ALIVE]
2016-03-14T04:27:21.212+05:30 INFO  [ServerBootstrap] Services started, startup times in ms: {InputSetupService [RUNNING]=10, KafkaJournal [RUNNING]=13, OutputSetupService [RUNNING]=23, BufferSynchronizerService [RUNNING]=24, DashboardRegistryService [RUNNING]=30, MetricsReporterService [RUNNING]=30, JournalReader [RUNNING]=74, PeriodicalsService [RUNNING]=481, RestApiService [RUNNING]=4321, IndexerSetupService [RUNNING]=8643}
2016-03-14T04:27:21.267+05:30 INFO  [ServerBootstrap] Graylog server up and running.
2016-03-14T04:27:21.271+05:30 INFO  [InputStateListener] Input [GELF TCP/56e08fbe327c4c10e139577c] is now STARTING
2016-03-14T04:27:21.300+05:30 INFO  [InputStateListener] Input [GELF TCP/56e08fbe327c4c10e139577c] is now RUNNING
2016-03-14T04:27:22.352+05:30 INFO  [AbstractValidatingSessionManager] Enabling session validation scheduler...
2016-03-14T04:27:27.601+05:30 INFO  [IndexRangesCleanupPeriodical] Skipping index range cleanup because the Elasticsearch cluster is unreachable or unhealthy
2016-03-14T04:28:12.590+05:30 INFO  [IndexerClusterCheckerThread] Indexer not fully initialized yet. Skipping periodic cluster check.
2016-03-14T04:28:42.593+05:30 INFO  [IndexerClusterCheckerThread] Indexer not fully initialized yet. Skipping periodic cluster check.
2016-03-14T04:29:12.595+05:30 INFO  [IndexerClusterCheckerThread] Indexer not fully initialized yet. Skipping periodic cluster check.
2016-03-14T04:29:42.597+05:30 INFO  [IndexerClusterCheckerThread] Indexer not fully initialized yet. Skipping periodic cluster check.
2016-03-14T04:30:12.600+05:30 INFO  [IndexerClusterCheckerThread] Indexer not fully initialized yet. Skipping periodic cluster check.

I am new to these tools and to the Linux environment. Please help me with this.
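A quick way to narrow down a "Connection refused" like the one in the log above is to test whether anything is listening on the Elasticsearch ports at all. The following is a minimal sketch, not part of Graylog or Elasticsearch: the helper name `port_is_open` is made up for illustration, and the ports are taken from the log (9200, the HTTP API that Graylog checks) and from `discovery.zen.ping.unicast.hosts` (9300, the transport port).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9200: HTTP API checked by Graylog; 9300: transport port from the config.
    for port in (9200, 9300):
        state = "open" if port_is_open("127.0.0.1", port) else "closed or refused"
        print(f"127.0.0.1:{port} is {state}")
```

If both ports are reported as closed, the Elasticsearch process is probably not running, and its own log is the next place to look.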

joschi commented 8 years ago

@mshar039 Are there any error messages in the logs of your Elasticsearch nodes?
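To pull only the warnings and errors out of an Elasticsearch log, something like the sketch below can help. It assumes the Elasticsearch 1.x log line layout `[timestamp][LEVEL][component] message`; the function name `find_problems` is hypothetical, and the path to the log file has to be passed as an argument because it depends on the installation.

```python
import re
import sys

# Assumed Elasticsearch 1.x log line layout:
# [2016-03-14 04:11:47,073][ERROR][bootstrap                ] Exception
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>[A-Z]+)\s*\]\[(?P<component>[^\]]+)\]\s?(?P<msg>.*)$"
)


def find_problems(lines, levels=("WARN", "ERROR")):
    """Return (timestamp, level, component, message) for matching log lines."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        # Stack trace lines do not match the pattern and are skipped.
        if match and match.group("level") in levels:
            hits.append(
                (
                    match.group("ts"),
                    match.group("level"),
                    match.group("component").strip(),
                    match.group("msg"),
                )
            )
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_problems.py <path-to-elasticsearch-log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for ts, level, component, msg in find_problems(handle):
            print(f"{ts} {level} [{component}] {msg}")
```

The sketch only lists the first line of each problem; the stack trace that follows an ERROR entry still has to be read in the log file itself.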

mshar039 commented 8 years ago

@joschi The following is the log from /var/log/elasticsearch/graylog2.log:

[2016-03-14 04:11:46,752][INFO ][node                     ] [Arkon] version[1.7.3], pid[31721], build[05d4530/2015-10-15T09:14:17Z]
[2016-03-14 04:11:46,758][INFO ][node                     ] [Arkon] initializing ...
[2016-03-14 04:11:46,903][INFO ][plugins                  ] [Arkon] loaded [suggest], sites []
[2016-03-14 04:11:46,934][INFO ][env                      ] [Arkon] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [39.5gb], net total_space [49.9gb], types [rootfs]
[2016-03-14 04:11:47,073][ERROR][bootstrap                ] Exception
java.lang.VerifyError: class org.elasticsearch.rest.action.suggest.SuggestAction overrides final method handleRequest.(Lorg/elasticsearch/rest/RestRequest;Lorg/elasticsearch/rest/RestChannel;)V
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.elasticsearch.plugin.suggest.SuggestPlugin.processModule(SuggestPlugin.java:27)
        at org.elasticsearch.plugins.PluginsService.processModule(PluginsService.java:193)
        at org.elasticsearch.plugins.PluginsModule.processModule(PluginsModule.java:61)
        at org.elasticsearch.common.inject.Modules.processModules(Modules.java:64)
        at org.elasticsearch.common.inject.ModulesBuilder.createInjector(ModulesBuilder.java:58)
        at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:210)
        at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:77)
        at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:245)
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)

joschi commented 8 years ago

It looks like your Elasticsearch installation is corrupted. Reinstall Elasticsearch 1.7.5 from the official packages and try again.

mshar039 commented 8 years ago

@joschi I installed Elasticsearch 1.7.5 and tried again. I used the following configuration:

/etc/elasticsearch/elasticsearch.yml

cluster.name: graylog2
node.master: true
node.data: true
bootstrap.mlockall: true
network.host: 127.0.0.1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9300"]

$ systemctl status elasticsearch.service
elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2016-03-15 00:27:10 IST; 3min 11s ago
     Docs: http://www.elastic.co
  Process: 14689 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -Des.pidfile=$PID_DIR/elasticsearch.pid -Des.default.path.home=$ES_HOME -Des.default.path.logs=$LOG_DIR -Des.default.path.data=$DATA_DIR -Des.default.config=$CONF_FILE -Des.default.path.conf=$CONF_DIR (code=exited, status=3)
 Main PID: 14689 (code=exited, status=3)

Mar 15 00:27:08 localhost.localdomain systemd[1]: Started Elasticsearch.
Mar 15 00:27:08 localhost.localdomain systemd[1]: Starting Elasticsearch...
Mar 15 00:27:10 localhost.localdomain elasticsearch[14689]: {1.7.3}: Initialization Failed ...
Mar 15 00:27:10 localhost.localdomain elasticsearch[14689]: 1) ElasticsearchException
Mar 15 00:27:10 localhost.localdomain elasticsearch[14689]: NullPointerException2) ElasticsearchException[unexpected field in shard state [index_uuid]]
Mar 15 00:27:10 localhost.localdomain elasticsearch[14689]: CorruptStateException[unexpected field in shard state [index_uuid]]
Mar 15 00:27:10 localhost.localdomain systemd[1]: elasticsearch.service: main process exited, code=exited, status=3/NOTIMPLEMENTED
Mar 15 00:27:10 localhost.localdomain systemd[1]: Unit elasticsearch.service entered failed state.
Mar 15 00:27:10 localhost.localdomain systemd[1]: elasticsearch.service failed.

Please help.

joschi commented 8 years ago

@mshar039 According to the logs you've posted, you're still running a corrupted version of Elasticsearch 1.7.3.
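The version a node actually started with can be read from its own output, for example from the `{1.7.3}: Initialization Failed` line in the journal or the `version[1.7.3]` line in the Elasticsearch log. A minimal sketch of that check follows; the function name `reported_versions` and the two patterns are assumptions based only on the log lines posted in this thread.

```python
import re

# Patterns based on the two log formats quoted in this thread.
VERSION_PATTERNS = (
    re.compile(r"version\[(\d+\.\d+\.\d+)\]"),
    re.compile(r"\{(\d+\.\d+\.\d+)\}: Initialization Failed"),
)


def reported_versions(lines):
    """Collect every Elasticsearch version string found in the given log lines."""
    found = set()
    for line in lines:
        for pattern in VERSION_PATTERNS:
            found.update(pattern.findall(line))
    return found


if __name__ == "__main__":
    journal = [
        "Mar 15 00:27:10 localhost.localdomain elasticsearch[14689]: "
        "{1.7.3}: Initialization Failed ...",
    ]
    versions = reported_versions(journal)
    print("versions seen:", sorted(versions))
    if "1.7.5" not in versions:
        print("the freshly installed 1.7.5 is not what is starting")
```

If the reported version differs from the package that was just installed, an older installation is probably still the one being started.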

commandline-be commented 4 years ago

Same issue with ES-oss 6.8.10