camunda / camunda-platform

Links to Camunda Platform 8 resources, releases, and local development config
349 stars 257 forks source link

ElasticSearch unhealthy and keycloak stops #27

Closed dennisfabri closed 1 year ago

dennisfabri commented 2 years ago

I am using Ubuntu 22.04 with the docker-repo and everything patched to todays available versions. After cloning the repo and running docker-compose I always get an unhealthy elasticsearch container and the keycloak container simply stops.

keycloak seems to try a connection to a mssql server:

Added 'admin' to '/opt/jboss/keycloak/standalone/configuration/keycloak-add-user.json', restart server to load user
-b 0.0.0.0
=========================================================================
  Using Microsoft SQL Server database
=========================================================================
16:13:38,515 INFO  [org.jboss.modules] (CLI command executor) JBoss Modules version 2.0.0.Final

which then leads to this exception (shortened for readability):

16:15:14,145 WARN  [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (ServerService Thread Pool -- 58) IJ000604: Throwable while attempting to get a new connection: null: javax.resource.ResourceException: IJ031084: Unable to create connection
    at org.jboss.ironjacamar.jdbcadapters@1.5.3.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:364)
    at org.jboss.ironjacamar.jdbcadapters@1.5.3.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.getLocalManagedConnection(LocalManagedConnectionFactory.java:371)
    at org.jboss.ironjacamar.jdbcadapters@1.5.3.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createManagedConnection(LocalManagedConnectionFactory.java:287)
[...]
org.jboss.threads@2.4.0.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
    at org.jboss.threads@2.4.0.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1990)
    at org.jboss.threads@2.4.0.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
    at org.jboss.threads@2.4.0.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
    at java.base/java.lang.Thread.run(Thread.java:829)
    at org.jboss.threads@2.4.0.Final//org.jboss.threads.JBossThread.run(JBossThread.java:513)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host mssql, port 1433 has failed. Error: "connect timed out. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
    at com.microsoft.sqlserver.jdbc//com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:234)
[...]
    at com.microsoft.sqlserver.jdbc//com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:825)
    at org.jboss.ironjacamar.jdbcadapters@1.5.3.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:335)

ElasticSearch gives me messages like these:

{"type": "server", "timestamp": "2022-08-18T16:15:03,319Z", "level": "ERROR", "component": "o.e.x.m.e.l.LocalExporter", "cluster.name": "docker-cluster", "node.name": "af19f3f6cf1e", "message": "failed to set monitoring pipeline [xpack_monitoring_6]", "cluster.uuid": "1TPkbz_kT-OIUSXTWWWnIg", "node.id": "JVKssa2YReSQYPlVs2Z-WQ" , 
"stacktrace": ["org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-pipeline-xpack_monitoring_6) within 30s",
"at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$0(MasterService.java:158) [elasticsearch-7.17.0.jar:7.17.0]",
"at java.util.ArrayList.forEach(ArrayList.java:1511) [?:?]",
"at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:157) [elasticsearch-7.17.0.jar:7.17.0]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.17.0.jar:7.17.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]",
"at java.lang.Thread.run(Thread.java:833) [?:?]"] }
jessesimpson36 commented 1 year ago

I think we've fixed this already, and we have already merged a PR to better detect when services fail to report a (healthy) status.

https://github.com/camunda/camunda-platform/pull/132