orkes-io / orkes-conductor-community

Orkes Conductor is a microservices orchestration engine.

Conductor.db.type of redis_sentinel is being ignored #16

Open RussellTaylor83 opened 1 year ago

RussellTaylor83 commented 1 year ago

I'm trying to run the orkes server, pointing it to a redis sentinel cluster. I am mounting the following in /app/config/config.properties

    spring.datasource.url=jdbc:postgresql://postgres:5432/postgres
    spring.datasource.username=postgres
    spring.datasource.password=postgres
    conductor.db.type=redis_sentinel
    conductor.redis-lock.serverAddress=redis://redis:26379
    conductor.redis.hosts=redis:26379:this-one

Below is the output from orkes, grepping for the wording redis:

10:32:07.812 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_6379_TCP_PROTO, Value: tcp

10:32:07.812 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_6379_TCP_ADDR, Value: 10.43.51.71

10:32:07.812 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT, Value: tcp://10.43.51.71:6379

10:32:07.813 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_SERVICE_PORT_TCP_SENTINEL, Value: 26379

10:32:07.813 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_26379_TCP, Value: tcp://10.43.51.71:26379

10:32:07.814 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_26379_TCP_ADDR, Value: 10.43.51.71

10:32:07.814 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_26379_TCP_PORT, Value: 26379

10:32:07.814 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_SERVICE_HOST, Value: 10.43.51.71

10:32:07.815 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_SERVICE_PORT_TCP_REDIS, Value: 6379

10:32:07.815 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_26379_TCP_PROTO, Value: tcp

10:32:07.815 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_6379_TCP, Value: tcp://10.43.51.71:6379

10:32:07.815 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_PORT_6379_TCP_PORT, Value: 6379

10:32:07.816 [main] INFO io.orkes.conductor.OrkesConductorApplication - System Env Props - Key: REDIS_SERVICE_PORT, Value: 6379

10:32:07.832 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.redis-lock.serverAddress - redis://redis:26379

10:32:07.832 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.db.type - redis_sentinel

10:32:07.832 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.redis.hosts - redis:26379:this-one

2023-02-21 10:32:18,624 INFO  [main] io.orkes.conductor.queue.config.RedisQueueConfiguration: Starting conductor server using redis_standalone - use SSL? false

2023-02-21 10:32:19,055 ERROR [main] com.netflix.conductor.redis.dao.RedisMetadataDAO: refresh TaskDefs failed

redis.clients.jedis.exceptions.JedisDataException: ERR unknown command `HSCAN`, with args beginning with: `conductor.test.TASK_DEFS`, `0`,

        at redis.clients.jedis.Protocol.processError(Protocol.java:135)

        at redis.clients.jedis.Protocol.process(Protocol.java:169)

        at redis.clients.jedis.Protocol.read(Protocol.java:223)

        at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)

        at redis.clients.jedis.Connection.getUnflushedObjectMultiBulkReply(Connection.java:314)

        at redis.clients.jedis.Connection.getObjectMultiBulkReply(Connection.java:319)

        at redis.clients.jedis.Jedis.hscan(Jedis.java:3727)

        at redis.clients.jedis.Jedis.hscan(Jedis.java:3719)

        at com.netflix.conductor.redis.jedis.JedisStandalone.lambda$hscan$127(JedisStandalone.java:706)

        at com.netflix.conductor.redis.jedis.JedisStandalone.executeInJedis(JedisStandalone.java:59)

        at com.netflix.conductor.redis.jedis.JedisStandalone.hscan(JedisStandalone.java:706)

        at com.netflix.conductor.redis.jedis.OrkesJedisProxy.hgetAll(OrkesJedisProxy.java:148)

        at com.netflix.conductor.redis.dao.RedisMetadataDAO.getAllTaskDefs(RedisMetadataDAO.java:125)

        at com.netflix.conductor.redis.dao.RedisMetadataDAO.refreshTaskDefs(RedisMetadataDAO.java:92)

        at com.netflix.conductor.redis.dao.RedisMetadataDAO.<init>(RedisMetadataDAO.java:57)

        at com.netflix.conductor.redis.dao.OrkesMetadataDAO.<init>(OrkesMetadataDAO.java:55)

2023-02-21 10:32:19,059 INFO  [main] com.netflix.conductor.redis.dao.OrkesMetadataDAO: taskDefCacheTTL set to 1000

So I believe I am loading the config fine, but it's still running the standalone Redis configuration. It looks like the default config is still taking precedence, even though I'm overriding it and the output messages suggest the override was applied.

Any ideas please?

manan164 commented 1 year ago

Hi @RussellTaylor83, thanks for reporting this. The configuration looks fine. We will check this locally and update you. Let me know if this works, or we can chat here for more realtime collaboration.

RussellTaylor83 commented 1 year ago

I've joined your slack to see if I can help with diagnosing this / testing fixes.

Many thanks

v1r3n commented 1 year ago

I see this in the logs:

redis.clients.jedis.exceptions.JedisDataException: ERR unknown command `HSCAN`, with args beginning with: `conductor.test.TASK_DEFS`, `0`,

What version of Redis are you using?

RussellTaylor83 commented 1 year ago

6.2.6, which I thought would be why I don't have HSCAN. We use bitnami helm charts to deploy it so it should be up to date.

I don't know though if this is a symptom of the config not being loaded, or the cause of the issue.
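One possible reading of the log above: the server reports starting with redis_standalone, while conductor.redis.hosts points at port 26379, which the environment dump lists as the Sentinel port (REDIS_SERVICE_PORT_TCP_SENTINEL). A Sentinel process speaks the Redis protocol but only implements a subset of commands, so a plain standalone client sending HSCAN to it would get an "unknown command" error regardless of the Redis version. Below is a minimal sketch of a sanity check for that mismatch. The class and method names are hypothetical, and it assumes the hosts value is a semicolon-separated list of host:port:extra entries; it is not code from Orkes or Conductor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Orkes Conductor.
final class RedisHostsCheck {
    // 26379 is the conventional Sentinel port; the env dump above agrees.
    static final int DEFAULT_SENTINEL_PORT = 26379;

    /**
     * Returns one warning per host entry that uses the Sentinel port while
     * the effective client mode is standalone.
     * Assumed entry format: host:port:extra, entries separated by ';'.
     */
    static List<String> sentinelPortWarnings(String hosts, String clientMode) {
        List<String> warnings = new ArrayList<>();
        for (String entry : hosts.split(";")) {
            String[] parts = entry.trim().split(":");
            if (parts.length < 2) {
                continue; // not a host:port entry, nothing to check
            }
            int port = Integer.parseInt(parts[1]);
            if (port == DEFAULT_SENTINEL_PORT && "redis_standalone".equals(clientMode)) {
                warnings.add(parts[0] + ":" + port
                        + " looks like a Sentinel port, but the client mode is " + clientMode);
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        // Values taken from the config and log in this issue.
        sentinelPortWarnings("redis:26379:this-one", "redis_standalone")
                .forEach(System.out::println);
    }
}
```

If this check fires, the HSCAN error is more likely a symptom of the standalone client being used against Sentinel than a problem with the Redis version.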

RussellTaylor83 commented 1 year ago

I don't know if it's related, but is sentinel actually supported? Having just read https://orkes.io/content/docs/getting-started/install/orkes-conductor-community it says:

Orkes-Queues - Redis-based queues that improve upon dyno-queues and provides higher performance and are built from the ground up to support Redis standalone and cluster mode

RussellTaylor83 commented 1 year ago

I've had to deploy a redis cluster for now.

Has anyone got sentinel working with orkes please?

RussellTaylor83 commented 1 year ago

If I set conductor.queue.type to redis_cluster as per below:

    conductor.db.type=redis_cluster
    conductor.queue.type=redis_cluster
    conductor.redis-lock.serverAddress=redis://redis-cluster:6379
    conductor.redis.hosts=redis-cluster:6379:this-one

I get this error:

15:54:05.592 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.redis-lock.serverAddress - redis://redis-cluster:6379
15:54:05.592 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting spring.datasource.username - conductor
15:54:05.593 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.db.type - redis_cluster
15:54:05.593 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting spring.datasource.url - jdbc:postgresql://postgresql-master:5432/conductor
15:54:05.593 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.redis.hosts - redis-cluster:6379:this-one
15:54:05.593 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting spring.datasource.password - conductor
15:54:05.593 [main] INFO io.orkes.conductor.OrkesConductorApplication - Setting conductor.queue.type - redis_cluster
15:54:05.593 [main] INFO io.orkes.conductor.OrkesConductorApplication - Loaded 7 properties from /app/config/config.properties
15:54:05.593 [main] INFO io.orkes.conductor.OrkesConductorApplication - Completed loading external configuration

  ______   .______       __  ___  _______     _______.
 /  __  \  |   _  \     |  |/  / |   ____|   /       |
|  |  |  | |  |_)  |    |  '  /  |  |__     |   (----`
|  |  |  | |      /     |    <   |   __|     \   \
|  `--'  | |  |\  \----.|  .  \  |  |____.----)   |
 \______/  | _| `._____||__|\__\ |_______|_______/

  ______   ______   .__   __.  _______   __    __    ______ .___________.  ______   .______
 /      | /  __  \  |  \ |  | |       \ |  |  |  |  /      ||           | /  __  \  |   _  \
|  ,----'|  |  |  | |   \|  | |  .--.  ||  |  |  | |  ,----'`---|  |----`|  |  |  | |  |_)  |
|  |     |  |  |  | |  . `  | |  |  |  ||  |  |  | |  |         |  |     |  |  |  | |      /
|  `----.|  `--'  | |  |\   | |  '--'  ||  `--'  | |  `----.    |  |     |  `--'  | |  |\  \----.
 \______| \______/  |__| \__| |_______/  \______/   \______|    |__|      \______/  | _| `._____|

Licensed under Orkes Conductor Community License

2023-03-21 15:54:08,210 INFO  [main] org.springframework.boot.StartupInfoLogger: Starting OrkesConductorApplication v1.04 using Java 11.0.18 on conductor-58cbbcc55-6ndgg with PID 18 (/app/libs/server.jar started by root in /app/libs)
2023-03-21 15:54:08,222 INFO  [main] org.springframework.boot.SpringApplication: No active profile set, falling back to default profiles: default
wget: can't connect to remote host: Connection refused
2023-03-21 15:54:14,229 INFO  [main] org.springframework.data.repository.config.RepositoryConfigurationDelegate: Multiple Spring Data modules found, entering strict repository configuration mode!
2023-03-21 15:54:14,234 INFO  [main] org.springframework.data.repository.config.RepositoryConfigurationDelegate: Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2023-03-21 15:54:14,282 INFO  [main] org.springframework.data.repository.config.RepositoryConfigurationDelegate: Finished Spring Data repository scanning in 12 ms. Found 0 JPA repository interfaces.
wget: can't connect to remote host: Connection refused
2023-03-21 15:54:16,127 INFO  [main] org.springframework.boot.web.embedded.tomcat.TomcatWebServer: Tomcat initialized with port(s): 8080 (http)
2023-03-21 15:54:16,146 INFO  [main] org.apache.juli.logging.DirectJDKLog: Initializing ProtocolHandler ["http-nio-8080"]
2023-03-21 15:54:16,147 INFO  [main] org.apache.juli.logging.DirectJDKLog: Starting service [Tomcat]
2023-03-21 15:54:16,148 INFO  [main] org.apache.juli.logging.DirectJDKLog: Starting Servlet engine: [Apache Tomcat/9.0.55]
2023-03-21 15:54:16,257 INFO  [main] org.apache.juli.logging.DirectJDKLog: Initializing Spring embedded WebApplicationContext
2023-03-21 15:54:16,257 INFO  [main] org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext: Root WebApplicationContext: initialization completed in 7735 ms
2023-03-21 15:54:16,791 WARN  [main] org.springframework.context.support.AbstractApplicationContext: Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'orkesConductorApplication': Unsatisfied dependency expressed through field 'metadataDAO'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'orkesMetadataDAO' defined in URL [jar:file:/app/libs/server.jar!/BOOT-INF/lib/orkes-conductor-persistence-1.04.jar!/com/netflix/conductor/redis/dao/OrkesMetadataDAO.class]: Unsatisfied dependency expressed through constructor parameter 0; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'orkesJedisProxy' defined in URL [jar:file:/app/libs/server.jar!/BOOT-INF/lib/orkes-conductor-persistence-1.04.jar!/com/netflix/conductor/redis/jedis/OrkesJedisProxy.class]: Unsatisfied dependency expressed through constructor parameter 0; nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException: No qualifying bean of type 'redis.clients.jedis.JedisPool' available: expected at least 1 bean which qualifies as autowire candidate. Dependency annotations: {}
2023-03-21 15:54:16,797 INFO  [main] org.apache.juli.logging.DirectJDKLog: Stopping service [Tomcat]
2023-03-21 15:54:16,833 INFO  [main] org.springframework.boot.autoconfigure.logging.ConditionEvaluationReportLoggingListener: 

Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
2023-03-21 15:54:16,894 ERROR [main] org.springframework.boot.diagnostics.LoggingFailureAnalysisReporter: 

***************************
APPLICATION FAILED TO START
***************************

Description:

Parameter 0 of constructor in com.netflix.conductor.redis.jedis.OrkesJedisProxy required a bean of type 'redis.clients.jedis.JedisPool' that could not be found.

The injection point has the following annotations:
        - @org.springframework.beans.factory.annotation.Autowired(required=true)

Action:

Consider defining a bean of type 'redis.clients.jedis.JedisPool' in your configuration.

It's the line conductor.queue.type that does it, which I took from https://github.com/orkes-io/orkes-conductor-community/blob/3c89669169e49b3f403608afe55d94f83a6072f9/server/src/main/resources/application.properties

Even without that line (presuming it's set to standalone) I'm struggling to get orkes community to run with any version of redis that isn't standalone. How are people doing it please?
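To make the "presuming it's set to standalone" part concrete, here is a minimal sketch of how an external properties file layered over built-in defaults resolves each key independently. It uses plain java.util.Properties and is not the Orkes loader; the default values shown are assumptions for illustration, not copied from the project's application.properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration of defaults plus overrides, not Orkes code.
final class LayeredConfig {
    /** Loads overrides on top of defaults; keys missing from overrides fall back to defaults. */
    static Properties load(String defaultsText, String overridesText) {
        try {
            Properties defaults = new Properties();
            defaults.load(new StringReader(defaultsText));
            Properties effective = new Properties(defaults); // defaults are consulted on a miss
            effective.load(new StringReader(overridesText));
            return effective;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: both types standalone.
        String defaults = "conductor.db.type=redis_standalone\n"
                + "conductor.queue.type=redis_standalone\n";
        // External file that only overrides the db type.
        String overrides = "conductor.db.type=redis_cluster\n";

        Properties p = load(defaults, overrides);
        System.out.println(p.getProperty("conductor.db.type"));    // prints redis_cluster
        System.out.println(p.getProperty("conductor.queue.type")); // prints redis_standalone
    }
}
```

Under these assumptions, overriding only conductor.db.type would leave the queue side on its standalone default, so the two settings need to be checked separately.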

RussellTaylor83 commented 1 year ago

In fact I can get it working, with this config ...

    conductor.db.type=redis_cluster
    conductor.redis-lock.serverAddress=redis://redis-cluster:6379
    conductor.redis.hosts=redis-cluster:6379:this-one

But it throws this exception:

2023-03-21 16:28:11,074 WARN  [sweeper-thread-3] io.orkes.conductor.mq.redis.QueueMonitor: MOVED 15334 10.42.0.92:6379
redis.clients.jedis.exceptions.JedisMovedDataException: MOVED 15334 10.42.0.92:6379
        at redis.clients.jedis.Protocol.processError(Protocol.java:119)
        at redis.clients.jedis.Protocol.process(Protocol.java:169)
        at redis.clients.jedis.Protocol.read(Protocol.java:223)
        at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
        at redis.clients.jedis.Connection.getOne(Connection.java:330)
        at redis.clients.jedis.Jedis.evalsha(Jedis.java:3304)
        at redis.clients.jedis.Jedis.evalsha(Jedis.java:3297)
        at io.orkes.conductor.mq.redis.single.RedisQueueMonitor.pollMessages(RedisQueueMonitor.java:46)
        at io.orkes.conductor.mq.redis.QueueMonitor.__peekedMessages(QueueMonitor.java:130)
        at io.orkes.conductor.mq.redis.QueueMonitor.pop(QueueMonitor.java:61)
        at io.orkes.conductor.mq.redis.single.ConductorRedisQueue.pop(ConductorRedisQueue.java:61)
        at io.orkes.conductor.queue.dao.BaseRedisQueueDAO.pop(BaseRedisQueueDAO.java:108)
        at io.orkes.conductor.server.service.OrkesWorkflowSweeper.pollAndSweep(OrkesWorkflowSweeper.java:91)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

which is because Jedis is configured to run as standalone!

Has anyone got this working with something that isn't standalone???
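For context on the MOVED 15334 10.42.0.92:6379 reply: this is Redis Cluster's redirection, telling the client that the key hashes to slot 15334 and that slot is served by another node. A cluster-aware client would normally follow the redirect, whereas a single-node Jedis connection surfaces it as JedisMovedDataException, which matches the stack trace going through redis.clients.jedis.Jedis.evalsha. As a rough illustration of where the slot number comes from, here is a minimal sketch of the slot calculation described in the Redis Cluster specification (CRC16 of the key, or of its hash tag, modulo 16384). The class and method names are made up for this example; it is not code from Orkes or Jedis.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that mirrors the Redis Cluster key-to-slot rule.
final class RedisHashSlot {
    static final int SLOT_COUNT = 16384;

    /** CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), as used by Redis Cluster. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                if ((crc & 0x8000) != 0) {
                    crc = ((crc << 1) ^ 0x1021) & 0xFFFF;
                } else {
                    crc = (crc << 1) & 0xFFFF;
                }
            }
        }
        return crc;
    }

    /** Returns the cluster slot for a key, honouring a non-empty {hash tag} if present. */
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                // Only the text between the first '{' and the next '}' is hashed.
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }

    public static void main(String[] args) {
        System.out.println(slot("foo")); // prints 12182
        // Keys sharing a hash tag land in the same slot.
        System.out.println(slot("{user1000}.following") == slot("{user1000}.followers")); // prints true
    }
}
```

Because different keys land in different slots, and therefore on different nodes, a client that talks to only one node of a cluster will keep hitting MOVED for any key that node does not own.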

arketec commented 1 year ago

I am actually experiencing the exact same issue. I have confirmed everything above in my own local environment. Forgive me if I am using the terminology incorrectly (I am not much of a Java dev), but it seems there is a Spring dependency on JedisPool but there are no beans that can provide JedisPool when using anything but redis_standalone. I am not familiar enough with Spring to know the reason for this.

simonmacklin commented 1 year ago

Did anyone manage to get redis working with anything but the standalone config? I was hoping to test this using redis_cluster mode

macca2317 commented 1 year ago

I have made some changes to allow us to run in Redis Cluster/Sentinel - seems to work now for us. https://github.com/orkes-io/orkes-conductor-community/pull/23