Closed bahui80 closed 3 years ago
You can get more control by configuring a TimeoutSource
(via TimeoutOptions
and ClientOptions
). That interface is used to determine the per-command timeout. You can use a trigger or specific commands to give these a higher timeout (like the CLUSTER
and CLIENT
commands) and default to 100ms for all other ones.
Hi thanks for the answer. Yes I read about the TimeoutOptions
but what i don't understand with my current configuration is what the timeout in the RedisURI
is used for then:.withTimeout(redisConfigurationProperties.getCommandTimeout())
now.
And also the .connectTimeout(redisConfigurationProperties.getConnectionTimeout())
in the SocketOptions.
Is my problem now that for the initial connection a command
is issued to retrieve the cluster partitions (so the connection itself triggers commands for connecting)? And since my command timeout is 100ms for every command, they are two small to retrieve cluster partitions?
If I specify the TimeoutSource
, then I should remove those two (socket timeout and command timeout on redis uri) and handle all timeouts logic via there?
RedisURI.timeout
was first and that timeout was used for all sorts of cases (connect timeout, command timeout, SSL handshake timeout, …). Without configuring TimeoutOptions
, RedisURI.timeout
applies to commands.
Ok great thanks for the explanation.
So this statement is true right:
Is my problem now that for the initial connection a command is issued to retrieve the cluster partitions (so the connection itself triggers commands for connecting)? And since my command timeout is 100ms for every command, they are two small to retrieve cluster partitions?
Initial connection triggers some commands to be able to connect.
The stack trace is a bit short and I didn't get yet a chance to reproduce the issue. Maybe I'll find some time next week or so. If I had to guess, I would say it's the PING
or HELLO
command that verifies the connection handshake.
Ok great, just in case, I did remove the pingBeforeActivateConnection
config, so currently it is set to false.
I removed that one in fact because by using that one (prior understanding what you just explained), having a small command timeout make the initial ping fail almost every time. That was why I removed the ping option
Ok I did a little bit more digging into your proposal. I added to my ClusterClientOptions
object the following:
.timeoutOptions(TimeoutOptions.builder()
.timeoutSource(new Saraza())
.build());
That class is defined as following:
public class Saraza extends TimeoutOptions.TimeoutSource {
@Override
public long getTimeout(RedisCommand<?, ?, ?> command) {
if (command.getType().equals(CommandType.CLIENT) || command.getType().equals(CommandType.CLUSTER) || command.getType().equals(CommandType.INFO)) {
return 60000;
}
return redisConfigurationProperties.getCommandTimeout().toMillis();
}
}
So also with some integration tests I have (that use TestContainer to run Redis cluster), i debugged them and noticed that the three commands called during the connection were: CLIENT
, CLUSTER
and INFO
.
I set my redisConfigurationProperties.getCommandTimeout()
to 10 milliseconds on purpouse to make my tests fail.
Even though I added the timeoutOptions
and specified that the initial connection commands have a longer command timeout, it is still failing to connect. I can share the full stack trace:
java.lang.IllegalStateException: Failed to load ApplicationContext
at org.springframework.test.context.cache.DefaultCacheAwareContextLoaderDelegate.loadContext(DefaultCacheAwareContextLoaderDelegate.java:124)
at org.springframework.test.context.support.DefaultTestContext.getApplicationContext(DefaultTestContext.java:83)
at org.springframework.test.context.support.DependencyInjectionTestExecutionListener.injectDependencies(DependencyInjectionTestExecutionListener.java:117)
at org.springframework.test.context.support.DependencyInjectionTestExecutionListener.prepareTestInstance(DependencyInjectionTestExecutionListener.java:83)
at org.springframework.boot.test.autoconfigure.SpringBootDependencyInjectionTestExecutionListener.prepareTestInstance(SpringBootDependencyInjectionTestExecutionListener.java:43)
at org.springframework.test.context.TestContextManager.prepareTestInstance(TestContextManager.java:230)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.createTest(SpringJUnit4ClassRunner.java:228)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner$1.runReflectiveCall(SpringJUnit4ClassRunner.java:287)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.methodBlock(SpringJUnit4ClassRunner.java:289)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:247)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:94)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:191)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220)
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53)
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'createRedisConnection' defined in class path resource [com/testing/runtime/services/redis/autoconfigure/RedisAutoConfiguration.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.lettuce.core.cluster.api.StatefulRedisClusterConnection]: Factory method 'createRedisConnection' threw exception; nested exception is io.lettuce.core.RedisConnectionException: Unable to establish a connection to Redis Cluster
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:599)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1177)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1072)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:511)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:481)
at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:312)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:230)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:308)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:197)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:756)
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:867)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:542)
at org.springframework.boot.context.embedded.EmbeddedWebApplicationContext.refresh(EmbeddedWebApplicationContext.java:123)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:666)
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:353)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:300)
at org.springframework.boot.test.context.SpringBootContextLoader.loadContext(SpringBootContextLoader.java:118)
at org.springframework.test.context.cache.DefaultCacheAwareContextLoaderDelegate.loadContextInternal(DefaultCacheAwareContextLoaderDelegate.java:98)
at org.springframework.test.context.cache.DefaultCacheAwareContextLoaderDelegate.loadContext(DefaultCacheAwareContextLoaderDelegate.java:116)
... 25 more
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.lettuce.core.cluster.api.StatefulRedisClusterConnection]: Factory method 'createRedisConnection' threw exception; nested exception is io.lettuce.core.RedisConnectionException: Unable to establish a connection to Redis Cluster
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:189)
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:588)
... 43 more
Caused by: io.lettuce.core.RedisConnectionException: Unable to establish a connection to Redis Cluster
at io.lettuce.core.cluster.RedisClusterClient.lambda$assertInitialPartitions$32(RedisClusterClient.java:867)
at io.lettuce.core.cluster.RedisClusterClient.get(RedisClusterClient.java:886)
at io.lettuce.core.cluster.RedisClusterClient.assertInitialPartitions(RedisClusterClient.java:866)
at io.lettuce.core.cluster.RedisClusterClient.connect(RedisClusterClient.java:389)
at io.lettuce.core.cluster.RedisClusterClient.connect(RedisClusterClient.java:366)
at com.testing.runtime.services.redis.autoconfigure.RedisAutoConfiguration.createRedisConnection(RedisAutoConfiguration.java:124)
at com.testing.runtime.services.redis.autoconfigure.RedisAutoConfiguration$$EnhancerBySpringCGLIB$$c4ae6dbc.CGLIB$createRedisConnection$3(<generated>)
at com.testing.runtime.services.redis.autoconfigure.RedisAutoConfiguration$$EnhancerBySpringCGLIB$$c4ae6dbc$$FastClassBySpringCGLIB$$72fa88e3.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
at org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(ConfigurationClassEnhancer.java:358)
at com.testing.runtime.services.redis.autoconfigure.RedisAutoConfiguration$$EnhancerBySpringCGLIB$$c4ae6dbc.createRedisConnection(<generated>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:162)
... 44 more
Caused by: io.lettuce.core.cluster.topology.DefaultClusterTopologyRefresh$CannotRetrieveClusterPartitions: Cannot retrieve cluster partitions from [redis://************************************@localhost:57322?clientName=myClientName&timeout=10000000ns]
Details:
[redis://************************************@localhost:57322?clientName=myClientName&timeout=10000000ns]: Command timed out after 10 millisecond(s)
Suppressed: io.lettuce.core.RedisCommandTimeoutException: Command timed out after 10 millisecond(s)
at io.lettuce.core.internal.ExceptionFactory.createTimeoutException(ExceptionFactory.java:53)
at io.lettuce.core.cluster.topology.Connections.lambda$doRequest$2(Connections.java:146)
at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:672)
at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:747)
at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:472)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
For local tests I am not using SSL so its a simple connection.
Since there isn't anything we can do here, closing the ticket. Setting timeouts that lead to handshake timeouts isn't something that we can address.
Description
I am implementing a rate limiter library using a Redis cluster. The main idea is to make a distributed rate limit across instances. Since we will be handling lots of requests and our system nowadays is response time sensitive, we want to fail fast if Redis is down somehow (connection lost, any error, etc).
So what we want basically is have a pretty small
command timeout
and we dont care about the connection timeout since the initial connection when the app starts can take long.So our code basically does the rate limiting inside a big
The main goal of this is that if redis fails or any of the rate limit logic fails, we let the request pass. We prefer to let more requests pass than failing or slowing requests since this service is pretty response time sensitive and crutial so we cant fail requests (but we still need to protect ourselves).
Issue
Now the issue is that when we set a really small command timeout lettuce fails to connect to Redis. This is our current configuration for the Redis connection (all Beans in Spring Boot):
Client Resources
Cluster Client Options
Redis Cluster Client
There are some configs here. Those configs are spring boot configs. But from here two are quite important:
At first we had 1.5 seconds and 3 seconds and everything was working (it was for development only). But now that we want to go to prod and move the command timeout to 100 milliseconds the app is not coming up.
And we clearly see that the initial connection fails:
I obfuscated urls and client names.
Is it possible to set such a smaller commandTimeout and succeed in initial connection? Also, why command timeout is used for the initial connection? I assume it should only be using the connect timeout from the socket options