Azure / aspnet-redis-providers

ASP.NET Redis Providers
Other
268 stars 180 forks source link

RedisServerException: READONLY You can't write against a read only slave. #22

Closed jeffz29 closed 8 years ago

jeffz29 commented 8 years ago

I get the above error occasionally in my all environments.

I have to restart the site whenever this happens. I check my redis nodes and they did not elect a new master. Has anyone seen this error before?

Exception Message: ERR Error running script (call to f_71fe2e7962348b06aa8ce3e244cdb3f774b4f549): @user_script:4: @user_script: 4: -READONLY You can't write against a read only slave.

Exception StackTrack: at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor1 processor, ServerEndPoint server) in c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\ConnectionMultiplexer.cs:line 1835 at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor1 processor, ServerEndPoint server) in c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\RedisBase.cs:line 80 at StackExchange.Redis.RedisDatabase.ScriptEvaluate(String script, RedisKey[] keys, RedisValue[] values, CommandFlags flags) in c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\RedisDatabase.cs:line 875 at Microsoft.Web.Redis.StackExchangeClientConnection.<>cDisplayClass4.b3() in c:\TeamCity\buildAgent\work\f55792526e6d9089\src\Shared\StackExchangeClientConnection.cs:line 113 at Microsoft.Web.Redis.StackExchangeClientConnection.RetryForScriptNotFound(Func1 redisOperation) in c:\TeamCity\buildAgent\work\f55792526e6d9089\src\Shared\StackExchangeClientConnection.cs:line 131 at Microsoft.Web.Redis.StackExchangeClientConnection.RetryLogic(Func1 redisOperation) in c:\TeamCity\buildAgent\work\f55792526e6d9089\src\Shared\StackExchangeClientConnection.cs:line 155 at Microsoft.Web.Redis.StackExchangeClientConnection.Eval(String script, String[] keyArgs, Object[] valueArgs) in c:\TeamCity\buildAgent\work\f55792526e6d9089\src\Shared\StackExchangeClientConnection.cs:line 113 at Microsoft.Web.Redis.RedisConnectionWrapper.TryTakeWriteLockAndGetData(DateTime lockTime, Int32 lockTimeout, Object& lockId, ISessionStateItemCollection& data, Int32& sessionTimeout) in c:\TeamCity\buildAgent\work\f55792526e6d9089\src\RedisSessionStateProvider\RedisConnectionWrapper.cs:line 182 at Microsoft.Web.Redis.RedisSessionStateProvider.GetItemFromSessionStore(Boolean isWriteLockRequired, HttpContext context, String id, Boolean& locked, TimeSpan& lockAge, Object& lockId, SessionStateActions& actions) in c:\TeamCity\buildAgent\work\f55792526e6d9089\src\RedisSessionStateProvider\RedisSessionStateProvider.cs:line 272 at Microsoft.Web.Redis.RedisSessionStateProvider.GetItemExclusive(HttpContext context, String id, Boolean& locked, TimeSpan& lockAge, Object& lockId, SessionStateActions& actions) in c:\TeamCity\buildAgent\work\f55792526e6d9089\src\RedisSessionStateProvider\RedisSessionStateProvider.cs:line 190 at System.Web.SessionState.SessionStateModule.GetSessionStateItem() at System.Web.SessionState.SessionStateModule.PollLockedSessionCallback(Object state)

SiddharthChatrolaMs commented 8 years ago

Are you seeing this with Azure Redis Cache or locally?

jeffz29 commented 8 years ago

I have a redis cluster on 3 linux boxes, 1 master and 2 slaves. The logs do not show that a new master is elected. The same server is master before and after the error occurs. When I restart the site, it works again. I modified the StackExchangeClientConnection to reconnect when it see's a READONLY error. I will test tonight with the Cluster failover command to see if this fixes it.

wakm commented 8 years ago

We experience similar errors like this (using RedisSessionStateProvider1.6.5) running a simple redis master-slave setup (1 master, 2 slaves) with HA-sentinel support.

see our exception: Exception Type: StackExchange.Redis.RedisServerException Exception Message: ERR Error running script (call to f_e6de9949d13e6886124297b8a995b16a2738524a): @user_script:14: @user_script: 14: -READONLY You can't write against a read only slave. Exception Source: StackExchange.Redis.StrongName Exception Target Site: T ExecuteSyncImpl[T](StackExchange.Redis.Message, StackExchange.Redis.ResultProcessor1[T], StackExchange.Redis.ServerEndPoint) Assembly Version: v1.0.0.0 Assembly Build Date: 2016-03-18 09:40:18 ---- Stack Trace ---- StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl(message As Message, processor As ResultProcessor1, server As ServerEndPoint) : line 0000, col 00, IL 0021 StackExchange.Redis.RedisBase.ExecuteSync(message As Message, processor As ResultProcessor1, server As ServerEndPoint) : line 0000, col 00, IL 0034 StackExchange.Redis.RedisDatabase.ScriptEvaluate(script As String, keys As RedisKey[], values As RedisValue[], flags As CommandFlags) : line 0000, col 00, IL 0060 Microsoft.Web.Redis.<>cDisplayClass4.b3() : line 0000, col 00, IL 0030 Microsoft.Web.Redis.StackExchangeClientConnection.RetryForScriptNotFound(redisOperation As Func1) : line 0000, col 00, IL 0039 Microsoft.Web.Redis.StackExchangeClientConnection.RetryLogic(redisOperation As Func1) : line 0000, col 00, IL 0054 Microsoft.Web.Redis.StackExchangeClientConnection.Eval(script As String, keyArgs As String[], valueArgs As Object[]) : line 0000, col 00, IL 0233 Microsoft.Web.Redis.RedisConnectionWrapper.TryCheckWriteLockAndGetData(lockId As Object&, data As ISessionStateItemCollection&, sessionTimeout As Int32&) : line 0000, col 00, IL 0065 Microsoft.Web.Redis.RedisSessionStateProvider.GetItemFromSessionStore(isWriteLockRequired As Boolean, context As HttpContext, id As String, locked As Boolean&, lockAge As TimeSpan&, lockId As Object&, actions As SessionStateActions&) : line 0000, col 00, IL 0436 Microsoft.Web.Redis.RedisSessionStateProvider.GetItem(context As HttpContext, id As String, locked As Boolean&, lockAge As TimeSpan&, lockId As Object&, actions As SessionStateActions&) : line 0000, col 00, IL 0047 System.Web.SessionState.SessionStateModule.GetSessionStateItem() : line 0000, col 00, IL 0050 System.Web.SessionState.SessionStateModule.BeginAcquireState(source As Object, e As EventArgs, cb As AsyncCallback, extraData As Object) : line 0000, col 00, IL 0431 System.Web.AsyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() : line 0000, col 00, IL 0243`

@jeffz29 were you able to overcome this issue? and what steps did you perform?

jeffz29 commented 8 years ago

I've renamed the RetryForScriptNotFound method in the StackExchangeClientConnection class to RetryForCatchableExceptions. I have not experienced the error since then, and don't see in the error in my logs. I will be pushing this to our QA environment and monitoring the logs.

private object RetryForCatchableExceptions(Func redisOperation) { try { return redisOperation.Invoke(); } catch (Exception e) { if (e.Message.Contains("NOSCRIPT")) { // Second call should pass if it was script not found issue return redisOperation.Invoke(); }

            // Handle this error: READONLY You can't write against a read only slave.
            if (e.Message.Contains("READONLY"))
            {
                LogUtility.LogError(e.Message);
                // reconnect to master
                redisMultiplexer = ConnectionMultiplexer.Connect(redisMultiplexer.Configuration);
                this.connection = redisMultiplexer.GetDatabase(configuration.DatabaseId);
                return redisOperation.Invoke();
            }
            throw;
        }
    }
wakm commented 8 years ago

Are you planing to merge your changes in git repository?

jeffz29 commented 8 years ago

I have not been able to reproduce this in our QA environment for the last week. I don't see the error in the logs. I don't feel comfortable merging this until I can confidently show that the fix works. Does the error happen regularly for you? It happened several times a month ago, but has not happened at all in the last 2 weeks.

wakm commented 8 years ago

I experience those errors regularly (nearly daily), and I still would like to know why this error occurs and if I have a configuration issue here. My thought about this error was that this occurs if a session needs longer to load than my defined timeouts (1000ms) in the session state provider.

wakm commented 8 years ago

@jeffz29 what timeouts (connectionTimeoutInMilliseconds, operationTimeoutInMilliseconds) and retryTimeoutInMilliseconds did you set in your configuration? After I set the retryTimeoutInMilliseconds I experience even more exceptions? Could you maybe help me in this case?

jeffz29 commented 8 years ago

We don't have any values for timeouts set in the config so it should be the defaults. When debugging, I do get a lot of the retry timeouts, but those are handled in the code for a few iterations. Now, even when I manually change the master, there are no more errors. I'm not sure what caused it in the first place but it is no longer showing up regularly.

SiddharthChatrolaMs commented 8 years ago

Red more about retryTimeoutInMilliseconds here(https://github.com/Azure/aspnet-redis-providers/wiki/Configuration). Read about connectionTimeoutInMilliseconds and operationTimeoutInMilliseconds are basically StackExchange.Redis "ConnectTimeout" and "SyncTimeout". Choose these values according to need of your application.

SiddharthChatrolaMs commented 8 years ago

If not getting these error with latest session state provider please close this issue.

mac2000 commented 8 years ago

I have also got similar error, after digging into problem it seems that problem is quite strange, from one side site cache works perfectly (so writes are going to master) but from other sessions are somehow loose track of current master. But it uses same stackexchange client library underneath, only difference I have found is outdated version, but it does not matter, after trying to reproduce locally from master sources of both libraries got same issue.

I have configured 3 redis servers and 3 sentinels on different machines I run site on same machine that is running master Site just writes something to session on page load After shuting down redis master - I start receiving errors about readonly slave

At moment only possible way to fix problem asap for us was to allow writes to slaves

mac2000 commented 8 years ago

@jeffz29, @SiddharthChatrolaMs - here is reproduce - https://youtu.be/NKs2u7SEJfM

unfortunatelly suggested fix did not work

Configs are:

Master machine

redis.conf

bind 127.0.0.1 192.168.5.51
port 6391
notify-keyspace-events "AKE"
maxmemory 1gb
maxmemory-policy allkeys-lru
save ""

sentinel.conf

bind 127.0.0.1 192.168.5.51
port 26391
sentinel monitor mycluster2 192.168.5.51 6391 1

Slave machines

redis.conf

bind 192.168.5.10
port 6391
notify-keyspace-events "AKE"
maxmemory 1gb
maxmemory-policy allkeys-lru
save ""
slaveof 192.168.5.51 6391

sentinel.conf

bind 192.168.5.10
port 26391
sentinel monitor mycluster 192.168.5.51 6391 2

Web site

Web.config

<sessionState mode="Custom" customProvider="MySessionStateStore">
  <providers>
    <add name="MySessionStateStore" type="Microsoft.Web.Redis.RedisSessionStateProvider" connectionString="192.168.5.51:6391,192.168.5.10:6391,192.168.5.7:6391" />
  </providers>
</sessionState>

Default.aspx.cs

Session["foo"] = DateTime.Now.ToString("HH:mm:ss.fff");
test.InnerText = (string)Session["foo"];

Hope that will help.

Notice, that redis itself works correctly, in demo I did run separate site with another implementation of redis session state and it works ok, if i will run third site which will use StackExchange.Redis and write something to it - it will also work as expected.

wakm commented 8 years ago

@SiddharthChatrolaMs we updated to the latest version and pushed it to production. I will observe in the following days if the errors occur again.

mac2000 commented 8 years ago

@wakm upgrading StackExchange.Redis dependency do not resolve issue unfortunately

wags1999 commented 7 years ago

For anyone else getting this error: I was seeing slightly different behavior than @jeffz29 . We were getting this error for a few seconds when the master node would go down, but before the sentinels were electing a new master.

The root cause was that the sentinels had a down-after-milliseconds set to 10000 (so it would take 10 seconds before the sentinels would elect a new master), but the default retryTimeoutInMilliseconds for this provider is 5000 (so the provider would stop retrying and throw an exception after 5 seconds). Updating the retryTimeoutInMilliseconds to be more than 10000 resolved the issue for us.

rickywang1981 commented 7 years ago

We also encountered this kind of issues from time to time. The down-after-milliseconds set to 3000 and retryTimeoutInMilliseconds set to 5000. Prolong retryTimeoutInMilliseconds can workaround it?

dssagar93 commented 3 years ago

For me, the redis-server wasn't started so I was getting this error. I started the Redis and it worked fine.