StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/
Other
5.86k stars 1.51k forks source link

Using a WeakReference to remember LUA script cache looks like a cause of server script memory exhaustion? #2476

Open TimLovellSmith opened 1 year ago

TimLovellSmith commented 1 year ago

This is a kind of half-finished psychic-debugging style investigation so take my theory with a grain of salt but... I investigated a redis server crash where the LUA script space was exhausted, and found my way to the pages on LUA scripting with SE.redis (the library being used)

https://stackexchange.github.io/StackExchange.Redis/Scripting.html

and the source for LoadedLuaScript, which is stored with a weak reference.

In this scenario they keep using (reusing) a single LUA script, but they still eventually exhausted the server LUA scripting memory.

I hear (TBV) that they are calling ScriptEvaluate and it eventually becomes EVALSHA under the hood.

This would involve calling Prepare, which seems to use WeakReference for caching under the hood.

https://github.com/StackExchange/StackExchange.Redis/blob/ae6419a164600b4b54dd7ce8a699efe8e98d8f1c/src/StackExchange.Redis/LuaScript.cs#L91

Unfortunately, I think the problem in practice with storing loaded lua scripts with a weak reference is that those weak references WILL get garbage collected and so the prepared cached script is forgotten by the client -- leading to eventually reloading the LUA script in the server unnecessarily, right?

A possible solution here might be using something 'smarter' or 'more robust' than WeakReference to solve this same problem. For instance, I don't know, MemoryCache?

TimLovellSmith commented 1 year ago

Or at the least, this pit of failure could be better documented with warnings NOT to use ScriptEvaluate for scripts which will be evaluated repeatably, and to always use ScriptLoad to get a loaded script object reference, and store it with a GC root...

mgravell commented 1 year ago

ScriptEvaluate doesn't use that particular API, IIRC - it does things a different more direct way. Personally I would always recommend ScriptEvaluate and not the other approaches. On mobile at moment so I can't do a full dive, but: there may be some incorrect premises here.

mgravell commented 1 year ago

So that I understand, can we step back and clarify what happened? Is the following correct:

Is that correct? That seems odd. Yes, by default we do "load" scripts called via ScriptEvaluate[Async] (which would normally only be once per process, assuming multiplexer reuse), but it seems very odd that this would cause the server to fail if the same script is being loaded. If anything that sounds like a redis server bug!

Obviously if a different script is being issued each time (string concatenation) the situation is different; there is a flag to disable eager script load on the calling side, but redis will still do whatever redis does. If the multiplexer is not being reused, or if processes are very short lived, that may amplify the rate of "load" calls, but I still would not usually expect that to break a server.

TimLovellSmith commented 1 year ago

So changed the title - I've got new info from the customer and revised my understanding. I don't think the weakReference is probably that relevant, any more. The main issue that I see is they are not using the 'LoadedLuaScript' optimization at all. Which, I guess means that the script could be being reparsed from scratch, and eventually exhausting something on the server (probably a server bug? but with an obvious workaround!).

mgravell commented 1 year ago

See, LoadedLuaScript isn't an optimization - not really; the library already handles loading scripts. It shouldn't constantly be re-parsing, or constantly loading. Hence my wanting to understand exactly what was observed, rather than focusing on code. Again, what actually happened? Happy to help look, but I'd love to hear the symptoms and background, rather than leaping to diagnosis or treatment, because what is being proposed seems ... unexpected.

mgravell commented 1 year ago

@TimLovellSmith feel free to ping me on teams @ MSFT btw; if there are "non public GitHub discussion" things, it may be I can help more over there. I'm in UK timezone, though, so I won't really be around until Monday

pianoman4873 commented 7 months ago

Hello @mgravell Would writing the following code be OK ? ( I mean , would the client already handle the loading etc. behind the scenese ? )

    // Lua script
    string script = @"
        if redis.call('EXISTS', KEYS[1]) == 1 then
            return redis.call('EXPIRE', KEYS[1], ARGV[1])
        else
            return 0
        end";

    // Key and expiry time
    string key = "mykey";
    int expiryInSeconds = 60;

    // Run the Lua script
    RedisResult result = db.ScriptEvaluate(script, new RedisKey[] { key }, new RedisValue[] { expiryInSeconds });
mgravell commented 7 months ago

Yes that will automatically load and be cached by the library, you don't need to do anything more exotic. KeyExpire does return some state, though, IIRC (not at a PC to check)