gregoriusxu / booksleeve

Automatically exported from code.google.com/p/booksleeve
Other
0 stars 0 forks source link

System.OutOfMemoryException when creating lots of connections #32

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
Unknown - the problem seems to happen randomly

What is the expected output? What do you see instead?
I'm attempting to open a connection periodically for the purposes of a 
shortlived subscription. After opening the connection I dispose it. Disposal is 
guaranteed, so there's no connection left lying around. Sometimes, my first 
attempt to open a connection results in an OutOfMemory Exception. At the time 
that this happens, the application has only been maintaining one open Redis 
connection for a number of hours (anywhere from 6-24 hours). The application is 
using less than 100mb of memory and there is *plenty* of spare memory on the 
system when the exception is thrown. I have checked over my application 
thoroughly and there are no undisposed resources lying around not being cleaned 
up.

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' 
was thrown.
   at System.Threading.Thread.StartInternal(IPrincipal principal, StackCrawlMark& stackMark)
   at System.Threading.Thread.Start(StackCrawlMark& stackMark)
   at System.Threading.Thread.Start()
   at BookSleeve.RedisConnectionBase.Open() in c:\Dev\BookSleeve\BookSleeve\RedisConnectionBase.cs:line 231
   at Systemizer.Net.RedisConnectionGateway.CreateConnection() in c:\build-system\systemizer\source\Libraries\Systemizer\Net\RedisConnectionGateway.cs:line 69

What version of the product are you using? On what operating system?
The BookSleeve assembly version is 1.2.0.5.

Original issue reported on code.google.com by axef...@gmail.com on 23 Jan 2013 at 5:23

GoogleCodeExporter commented 8 years ago
That is very odd, and doesn't match anything we've seen (we use it constantly 
in very similar circumstances). I don't know quite what I can do to 
repro/investigate that, since what you are doing is exactly what we are doing. 
And I've *never* seen an OOM from BookSleeve in that way. How do you propose I 
proceed?

Tangentially, the 1.3 branch attempts to do away with the dedicated thread, but 
that is not production-ready yet (it passes all aggressive tests locally, but 
it didn't work well in production; needs some love)

Original comment by marc.gravell on 23 Jan 2013 at 8:54

GoogleCodeExporter commented 8 years ago
To be honest I wish I could give you more to go on. It is certainly possible 
that the error is my side, though I've checked every last thing I can think of. 
I have a feeling that the error only occurs after an initial failed connection, 
as the Redis host I'm using occasionally fails the first connection attempt. Is 
there any possibility that a failed connection takes BookSleeve down an 
alternate path? If not, feel free to close the issue and I'll try to approach 
the problem in a way that avoids this scenario.

Original comment by axef...@gmail.com on 23 Jan 2013 at 9:10

GoogleCodeExporter commented 8 years ago
I have various tests against failed connections etc, and things basically clean 
up correctly in every test I can think of. I'm not saying "there's no problem 
here", I'm just saying: I can't repro it. Which makes it a real pain to fix.

Original comment by marc.gravell on 23 Jan 2013 at 9:45

GoogleCodeExporter commented 8 years ago
Ok, well thanks for discussing the issue. I've reworked my code to try and 
eliminate the need to open new connections periodically. If the problem is 
elsewhere, hopefully it will trigger in a different way now and give me more of 
an idea what the problem might be. Thanks anyway!

Original comment by axef...@gmail.com on 23 Jan 2013 at 10:10

GoogleCodeExporter commented 8 years ago
Still having problems. Here's my code. This gets called every 30 seconds or so 
(I'm using a 30 second timeout so it doesn't block for too long).

public static async Task<ObjectId?> DequeueSessionAndSetActive(int 
timeoutSeconds)
{
    using (var conn = Current.CreateConnection())
    {
        var count = await conn.Lists.GetLength(Db, "sessions:pending");
        var bytes = await conn.Lists.BlockingRemoveLast(Db, new[] { "sessions:pending" }, timeoutSeconds);
        if (bytes == null)
            return null;
        await conn.Sets.Add(Db, "sessions:active", bytes.Item2);
        return new ObjectId(bytes.Item2);
    }
}

public RedisConnection CreateConnection()
{
    var attempts = 120;
    RedisConnection conn = null;
    while (attempts > 0)
    try
    {
        conn = new RedisConnection(Host, Port, password:Password);
        conn.Open();
        return conn;
    }
    catch(Exception ex)
    {
        if (conn != null)
        {
            if(conn.State != RedisConnectionBase.ConnectionState.Closed)
                conn.Close(true);
            conn.Dispose();
            conn = null;
        }
        _log.WarnException(string.Format("Failed to open Redis connection for {0}, port {1}, {2}. Trying again shortly. Error: {3}",
            Host, Port, Password == null ? "No password" : "Password length: " + Password.Length, ex.Message), ex);
        var ev = new AutoResetEvent(false);
        ev.WaitOne(1000);
        attempts--;
    }
    throw new Exception("Unable to open redis connection (120 attempts made)");
}

So basically in my code it ends up failing 120 times due to an 
OutOfMemoryException, then it throws the full exception which causes the 
application to crash.

Original comment by axef...@gmail.com on 25 Jan 2013 at 9:55

GoogleCodeExporter commented 8 years ago
k; I'll take a look at that and see what I can do, but the fundamental problem 
here is that it *isn't intended* that you constantly spin up connections. 
BookSleeve is a multiplexer; the idea is that you sit a connection somewhere 
are aggressively share it (a single connection) between multiple callers. At 
Stack Exchange (Stack Overflow etc), we only need a single connection per 
AppDomain for *all* the concurrent redis traffic (although you could of course 
use a small pool of connections if you think it helps). Additionally, there are 
negotiation overheads when creating a connection that will (taken cumulatively) 
hurt performance.

So: I strongly suggest: share and re-use the connection. It is thread-safe, so 
you don't need to synchronize access or anything. Just fire work at it from 
every side.

I do need to prioritize porting our "robust connection" (i.e. one that handles 
re-connect if there's a problem) down from the Stack Exchange code into 
BookSleeve - that's a bit tricky at the moment as the implementation uses a 
good number of SE objects.

Original comment by marc.gravell on 25 Jan 2013 at 10:17

GoogleCodeExporter commented 8 years ago

Original comment by marc.gravell on 25 Jan 2013 at 10:18

GoogleCodeExporter commented 8 years ago
Can I share the connection even between regular operations and subscriptions? 
My understanding was that Redis puts a connection into a special mode during a 
subscription and only certain operations can be executed against the connection 
until the subscription ends. Shall I assume then that BookSleeve abstracts that 
away from me? If so, I am very happy to change my code to simply use one global 
connection!

Original comment by axef...@gmail.com on 25 Jan 2013 at 10:41

GoogleCodeExporter commented 8 years ago
Subscriptions are indeed on a different object - a RedisSubscriberConnection 
IIRC. If you have a RedisConnection instance, there is a GetOpenSubscriber 
method (or something like that), which will hand back a second connection to 
use for subscriptions.

Original comment by marc.gravell on 25 Jan 2013 at 10:47

GoogleCodeExporter commented 8 years ago
Ok cool, that makes sense. Might be worth updating the documentation to cover 
that in a bit more detail.

Original comment by axef...@gmail.com on 25 Jan 2013 at 11:30

GoogleCodeExporter commented 8 years ago
This should be fixed in 1.3, since 1.3 does not use per-instance threading. 
However! This does not change the fact that the connections are multiplexers, 
and you **don't need many**. The 1.3 threadless changes are actually intended 
to support redis-cluster, which means a single client may be talking to 
multiple redis instances (usually a handful, but in the worst case scenario: 
16384)

Original comment by marc.gravell on 26 Apr 2013 at 11:14