DapperLib / Dapper

Dapper - a simple object mapper for .Net
https://www.learndapper.com/
Other
17.47k stars 3.67k forks source link

Memory leak using QuerySingle or QuerySingleOrDefault #1759

Closed git2013vb closed 5 months ago

git2013vb commented 2 years ago

Hi,

As stated in title I have a memory leak.

Here my code:

    public static T GetRowByCodeName<T>(string table, string codeName)
        {
            T t = default;
            string sql = SELECT_ALL_FROM + table + "s WHERE CodeName = @CodeName";
            using (var con = new SqliteConnection(DS))
            {
                t = con.QuerySingle<T>(sql, new { CodeName = codeName });
            }
            return t;
        }

I use Dapper Version 2.0.78

Thank you.

mgravell commented 2 years ago

What makes you think you have a memory leak? And what makes you think it may be related to Dapper?

git2013vb commented 2 years ago

Thank you for your reply,

Because I have a tool that tell me when I have e memory leak and it don't sow me problems when I comment the QuerySingle/QuerySingleOrDefault line.

I have other memory leak related to other places not related to Dapper and they all were confirmed too by community.

In this specific case I did't investigate in Dapper source code so are my assumptions of course but all my investigations point out to Dapper.

Thanks

mgravell commented 2 years ago

OK; you've given me exactly zero information to work with. I agree that the code shown looks reasonable; obviously there will be a few allocations here (on your side: the string with the table new, and the anonymous parameter object; and on the ADO.NET side: some internal objects etc) - but without some indication (usually from a memory profiler) of what objects you think are leaking (along with their root paths), there isn't much I can investigate here.

You haven't even told me what you're actually seeing - people use "leak" very badly, so the specifics matter; there's a huge difference between "there are some allocations, and my system isn't under memory pressure, so they haven't yet been collected", and "some objects are being incorrectly rooted and are surviving GC indefinitely, ending up in GEN2".

Only the second of those is a "leak", and the only real way to investigate that is with a memory profiler. I'm intimately familiar with the internals of Dapper, and while there are some things that would end up rooted, this is mostly "shape" related, by which I mean: yes, something might end up cached and retained per unique query (per table), but even that isn't truly permanent, and it isn't "per query", just "per shape". So; I'm interested, but without more context I don't know what I can do here.

My recommendation would be to use something like dotMemory against your process; run some load so you can see the increasing memory condition, then snapshot it (button in dotMemory); then look at the snapshot, and look at what is taking the memory - and in particular: are they ADO.NET/Dapper related, are they rooted, and if so, what are their roots. I can't do that without your entire setup, but it is very easy for you to do if you can simulate load locally.

git2013vb commented 2 years ago

I understand. :) Its interesting to me too.

I have to investigate more deeply. I work in Linux and I'm using vscode as editor. The tool that give me insight about memory leak is a game engine (Godot).

I will investigate about my Classes first then I will build an alternative QuerySingle to understand what and how.

I had a thought to take a look in Dapper code but I'm not sure if I have what it take to understand the code.

I will do some investigations first and Ill come back to you anyway.

Thank you again :).

mgravell commented 2 years ago

If you want to do it that way: you could replace the code with the equivalent ADO.NET code from first principles, i.e. create your command, add your parameters, execute reader, etc. If that shows a different profile: then there's something interesting happening.

On Tue, 15 Mar 2022, 10:51 git_vb, @.***> wrote:

I understand. :) Its interesting to me too.

I have to investigate more deeply. I work in Linux and I'm using vscode as editor. The tool that give me insight about memory leak is a game engine (Godot).

I will investigate about my Classes first then I will build an alternative QuerySingle to understand what and how.

I had a thought to take a look in Dapper code but I'm not sure if I have what it take to understand the code.

I will do some investigations first and Ill come back to you anyway.

Thank you again :).

— Reply to this email directly, view it on GitHub https://github.com/DapperLib/Dapper/issues/1759#issuecomment-1067839443, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAEHMH3HQOYXRPU56VRHFTVABTS3ANCNFSM5QVPADGA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you commented.Message ID: @.***>

Juandavi1 commented 6 months ago

Hi ! The same issue here !

mgravell commented 6 months ago

@Juandavi1 and just like the OP, you have given me zero to work with. What has driven you to this conclusion? What data are you looking at? What symptoms are you seeing? And what does your code look like? A concrete repro would be ideal, but anything would be a start.

Because without that, I'm going to say... "Squirrels... it is probably squirrels".

Juandavi1 commented 6 months ago

Hi @mgravell ! Thanks for answering.

We have a multi-tenant db model. Each user has a different connection string.

We have a base class that creates the connections and data accessors.

Something like this:

image

this is our invocation

image

this is our gcdump

image

Our memory graph looks like this

image

JonathanMagnan commented 6 months ago

Hello @Juandavi1 ,

Could you try without cache (flags: CommandFlags.NoCache) such as:

connection.QueryFirstOrDefaultAsync(new CommandDefinition(query, new { browserId}, flags: CommandFlags.NoCache, cancellationToken: cancellationToken));

I might be wrong, but this sentence, "Each user has a different connection string." lets me think that a different Identity is created for every one of your users, and this Identity depends on the connection string. See: https://github.com/DapperLib/Dapper/blob/402f20ce37e719a23b4f4a507391b65f12cd8657/Dapper/SqlMapper.Identity.cs#L137

Then, some information is indeed cached later using this Identity previously created (so cached multiple times as you use different connection strings): https://github.com/DapperLib/Dapper/blob/402f20ce37e719a23b4f4a507391b65f12cd8657/Dapper/SqlMapper.Async.cs#L481

People who re-use the same connection string don't have this issue, but if you keep using different connection strings, it could happen. I didn't test it, but by looking at the code, that could explain it.

If that fixes it, let me know. I would like to create documentation on Learn Dapper about this.

Also, if that fixes it, there might be a better way to fix it in your case by setting a new ConnectionStringComparer here if you could find out how to make all connection strings from different users belong to the same "string" : https://github.com/DapperLib/Dapper/blob/402f20ce37e719a23b4f4a507391b65f12cd8657/Dapper/SqlMapper.cs#L3981


Disclaimer: To avoid confusion, I'm a simple contributor to Dapper. However, I own the Learn Dapper and Dapper Plus.

mgravell commented 6 months ago

If this is related to connection-string explosion, this emphasizes my need to keep pushing on the AOT piece; the AOT work has no additional caches like this one - we have the luxury of considering more possibilities at build time. Depending on what your code looks like, you might even be able to try that right now (it isn't 100% feature parity with Dapper vanilla, but it is a high percentage, especially of the core features), and can be enabled as granularly or as globally as you choose: https://aot.dapperlib.dev/gettingstarted

Juandavi1 commented 6 months ago

Hello! Thank you for the information!

@JonathanMagnan, I experimented with the CommandFlags.NoCache flag, and noticed that memory usage increased as well.

@mgravell, I have one question: should I use Net8 within my library where I build the data accessor, or should I also use it in my microservices?

mgravell commented 6 months ago

@Juandavi1 if you're talking about Dapper.AOT there: only the build SDK needs to be .NET 8 (or higher); the compiled code can target your usual range of platforms. It is basically a compiler feature; it has no extra runtime requirements, and works on down-level runtimes.

Juandavi1 commented 6 months ago

Hi @mgravell !

We've compiled our library with .NET 8 and enabled Dapper.AOT. The code was generated successfully. We imported the library into our project, which is compiled in .NET 6. Unfortunately, the memory usage seems to increase as well.

Do you know how I can find out if the generated methods are being used?

image
Juandavi1 commented 6 months ago

@JonathanMagnan Do you know where I can find documentation related to using a custom IEqualityComparer?

mgravell commented 6 months ago

My apologies, I've just seen your gcdump. Unless I'm missing something, nothing in that image is Dapper. It looks like this is all just SqlClient overhead, presumably from the wide wide range of connection strings / identities. Have you disabled connection pooling in your connection string? (that is probably advisable in this scenario)

mgravell commented 6 months ago

Also, I wouldn't mind a look at your this.GetDbConnection method and related call chain, just to see if you're leaving anything attached there.

mgravell commented 6 months ago

I believe adding ,Pooling=false to the connection string may help.

Juandavi1 commented 6 months ago

I will try it ! Thanks a lot !

JonathanMagnan commented 6 months ago

Hello @Juandavi1 ,

I did some tests, and I was "right" and "wrong".

Indeed, Dapper will cache depending on the connection string. So, the memory will grow the more you have a different connection string. However, it takes almost no space. Using the NoCache fixes this part of the "memory leak", but this is likely not your issue as it will only save you a small portion of the memory used.


Let's take the following code that doesn't use Dapper:

for(int i = 0; i< 10000; i++)
{
    using (var connection = new SqlConnection("Server=localhost;Initial Catalog =LabDapperPlusCore;Integrated Security = true;Persist Security Info=True;Connection Timeout=" + i))
    //using (var connection = new SqlConnection("Server=localhost;Initial Catalog =LabDapperPlusCore;Integrated Security = true;Persist Security Info=True;"))
    {
        connection.Open();
        var command = connection.CreateCommand();
        command.CommandText = "SELECT * FROM EntitySimple WHERE ID = @ID";
        command.Parameters.AddWithValue("ID", i);
        command.ExecuteNonQuery();

    }
}

The connection string with the connection timeout is always unique (so similar to having a connection string per user) and will take up to 700MB in memory in my test and will continue to grow as we increase the maximum connection.

While the second connection (the one commented), which re-uses the same connection string, will take up to 20MB in memory.

The difference in the number of objects is huge when looking at the memory snapshot in Visual Studio.

Obviously, I just made a big loop in this test, so perhaps in a real-life scenario, this kind of issue cannot happen, but it can give you a hint if that's possible to test with the same exact connection string to see if that could be the cause or not.

I don't know how pooling is affected by different connection strings.

I'm not sure if that helped, but I thought it was better to write about what I discovered while trying to reproduce this issue.

mgravell commented 6 months ago

I don't know how pooling is affected by different connection strings.

It is complex; some parts of the connection string force it to use different pools; some parts do not. I genuinely don't know which bucket Connection Timeout falls into (I guess you could add ,Pooling=false to check, but: that isn't my intent here). Anything relating to the chosen server/database/default-schema or identity definitely force separate pools, so in the scenario originally described, pool explosion (as shown in the GC dump) will definitely be a problem if pooling is enabled (which it is by default).

Juandavi1 commented 5 months ago

Hi ! We found something. We are calling SqlConnection.ClearPool() after each query and the memory leak disappear. However the performance increase a lot !

I was wondering 🤔 Is there a way to have the same pool for diferente connections? Do you know? @mgravell @JonathanMagnan

Juandavi1 commented 5 months ago

If we set pooling to false, throws a too many noon-pooled connections exception

mgravell commented 5 months ago

I am sympathetic, but ultimately this isn't a Dapper issue - this is something in the underlying ADO.NET provider - in this case Microsoft.Data.SqlClient, and there is literally nothing that Dapper can do to impact that. By definition, a single pool is for interchangeable connections, so no: you can't have incompatible connections (different identity, schema, database, whatever) in the same pool. My suggestion would be to take this up over here. The only other alternative would be to basically roll your own pool, but that sounds fraught with problems.