dotnet / EntityFramework.Docs

Documentation for Entity Framework Core and Entity Framework 6
https://docs.microsoft.com/ef/
Creative Commons Attribution 4.0 International
1.63k stars 1.96k forks source link

Document the increased memory requirements associated with connection resiliency #2902

Closed dstj closed 3 years ago

dstj commented 3 years ago

I've just hunted down a tenfold memory increase after moving to EF Core 3.1.10 (from 2.2.6) and the cause was that the query was no longer automatically split into 5 simpler queries by EF Core, i.e. the single query / no automatic splitting feature of EF Core 3.

EF Core 2 - ~120 MB used big asset import - efcore2

EF Core 3 - ~1.3 GB used big asset import - efcore3

This has a huge impact.

FYI, the EF Core 2 query looked like:

var company = await _dbContext.Companies
                .Include(c => c.Sites)
                .Include(c => c.Categories)
                .Include(c => c.Assets).ThenInclude(a => a.Site)
                .Include(c => c.Assets).ThenInclude(a => a.Tasks)
                .AsTracking()
                .SingleOrDefaultAsync(c => c.Id == companyId);

Note: I'll try to create a small project to reproduce the issue a bit later.

-- Originally posted by @dstj in https://github.com/dotnet/efcore/issues/18022#issuecomment-733086888

roji commented 3 years ago

@dstj can you confirm that the split query feature re-introduced in the 5.0 release makes the issue disappear?

This seems like it could simply be a natural result of the cartesian explosion that single query can trigger... @smitpatel is it possible we materialize duplicated data? If not it could also be happening undernear in SqlClient or something.

In any case, a repro would be useful for further investigation.

dstj commented 3 years ago

@roji I'll try that as well and get back to you. I assume it will. ๐Ÿคž

It does look like a LOT of duplicated data is created in memory. Memory profiling showed me about 1.2 million copies of a class that should have been there like 20-25 times... I broke my request into two requests and memory usage went down to ~125 MB (similar to EF Core 2)

New version:

var company = await _dbContext.Companies
                      .Include(c => c.Sites)
                      .Include(c => c.Categories)
                      .SingleOrDefaultAsync(c => c.Id == companyId);

var assets = await _dbContext.Assets
                      .Include(a => a.Tasks)
                      .AsTracking()
                      .Where(c => c.CompanyId == companyId)
                      .ToListAsync();

Obviously, this 1.3 GB used is dependent on the actual amount the data fetched from the DB, but clearly, there's a problem with the underlying single-query logic. At the very least, an Error or Warning should make its way in the logs.

smitpatel commented 3 years ago

1.2 million copies of a class that should have been there like 20-25 times...

It may be because of identity resolution rather than split query https://docs.microsoft.com/en-us/ef/core/what-is-new/ef-core-3.x/breaking-changes#notrackingresolution

Try using AsNoTrackingWithIdentityResolution() rather than AsNoTracking() in 5.0 so see effect of it.

dstj commented 3 years ago

@smitpatel I thought that as well, but the original query (the problematic one) is AsTracking(), so I understand the identity resolution is performed.

smitpatel commented 3 years ago

I misread that as AsNoTracking. My bad.

smitpatel commented 3 years ago

In that case, there should not be more than one copy of each entity with given key. Further even if single query brings a lot more data from server, duplicated data is not materialized again. Or even read from the DataReader when not needed. So it increasing client side memory is sus.

dstj commented 3 years ago

I misread that as AsNoTracking. My bad.

No worries, I should have mentioned that I do QueryTrackingBehavior.NoTracking by default in my global options. I think it's safer that way (and now looking forward to the AsNoTrackingWithIdentityResolution option in EF Core 5)

I'll try to get a small repro done tonight. Hopefully, it won't be too hard to do and will help pinpointing the root cause. ๐Ÿคž

dstj commented 3 years ago

@roji Result of the first test: EF Core 5 with AsSplitQuery() is good. ๐Ÿ‘

big asset import - efcore5 - query splitting

Without it, same huge memory usage than EF Core 3, but with this warning in the logs:

Compiling a query which loads related collections for more than one collection navigation either via 'Include'
or through projection but no 'QuerySplittingBehavior' has been configured. By default Entity Framework will use 'QuerySplittingBehavior.SingleQuery'
which can potentially result in slow query performance.
See https://go.microsoft.com/fwlink/?linkid=2134277 for more information. To identify the query that's
triggering this warning call 'ConfigureWarnings(w => w.Throw(RelationalEventId.MultipleCollectionIncludeWarning))

(Now, onto the small repro attempt...)

dstj commented 3 years ago

@roji @smitpatel Alright, some more info, not yet a repro that I can share, but a good step in the right direction. It seems related to the cfg.EnableRetryOnFailure() options of the UseNpgsql statement I use.... so maybe the issue is in the wrong github repo, or maybe not because NpgsqlRetryingExecutionStrategy seems pretty basic and inherits from ExecutionStrategy ๐Ÿคจ

Here's my test case:

/* Test description:
 see https://github.com/dotnet/efcore/issues/23461
 */
public class Tests
{
    public const string ConnectionString = "Server=127.0.0.1;Port=5432;Database=pym;User Id=postgres;Password=postgres";

    [Test]
    public async Task DoTheTest()
    {
        var companyId = Guid.Parse("94d6c7e8-15f8-411a-a8bd-2e8dfb0da8ab");

        var retryOptions = new DbContextOptionsBuilder<MyDbContext>()
            .UseNpgsql(ConnectionString, cfg => { cfg.EnableRetryOnFailure(); });

        var noRetryOptions = new DbContextOptionsBuilder<MyDbContext>()
            .UseNpgsql(ConnectionString);

        await DoRequestAsync(noRetryOptions.Options, "NO RETRY 1", companyId);

        await DoRequestAsync(retryOptions.Options, "W/ RETRY", companyId);
    }

    private static async Task DoRequestAsync(DbContextOptions<MyDbContext> dbContextOptions, string method, Guid companyId)
    {
        using (var context = new MyDbContext(dbContextOptions)) {
            Console.WriteLine("{0} - BEFORE REQUEST - Memory Used {1:F2} MB", method, MemorySetHelper.GetPrivateMemoryInMb());

            var company = await context.Companies
                        .Include(c => c.Sites)
                        .Include(c => c.Categories)
                        .Include(c => c.Assets).ThenInclude(a => a.Site)
                        .Include(c => c.Assets).ThenInclude(a => a.Tasks)
                        .AsTracking()
                        .SingleOrDefaultAsync(c => c.Id == companyId);

            Console.WriteLine("{0} - AFTER REQUEST - Memory Used {1:F2} MB", method, MemorySetHelper.GetPrivateMemoryInMb());
        }
    }
}

And the results:

NO RETRY 1 - BEFORE REQUEST - Memory Used 41.09 MB
NO RETRY 1 - AFTER REQUEST - Memory Used 65.88 MB
W/ RETRY - BEFORE REQUEST - Memory Used 65.89 MB
W/ RETRY - AFTER REQUEST - Memory Used 1383.33 MB
roji commented 3 years ago

@dstj when a retrying execution strategy is used, EF Core buffers the entire resultset in memory before starting to process any results (this way, if any error happens during the loading of the resultset, the query can be safely retried). If the resultset is huge, this leads to increased memory usage, although that memory is released for garbage collection as soon as the query is finished. This isn't really related to either split/single query, or tracking/no-tracking.

dstj commented 3 years ago

@roji Hmm, so you're saying this 65MB to 1.3GB (about 2100%) memory usage is, essentially, a side-effect of a feature?

I had no idea. I had read this doc page about Connection Resiliency and simply wanted any transient network error to be retried (during connection mostly). There's no mention of the buffering of the resultset on that page, nor any possible immensely negative impact on memory consumption. I know I certainly didn't expect it.

While not really related, the "Single query" breaking change in EF Core 3 certainly shoved that one into the light for me though ;)

roji commented 3 years ago

Hmm, so you're saying this 65MB to 1.3GB (about 2100%) memory usage is, essentially, a side-effect of a feature?

Yes, one could think of it that way - and I agree that the buffering aspect of retrying strategies should be documented better. Note when using split queries, the results from the earlier queries must also be buffered (regardless of retrying strategies or not), unless MARS is used with SQL Server (while MARS comes with its own set of problems). At least that part is documented.

However, stepping back... While it's true that connection resiliency increases memory usage, it's important to understand that even without connection resiliency, that 1.3GB is being transferred across the network. That is, without resiliency there's memory impact (since EF does streaming instead of buffering), but there's very much a problematic perf phenomenon, which you likely want to avoid. The root cause of having a 1.3GB resultset is the cartesian explosion - since split queries make it disappear - and so that's the main point here, not the resiliency buffering effect.

To summarize, avoid transferring huge resultsets as a general rule; although these have an adverse effect on memory only when resiliency is activated, they are problematic perf-wise for other reasons as well.

PS If what you're after is only connection resiliency and are using SqlClient, you can enable that at the SqlClient level, without EF core doing anything (see the SqlClient docs).

dstj commented 3 years ago

Thanks @roji for your thorough explanation. Much appreciated. Good point about the network transfer!

Yes, the cartesian product is definitely bad and should be avoided. Automatic query splitting hid it (avoided it) from me in EF Core 2.

That makes me think, maybe a feature suggestion here: just as we had a setting to throw on Client-side evaluations in EF Core 2, there could be a setting to throw on cartesian products detection unless a AsCartesianProduct() flag is set on the query. I would much much rather have had an exception early in my migration tests then see my server exploding when I tried moving it to prod.

Thanks again.

Update I see that suggestion already exists in EF Core 5

ConfigureWarnings(w => w.Throw(RelationalEventId.MultipleCollectionIncludeWarning))

I should have just skipped EF Core 3 ๐Ÿ˜‰

roji commented 3 years ago

Yep, we warn on potentially bad cartesian explosion in 5.0, and give you the option to throw as well.

I've transferred this to our docs repo to add proper documentation that resiliency causes buffering.