dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License

Strange memory behaviour with large number of columns #34527

Open Steve887 opened 2 weeks ago

Steve887 commented 2 weeks ago

I have run into an issue in our app that causes unusual memory usage: spiky memory loads and memory usage that increases over time when the model being selected has a large number of properties. What's particularly strange is that if I comment out just one property, the memory issues no longer occur and memory usage is extremely consistent.

A requirement of our system is to change connection strings at runtime, so instead of using Services.AddDbContext, I am registering the data context with Autofac and passing in a connection string at runtime, then using an overridden OnConfiguring to set up the SqlServer provider. If I change this to use Services.AddDbContext, memory usage returns to normal, but I'm confused about why this would only seem to have an effect with a large number of properties.
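For illustration, the two registration styles described above might look roughly like the following sketch. The class name `AppDbContext`, the constructor shapes, and the `Registration` helper are assumptions made for this example and are not taken from the attached repro.

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical context (name and constructors are assumptions for illustration).
public class AppDbContext : DbContext
{
    private readonly string? _connectionString;

    // Runtime path: the container (e.g. Autofac) passes a connection string per instance.
    public AppDbContext(string connectionString) => _connectionString = connectionString;

    // AddDbContext path: options are built once by the service collection and injected.
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // Configure the provider here only when no options were supplied from outside.
        if (!optionsBuilder.IsConfigured && _connectionString is not null)
        {
            optionsBuilder.UseSqlServer(_connectionString);
        }
    }
}

public static class Registration
{
    // The alternative mentioned above: register through Services.AddDbContext.
    public static void AddWithServices(IServiceCollection services, string connectionString) =>
        services.AddDbContext<AppDbContext>(o => o.UseSqlServer(connectionString));
}
```

In the first path the provider is configured inside `OnConfiguring` every time a context instance is created, whereas `AddDbContext` configures the options through the service collection.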

The rest of my setup is very normal, with entities being selected using Includes.

I have attached an app that reproduces the issue, the steps are as follows:

  1. Open the memory test solution
  2. Start the MemoryTest.Api project. This will create a database on the (localdb)\MSSQLLocalDB instance; change this in Program.cs if you use SQL Express or another server.
  3. Start memory profiling with your favourite program (I used dotMemory)
  4. Run the StressTestApi.ps1 file. This will execute an API call against the endpoint constantly to simulate a load.

Running the application as is results in the following memory graph: image

Open Item.cs and comment out the ItemImage property. Then open ImageMap.cs and comment out the ItemImage property mapping. Start the application again, attach the memory profiler and rerun the PowerShell script. This results in the following memory graph: image

The big differences seem to be in the gen 1 and 2 heaps, although taking memory snapshots doesn't really reveal anything obvious. The total memory also grows over time when the column is there.

While it's easy enough to say "just decrease the model size", I am working with a large legacy model and cannot make big changes like that. I would also like to know the underlying reason why the memory behaviour changes so drastically just by changing one column. If the setup were wrong, I would expect the problem to happen all the time.

This also occurs with .NET 7 and EF Core 7, on Windows and Linux.

Please let me know if there's any more information to supply.

MemoryTest.zip

Include provider and version information

EF Core version: 8.0.8
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET 8
Operating system: Windows

roji commented 2 weeks ago

Before taking a look at your repro, it's well-known that SqlClient has some severe memory/performance issues with reading large binary columns asynchronously (https://github.com/dotnet/SqlClient/issues/593), so that could explain the memory behavior when adding your "image" property. This should be easily verifiable by switching to sync I/O (SaveChanges() instead of SaveChangesAsync()) as a test - can you please do that?

Steve887 commented 2 weeks ago

> Before taking a look at your repro, it's well-known that SqlClient has some severe memory/performance issues with reading large binary columns asynchronously (dotnet/SqlClient#593), so that could explain the memory behavior when adding your "image" property. This should be easily verifiable by switching to sync I/O (SaveChanges() instead of SaveChangesAsync()) as a test - can you please do that?

It's not a binary column, it's just a 50 character string. It also doesn't have any data in it. image

ajcvickers commented 2 weeks ago

@Steve887 Do you see the same behavior if you change the query to be no-tracking? For example:

return await _context.Set<VisitView>()
    .Include(x => x.ConsultationViews).ThenInclude(x => x.ConsultationItems).ThenInclude(x => x.Item)
    .AsNoTracking()
    .FirstOrDefaultAsync(p => p.VisitNumber == key);

Steve887 commented 2 weeks ago

@ajcvickers Hi, I tried adding this and the memory behaviour is similar. In fact, total memory is actually higher. image

Steve887 commented 1 week ago

@ajcvickers Is there any more information I can provide for this one?

satviktechie1986 commented 2 days ago

Try using a split query; it might work:

return await _context.Set<VisitView>()
    .Include(x => x.ConsultationViews).ThenInclude(x => x.ConsultationItems).ThenInclude(x => x.Item)
    .AsSplitQuery()
    .FirstOrDefaultAsync(p => p.VisitNumber == key);

Steve887 commented 1 day ago

@satviktechie1986 Setting AsSplitQuery does result in normal memory usage. However, in our actual app this isn't really a good solution, as we have many queries across the app and we don't want the dramatic increase in network round trips that enabling this setting globally would cause.

I would still like to find out why the original query behaves so differently when just one extra property is included, so I can take those findings and apply them in our main application.
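For reference, enabling split queries globally, as discussed above, is a provider-level option in EF Core. The following is a minimal sketch assuming a context configured through OnConfiguring; the class name `AppDbContext` and its constructor are hypothetical and not taken from the repro.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context used only to show where the global setting would go.
public class AppDbContext : DbContext
{
    private readonly string _connectionString;

    public AppDbContext(string connectionString) => _connectionString = connectionString;

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) =>
        optionsBuilder.UseSqlServer(
            _connectionString,
            // Makes split queries the default for every query that loads collections.
            sql => sql.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery));
}
```

With this default in place, an individual query can still opt back into a single SQL statement by calling `.AsSingleQuery()`, which may limit the extra round trips to the queries where they matter least.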

satviktechie1986 commented 1 day ago

Alternatively, you can use a projection that selects only the required columns:

return await _context.Set<VisitView>()
    .AsNoTracking()
    .Where(p => p.VisitNumber == key)
    .Select(v => new
    {
        v.VisitNumber,
        ConsultationViews = v.ConsultationViews.Select(cv => new
        {
            cv.Id,
            ConsultationItems = cv.ConsultationItems.Select(ci => new { ci.Id, ci.Item.Name })
        })
    })
    .FirstOrDefaultAsync();

Steve887 commented 1 day ago

@satviktechie1986 Again, this wouldn't work for our actual app, as we have too many queries to realistically select just the required columns in each one.