dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.68k stars 3.17k forks source link

Memory Leak on DbContext.SaveChanges() with Invalid Data #24663

Closed gilmorem560 closed 1 year ago

gilmorem560 commented 3 years ago

I've recently updated a number of APIs from .NET Core 2.2 to .NET 5.0 and am now experiencing some issues when saving changes back to the database context. I've found consistently that if there is something invalid in the data I am trying to insert (e.g. a string too long for the VARCHAR it is going into, a duplicate PK being inserted, etc.), rather than some error being percolated up to actually make this apparent, when calling SaveChanges on the DbContext, the RAM usage starts to spike. I've only been able to figure out which values are invalid by pulling down all the debugging symbols and inspecting down into the MS code from my own.

This is a stack trace down into EF Core:

Microsoft.EntityFrameworkCore.DbUpdateException
  HResult=0x80131500
  Message=An error occurred while updating the entries. See the inner exception for details.
  Source=Microsoft.EntityFrameworkCore.Relational
  StackTrace:
   at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.Execute(IRelationalConnection connection) in /_/src/EFCore.Relational/Update/ReaderModificationCommandBatch.cs:line 257

  This exception was originally thrown at this call stack:
    Microsoft.Data.SqlClient.SqlConnection.OnError(Microsoft.Data.SqlClient.SqlException, bool, System.Action<System.Action>)
    Microsoft.Data.SqlClient.SqlInternalConnection.OnError(Microsoft.Data.SqlClient.SqlException, bool, System.Action<System.Action>)
    Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(Microsoft.Data.SqlClient.TdsParserStateObject, bool, bool)
    Microsoft.Data.SqlClient.TdsParser.TryRun(Microsoft.Data.SqlClient.RunBehavior, Microsoft.Data.SqlClient.SqlCommand, Microsoft.Data.SqlClient.SqlDataReader, Microsoft.Data.SqlClient.BulkCopySimpleResultSet, Microsoft.Data.SqlClient.TdsParserStateObject, out bool)
    Microsoft.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
    Microsoft.Data.SqlClient.SqlDataReader.MetaData.get()
    Microsoft.Data.SqlClient.SqlCommand.FinishExecuteReader(Microsoft.Data.SqlClient.SqlDataReader, Microsoft.Data.SqlClient.RunBehavior, string, bool, bool, bool)
    Microsoft.Data.SqlClient.SqlCommand.RunExecuteReaderTds(System.Data.CommandBehavior, Microsoft.Data.SqlClient.RunBehavior, bool, bool, int, out System.Threading.Tasks.Task, bool, bool, Microsoft.Data.SqlClient.SqlDataReader, bool)
    Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(System.Data.CommandBehavior, Microsoft.Data.SqlClient.RunBehavior, bool, System.Threading.Tasks.TaskCompletionSource<object>, int, out System.Threading.Tasks.Task, out bool, bool, bool, string)
    Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(System.Data.CommandBehavior, Microsoft.Data.SqlClient.RunBehavior, bool, string)
    ...
    [Call Stack Truncated]

Inner Exception 1:
SqlException: Violation of PRIMARY KEY constraint 'PK_FieldA'. Cannot insert duplicate key in object 'dbo.ObjectA'. The duplicate key value is (00000000-0000-0000-0000-000000000000).
The statement has been terminated.

If a deeper stack trace would help, I can try and run that down.

Below is the gist of what we're doing. This should be relatively equivalent to what we are actually doing on these objects, but extracting a test case would be a bit difficult at this point. If this isn't enough, I can try and provide more details, but I don't know if I can create a 1:1.

public partial class ObjectA {
    Guid? FieldA { get; set; }
}

public interface IObjectARepository {
    void Add(ObjectA entity);
}

public class ObjectARepository : IObjectARepository {
    private DbContext _context;
    private IConfiguration _config;

    public ObjectARepository(DbContext context, IConfiguration config)
    {
        _context = context ?? throw new ArgumentNullException("context");
        _config = config ?? throw new ArgumentNullException("config");
    }

    public void Add(ObjectA entity)
    {
        _context.Set<ObjectA>().Add(entity);
        _context.SaveChanges(); // This is where RAM starts maxing out
    }
}

public class ObjectAModel
{
    Guid? FieldA { get; set; }
}

public class ObjectAService
{
    private readonly IMapper _mapper;
    private readonly IObjectARepository _objectARepository;

    public ObjectAService(IMapper mapper, IObjectARepository objectARepository)
    {
        _mapper = mapper ?? throw new ArgumentNullException("mapper");
        _objectARepository = objectARepository ?? throw new ArgumentNullException("objectARepository");
    }

    public void Add(ObjectAModel model)
    {
        // Use AutoMapper to map ObjectAModel to an ObjectA
        // ObjectA entity = _mapper.Map(model);
        // Oversimplifying here, but just to indicate how we get our entity
        // At this point, entity.FieldA is either null or an empty guid.
        _objectARepository.Add(entity);
    }
}

We are working data-first, so no component model annotations on the entities. This is the gist of the relevant data

CREATE TABLE [dbo].[ObjectA](
    [FieldA] [uniqueidentifier] NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

ALTER TABLE [dbo].[ObjectA] ADD  CONSTRAINT [PK_ObjectA] PRIMARY KEY NONCLUSTERED 
(
    [FieldA] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]

ALTER TABLE [dbo].[ObjectA] ADD  DEFAULT (newid()) FOR [FieldA]

If I work past the exceptions that happen inside of EF Core, then the system just runs away consuming RAM until whatever system the process is running on decides to just kill it.

So just in review, the behavior that I've seen is I retrieve or create an instance of an Entity, provide data to one of the fields that is not compliant with the constraints in the schema, and then calls to DbContext.SaveChanges() hang with massive RAM consumption before crashing whatever the application is.

Debugging without "Just My Code" allows me to inspect down into EF Core and see that there is indeed either a PK violation, overflow of a field, or whatever else the actual database error is, but somewhere in the midst of things this error is not caught and percolated up to the user, but instead disappears off into some ether somewhere and instead EF Core opts to just start gobbling up RAM on the host machine.

Also, for this specific case (as opposed to the same issues we've seen with inserting strings beyond the VARCHAR size), I've noticed that the specific error is that we are trying to insert a duplicate key, the empty GUID. Our DB schema up there has a default of NEWID() for FieldA. I'm not entirely sure why we aren't just letting the database default the field here. This may be completely unrelated, but shouldn't EF be supplying Guid.NewGuid() in accordance with this constraint? Or does the responsibility fall to us in code to explicitly provide an ID despite the fact the database already knows one is needed? Not a huge deal, and I could split this out to a separate issue, just in case anyone has some input or it may actually be directly related.

Finally, just some details that may not be apparent from the above, we're using dependency injection and a repository pattern for our data layer, so we inject in the repository into the service, and actually that service would implement an interface that is then injected at the next stage of the process, but I left the rest of the stack above that out, not sure how relevant it is after we get out of the service call. For the DbContext injection, the only option we're setting is EnableRetryOnFailure. I tried turning this off, thinking perhaps this was what was omitting the failures and just retrying over and over again, but it didn't seem to make a difference. I'm going to try to get a memory snapshot to determine what most of the RAM consumption is in terms of, but last I did that I'm pretty sure it was some sort of cache object down in EF Core.

Pardon anything missing, I typed up all of the code pieces from memory. Hopefully there's something simple here, but attempts to find other guidance on this issue or to even find other reports of something similar have been relatively fruitless. I did see one article on a separate page about expression trees and lambdas, but it didn't seem relevant to how we're handling our objects in any way.

In short, this would be totally workable if when I call SaveChanges up there, EF percolates up the exception that the primary key constraint on dbo.ObjectA is not met because FieldA is a zero GUID, but instead I only get the memory leak/RAM spike and then a crash unless I inspect down into EF Core. If there is any way to make EF Core stop and report DB errors instead of sequestering them and consuming RAM until a crash, that would be ideal. Thanks for taking the time to read, I know it's a bit much.

EF Core version: 5.0.5 Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: NET 5.0 Operating system: Windows 10+IIS Express 10 (dev), Windows Server 2012 R2 + IIS 8.5 (prod) IDE: Visual Studio 2019 16.9.3

roji commented 3 years ago

@gilmorem560 for the memory leak situation, it sounds like this happens pretty consistently - can you please put together a minimal code sample (i.e. small console program) which makes this happen? Any database error (e.g. unique constraint violation) should indeed bubble up to the user.

Re GUID keys and how they're generated... By default, when a GUID property is configured as a key (either explicitly or implicitly), EF Core automatically sets up client generation for those. That means that if you don't set a key property (and so it defaults to the default/zero GUID), EF Core generates a GUID itself (this has nothing to do with your NEWID() default value in the database). Again, if you think you're seeing a problem here, a minimal code sample would be the best way to move forward with it - but please open a separate issue as this doesn't seem to be related to the memory leak.

gilmorem560 commented 3 years ago

Let me see what I can do. I definitely can't provide the actual application, but may be able to generate a complete test scenario. I'm thinking a console app + a SQL script to create the DB object first. Since we're going data-first, I want to ensure this aspect is part of the test.

ajcvickers commented 3 years ago

@gilmorem560 If you are using Serilog, then this could be a duplicate of #15214. If this is the case, then you'll need to use Serilog.Exceptions.EntityFrameworkCore.

gilmorem560 commented 3 years ago

That may absolutely be it honestly. I just dropped Serilog from one of the APIs in question, and I get the exception I expected on SaveChanges. Going to do a bit more testing today but I think you nailed it with that one.

I think where we'll land is we're just going to move from Serilog to the Microsoft logging middleware, I've already been eschewing Serilog for MS middleware in new components anywho, but have been rolling some very stale projects forward, and Serilog isn't part of our current "new work" anyway. I'm hesitant to say we're out of the woods without a test in-environment, but I'll be doing that shortly and will follow up.

gilmorem560 commented 3 years ago

And that was precisely it. I can't express how thankful I am for the feedback, I don't think I would've ever figured that out on my own, or at least would've been hopping between catches until I figured out what was eating the exception. I'm sure an explanation for the GUID error I was experiencing contemporaneously will be easier to run down now.

That Serilog error was making proceeding to any other point in troubleshooting virtually impossible for weeks. You just ended a lot of headaches, so thank you very much, I look forward to continuing to grow with .NET!

srini85 commented 1 year ago

If it helps anyone, this is the same problem with NLog aswell. I had some json serializing of the log messages by default and it traversed through all the DBContext properties. Fix was to change the Nlog Configuration to not do json serializing on the error messages. Notice the details attribute is not there in the logconsoleErr

<?xml version="1.0" encoding="utf-8" ?>
<!-- XSD manual extracted from package NLog.Schema: https://www.nuget.org/packages/NLog.Schema-->
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd" xsi:schemaLocation="NLog NLog.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      autoReload="true"
      internalLogFile="c:\temp\console-example-internal.log"
      internalLogLevel="Info" >

  <!-- the targets to write to -->
  <targets>
      <target xsi:type="Console" name="logconsole">
        <layout type="JsonLayout">
        <attribute name="time" layout="${longdate}" />
        <attribute name="level" layout="${level}" />
        <attribute name="msg" layout="${message}" />
        <attribute name="err" layout="${exception:format=tostring}" />
        <attribute name="details" encode="false" >
          <layout type='JsonLayout' includeAllProperties="true"  maxRecursionLimit="5"/>
        </attribute>
      </layout>
    </target>
    <target xsi:type="Console" name="logconsoleErr">
        <layout type="JsonLayout">
      <attribute name="time" layout="${longdate}" />
      <attribute name="level" layout="${level}" />
      <attribute name="msg" layout="${message}" />
      <attribute name="err" layout="${exception:format=tostring}" />
        </layout>
    </target>
  </targets>

  <!-- rules to map from logger name to target -->
  <rules>
      <logger name="*" minLevel="info" maxlevel="warning" writeTo="logconsole"/>
      <logger name="*" minLevel="error" maxlevel="fatal" writeTo="logconsoleErr"/>
  </rules>
</nlog>
ArminShoeibi commented 6 months ago

When handling exceptions thrown by SaveChangesAsync() in EF Core 8.0.2 , particularly when encountering unique index violations on the CouponCode column (the below code), there's a significant spike in memory consumption, reaching up to approximately 20GB.

Steps to Reproduce:

Execute a scenario where SaveChangesAsync() is called after AddRange method, attempting to add entities to the database.

Actual Behavior:

Upon encountering an exception during SaveChangesAsync(), especially related to a unique index constraint violation, the memory consumption escalates significantly, reaching around 20GB.

    private async Task<List<MiniDiscountDto>> AddRangeAsync(Entities.Discount discount, DateTime now, string[] chunkedCoupons)
    {
        try
        {
            await using (var scope = serviceScopeFactory.CreateAsyncScope())
            {

                var db = scope.ServiceProvider.GetRequiredService<OkecDbContext>();
                foreach (var coupon in chunkedCoupons)
                {
                    db.Discounts.Add(new Entities.Discount(discount, now, 0, couponCode: coupon));
                }
                await db.SaveChangesAsync();
                db.ChangeTracker.Clear();
                //await db.DisposeAsync();
                return chunkedCoupons.Select(c => new MiniDiscountDto { CouponCode = c }).ToList();
            }
        }
        catch
        {
            return [];
        }
    }
ArminShoeibi commented 6 months ago

Oh I got it, Serilog is the main source!

gilmorem560 commented 6 months ago

If it helps, we have since re-integrated Serilog and no longer run into issues such as these, our specific packages are:

    <PackageReference Include="Serilog" Version="[redacted]" />
    <PackageReference Include="Serilog.AspNetCore" Version="[redacted]" />
    <PackageReference Include="Serilog.Sinks.Console" Version="[redacted]" />
    <PackageReference Include="Serilog.Sinks.MSSqlServer" Version="[redacted]" />

Versions redacted for security paranoia but with recent versions of these we don't see the runaway memory consumption.