dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.77k stars 3.19k forks source link

SQL Server bulk insert broken when new command applies pending commands and goes over the batch parameter limit #29539

Closed gaglia closed 1 year ago

gaglia commented 1 year ago

Hi,

I'm experiencing strange behavior using EF 7 with the generated SQL to perform the insert on database.

When I go to save a series of tables through SaveChanges two records of a table are not inserted in the SQL Server database. From my first analysis it seems that the problem occurs when there is a consistent number of records to insert for the same table so that the SQL generated by EF 7 is broken into different inserts.

Analyzing with SQL Server Profiler I found that two of these inserts are not created correctly in the SQL generated by EF 7 therefore only 64 records are inserted on 66 entities of type veDocumenti_Righe

I have attached the detail of the DbContext before the SaveChanges where you can see that there are all 66 entities in the state of Added

However, analyzing the generated SQL code, I saw that for the two rows that are not actually inserted, the insert statement in the database is missing. If you analyze the sql code that I attached in the SQLInsert01.sql file you will see that for example the @p2165 parameter is declared and valued correctly but is not then used in the insert

I await your kind feedback Thank you

Provider and version information

EF Core version: 7.0 Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: .NET 7.0 Operating system: Win11 IDE: Visual Studio 2022 17.4

SQLServerProfiler_Insert.zip

roji commented 1 year ago

@gaglia I'll take a look, but it's typically difficult to get to the root cause of a bug from a few SQL statements.

Are you able to submit a minimal, runnable code sample that reproduces the error?

gaglia commented 1 year ago

@roji I guess it's not easy I understand very well. And that unfortunately it is not even that simple to be able to extrapolate a test from my project but I try! I hope to be able to give it to you as soon as possible Thank you

roji commented 1 year ago

@gaglia it's likely that this is a duplicate of #29502 (also raised in #29562).

I've fixed this in our main branch; can you please try the latest daily build and report whether the problem still occurs?

gaglia commented 1 year ago

Hi @roji

I'm trying but I still get the problem. I have a doubt to use the latest daily build correctly but I think I'm doing it correctly:

I created the Nuget.config file as mentioned in the guide:


<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <disabledPackageSources>
   <clear />
  </disabledPackageSources>
  <packageSources>
    <clear />
    <add key="dotnet7" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet7/nuget/v3/index.json" />
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
  </packageSources>
</configuration>

I then used wildcards in the PackageReference:

<ItemGroup>
  <PackageReference Include="Microsoft.EntityFrameworkCore" Version="7.0.0-*" />
  <PackageReference Include="Microsoft.EntityFrameworkCore.Design" Version="7.0.0-*">
  <PrivateAssets>all</PrivateAssets>
    <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
  </PackageReference>
  <PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="7.0.0-*" />
</ItemGroup>

Is that enough? Or should I use version 8.0.0-alpha.1.22416.5 ?

Using 7.0.0-* the problem still exists, I checked what you indicated here #29502 and here #29562 but I was just using HasTrigger():

entity.ToTable(tb => tb.HasTrigger("TR_veDocuments_Rows__Insert"));

The model was generated using EF Core Power Tools and it had already automatically entered the indication of the trigger

On SaveChanges I don't get any error but 2 entities out of 66 of type veDocumenti_Righe are not inserted because they are "ignored" by the SQL code generated by EF as indicated above.

I'm trying to create a mini project by pulling it out of my main project but I can't seem to replicate the problem right now. The problem I can only replicate in my main project when I have to insert more than 40 entities of type veDocumenti_Righe at the same time

Let me know if I have to try version 8.0.0-alpha.1.22416.5, meanwhile I'll continue to see if I can create a project for you to replicate the problem

Thank you

roji commented 1 year ago

@gaglia the fix hasn't been merged to 7.0 yet, and we don't generally publish daily build packages for patch releases; so yes, you need to use the latest 8.0.0-alpha daily build.

If you're not sure which version you're building, do dotnet restore and then check obj/project.assets.json for the precise version.

gaglia commented 1 year ago

@roji unfortunately the problem persists even with version 8.0.0-alpha.1.22416.5

The problem keeps happening on my main project which has a model with about 800 mapped tables

I'm trying to replicate the problem with a small model of my project to give you the solution to analyze but I haven't been able to yet. It seems that with the smaller model the problem does not arise.

I'm trying to replicate the same situation I hope to succeed as soon as possible

I am attaching the project.assets.json file to make sure I tried with the right version :) project.assets.zip

Thank you

roji commented 1 year ago

8.0.0-alpha.1.22416.5 corresponds to commit a999af786f5ff4196630415bb1fd95c3ff9c55e2, which is from August; this is likely because our daily build still point to the dotnet7 feed. Change your nuget.config to point to the dotnet8 feed instead:

<add key="dotnet8" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet8/nuget/v3/index.json" />

After that, I'm getting package 8.0.0-alpha.1.22567.1, which corresponds to commit 8fdc756b3eef98364f4de3f4eed4dcf6b29329e2 which has the fix.

gaglia commented 1 year ago

Sorry @roji now I corrected the NuGet.config and I tried both with the version indicated by you 8.0.0-alpha.1.22567.1, and with the latest available yesterday 8.0.0-alpha.1.22570.1 but unfortunately the problem in my project is still present.

I'm trying to make a test project that I can give you where to replicate the problem but as mentioned earlier I'm having trouble replicating the problem in another simpler project.

gaglia commented 1 year ago

Hi @roji I finally managed to replicate the problem in a smaller project that I can give you.

I tried to put in the code comments what happens.

Quickly summarize in this case I have 80 entities of type veDocumenti_Righe in Added state that should be saved but after the SaveChanges only 57 are inserted the other 23 remain in an incorrect state with the primary key at 0

I'm trying with version 8.0.0-alpha.1.22571.6 In the SQLScripts folder of the WpfApp.DataAccess project I placed the SQL scripts to create the tables and data to start the project.

If you need any other information, I'm available

A thousand thanks

EFCoreTest.Solution.zip

roji commented 1 year ago

@gaglia thank you, I will look at your repro very soon.

roji commented 1 year ago

@gaglia I took a look at your repro, but was unable to get it to work; it's a pretty big WPF application without clear instructions on how to create and/or seed the database. I added db.Database.EnsureCreated() in MainWindow.xaml.vb.ButtnoNewDoc_Click but got some NullReferenceException.

Can you please narrow this down to a minimal repro, ideally in a simple, console program (no need for WPF), without extra code that isn't necessary for the repro, and with clear instructions on how to reproduce the exception?

gaglia commented 1 year ago

Hi @roji I have created a new console application as you suggested.

To create the database I created a sql script to be run manually: CreateDatabase.sql You can find it inside the solution in the App project

At that point launching the execution of the program generates the problem.

I tried to describe to you what happens in the code comments:

var varEntities = db.ChangeTracker.Entries().Where(entry => entry.State == EntityState.Modified || entry.State == EntityState.Added);

//80 rows of type veDocuments_Rows to added
var linesToAdded = varEntities.Where(e => e.Entity.GetType().Name == "veDocuments_Lines").Count();

db.SaveChanges();

//After SaveChanges see in sql management studio 57 rows added no 80
//SELECT *
//FROM veDocuments_Rows
//WHERE IdDoc_T = (SELECT TOP 1 IdDoc_T FROM veDocuments ORDER BY IdDoc_T DESC)

// 57 rows added instead of 80
int linesAdded = document.veDocuments_Lines.Where(r => r.IdDoc_R > 0).Count();

// 23 rows not insert with identity primary key IdDoc_R = 0
int linesNoIsert = document.veDocuments_Lines.Where(r => r.IdDoc_R == 0).Count();

I hope so you can be able to test the application.

Thanks again

EFCoreTest.SolutionNew.zip

roji commented 1 year ago

I can repro this and can confirm there's a bug.

The problem occurs when there are pending bulk inserts, and a new command comes in which is incompatible with them (so causes the pending commands to get applied), but when it's added after that, causes the batch to go over the parameter limit. The problem is that in ReaderModificationCommandBatch, we snapshot the SQL builder and ResultSetMapping positions, and if we fail the validity checks, we roll back to them; but since adding a command does two things in this scenario (apply pending bulk inserts AND add a new command), both get rolled back and the pending bulk inserts disappear.

gaglia commented 1 year ago

@roji thanks for the feedback

I await your feedback to possibly try a fix as soon as it is available

Thanks again

roji commented 1 year ago

After this is merged, I'll post back here with the daily build version which you can then try and confirm. I'm also pretty confident we'll be patching 7.0 with this fix.

ErikEJ commented 1 year ago

Reopen for patching?

roji commented 1 year ago

Yep, am preparing the PR now.

gaglia commented 1 year ago

Hi @roji is it possible to know when the EF 7 patch with the fix for this problem will be released ?

Why should I schedule a production release of our software with EF 7

A thousand thanks

roji commented 1 year ago

@gaglia the patch PR is currently (#29669) out but not yet merged; there's a chance this may be released in the January patch, but if not, in February. In the meantime I recommend using the daily build - it should be just as stable. Otherwise, you can reduce the maximum batch size to 1 to work around the issue, though that will impact performance in a significant way.

crowet commented 1 year ago

This failed one of my unit tests when upgrading from v6 to v7. I came up with a similar solution, though mine was more of a hack directly with the SqlBuilder as I was having a difficult time figuring out the correct placement of the fix. I prefer this solution and will be using it in my own code directly as a copy until the Jan/Feb release.

My workplace doesn't permit pulling from the daily build feed as suggested above, so instead might I suggest creating a copy of the SqlServerModificationCommandBatchFactory and SqlServerModificationCommandBatch classes using the fixed source code branch and replacing the namespace with Microsoft.EntityFrameworkCore.SqlServer.Update.Internal.Hotfix for both. I then place both files in my project containing my DbContext configuration. When configuring I replace the services by calling ReplaceServer. i.e. options.UseSqlServer(...).ReplaceService<IModificationCommandBatchFactory, Microsoft.EntityFrameworkCore.SqlServer.Update.Internal.Hotfix.SqlServerModificationCommandBatchFactory>().

Ran my unit test again, and presto, everything now passes!

ajcvickers commented 1 year ago

@crowet Nice. Good to see the internal D.I. system working as it should.

crowet commented 1 year ago

@ajcvickers off topic but fyi, being able to replace any internal component in the system has been a tremendous benefit for me. I use it all the time when I want to patch some quirk I don't like, especially in how I manage migrations (where I pull the branch for the tagged version in github and copy and change code) and it works much better than I thought it would. I currently manage legacy stored procedures, functions, and views as well as seed data in a fork migration in a separate assembly where it inherits all the migrations of the schema base with custom __MigrationHistory_Seed_Data table to manage the seed data and __MigrationHistory_Legacy to manage the stored procedures, etc. That forking mechanism works by subclassing the Migrator class and using the internal D.I. to replace the IMigrator service with my own internal version. It works like a charm and requires little changes every major version update. There is a competing migration product that can manage all this very well (even better IMO) but doing it all in EF Core with custom extensions means I get to break free of windows dependencies that the other vendor requires. Also, by it being open source means I have more control to fearlessly swap things out I don't like.

roji commented 1 year ago

@crowet that's really nice to hear! Note the distinction between internal services and APIs (anything in an Internal namespace or annotated with an [EntityFrameworkInternal] attribute), and public ones. We are far less likely to make breaking changes to public ones, so it's a good idea to stick to those whenever possible. You can still override and do stuff with internal APIs, but then you're running more risk of being broken by major version changes.

If you run into a case where you find yourself needing to interact (or replace) an internal API, and you think a public API is missing, let us know.

crowet commented 1 year ago

@roji That's good advice! Unfortunately, the type of things I needed to do I had to get my hands dirty in the internal namespaces. Overall, I've only have maybe 1-4 hours of rework every other major release (~every 2 years), and it's mostly trivial so far, like trying to figure out what new dependencies I need to add to my function. I can't think of any public apis other than my usage scenarios I mentioned in the above previous post. I like to keep my seed data separate from my main schema as well as full management of legacy stored procedures, etc. for Sql Server so that I can easily build up a CI pipeline to unit test my legacy stored procedures.

Other than that, I think it would be cool if you all had a hotfix mechanism where someone can take a series of important bug fixes from GitHub and have them automatically apply to a published release prior to the release going out, basically automating what I had to do manually other than pulling a daily build. Not sure how feasible that is, but it would be really cool. Maybe alternatively, do bugfix releases on a much faster cadence. It's painful to upgrade when you are almost done, realize there is a major bug in a key usage scenario for you, then to have to either diagnose and fix yourself or rollback and wait months for a bugfix release. I will admit though, it has come a long way, where hot fixing yourself is orders of magnitudes easier than the Entity Framework (non-core) days.

GeorgDangl commented 1 year ago

FYI, we've encountered the same issue while upgrading one app to EF Core 7.0. Our use case was similar, we had a model like the following:

public class Base
{
  public Guid Id { get; set; }
  public List<Row> Rows { get; set; }
}

public class Row
{
  public Guid Id { get; set; }
  public List<Detail> Details { get; set; }
}

public class Detail
{
  public Guid RowId { get; set; }
  public string Label { get; set; }

  public static void OnModelCreating(ModelBuilder modelBuilder)
  {
      modelBuilder.Entity<Detail>()
          .HasKey(t => new { t.RowId, t.Label });
  }
}

The data model actually represents some price calculations. So when we imported a project, we'd generate an object graphs of many Base objects with pre-filled Rows and Details. At around 70 elements, we noticed that _context.SaveChangesAsync() was failing with foreign key violations, and examining the SQL seemed to indicate that not all Row entities were saved before their Detail entities were saved, thus leading to an exception / SQL server error due to violating a foreign key constraint.

The prerelease version 8.0.0-alpha.1.22629.1 seemed to fix the problem, so I'm also looking forward to the 7.0.3 release😀