dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.7k stars 3.17k forks source link

ExecuteUpdate/Delete (AKA bulk update, without loading data into memory) #795

Closed akirayamamoto closed 2 years ago

akirayamamoto commented 10 years ago

EF does not provide a batch update mechanism. A proposal is below. Context.Customers.Update().Where.( c => c.CustType ==“New”).Set( x => x.CreditLimit=0)

Will you consider this feature? More details here: https://entityframework.codeplex.com/workitem/52


Design summary: https://github.com/dotnet/efcore/issues/795#issuecomment-1027878116

smitpatel commented 5 years ago

@ilmax - I may not have time to look at the code in near future but you are in the right direction in terms of Update pipeline needs to have a way to talk to query pipeline. We don't know what exact way it would be. Perhaps different layering of query pipeline in 3.0 would provide better infra for that.

Seanxwy commented 5 years ago

This ancient demand, however, has not been realized for so long

divega commented 5 years ago

For clarity can you elaborate what you mean by cascade operations? The only other mention of cascade in this thread is for cascade on delete.

@Tarig0 I was referring primarily to on delete cascade actions. Currently, when an entity is deleted and orphan dependents are supposed to be deleted, we rely on a combination of deleting any dependent that is already in memory with a key-based DELETE statements and with ON DELETE action being configured on the FK constraint to take care of any dependent entities that are not in-memory.

In the past we have discussed in the team about not requiring either, and instead issuing DELETE statements predicated on the value of the FK directly. This could be more efficient, but also not having the foreign keys configured with ON DELETE actions could be good because SQL Server has limitations with cycles of cascade behaviors.

For completness, ON UPDATE actions aren't immediately interesting to EF Core is that we assume all the keys governing relationships are immutable. But if that changes in the future, e.g. if we want to enable mutating alternate keys, then having ON UPDATE actions, or being able to issue UPDATE statements based on the FK values may become more interesting (see https://github.com/aspnet/EntityFrameworkCore/issues/4073#issuecomment-305942376).

borisdj commented 5 years ago

I have just added Batch Ops (Update, Delete) into EFCore.BulkExtensions Batch Extensions are made on IQueryable DbSet and can be used in the following way. They are done as pure sql and no check is made if some are prior loaded in memory and are being Tracked.

context.Items.Where(a => a.ItemId >  500).BatchDelete(); // .BatchDeleteAsync();
context.Items.Where(a => a.ItemId <= 500).BatchUpdate(a => new Item { Description = "Updated" });
var incrementStep = 100;
context.Items.Where(a => a.ItemId <= 500).BatchUpdateAsync(a => new Item {
    Description = "Updated",
    Quantity = a.Quantity + incrementStep 
});
BalassaMarton commented 5 years ago

BulkExtensions is impressive, but it takes away one of the core features of EF: your application code being independent of the database provider.

borisdj commented 5 years ago

Currently it supports (Ms)SQL Server and SQLite. I also did research on extending it for PostgreSQL(Npgsql), which is next on ToDo list.

thongdoan commented 5 years ago

Can we do something like: Insert:

ctx.CreateStatement<Bar>()
   .Set(b=> b.Title, "Hello")
   .Set(b =>b.Message, "World")
   .ExecuteInsert();

Update:

ctx.CreateStatement<Foo>()
   .Set(f => f.Avatar, "avatar-1.png")
   .Set(f => f.FullName, f => f.FirstName + " " + f.LastName)
   .Set(f => f.Sum, f => f.Sum + 3)
   .Where(f => f.Id < 3)
   .ExecuteUpdate();

Delete

ctx.CreateStatement<Foo>()
   .Where(f => f.Id > 3).
   .ExecuteDelete();
MaklaCof commented 5 years ago

I prefer syntax from ZZZ projects (GitHub).

// DELETE all users which has been inactive for 2 years
ctx.Users.Where(x => x.LastLoginDate < DateTime.Now.AddYears(-2))
               .Delete();

// UPDATE all users which has been inactive for 2 years
ctx.Users.Where(x => x.LastLoginDate < DateTime.Now.AddYears(-2))
              .Update(x => new User() { IsSoftDeleted = 1 });
roji commented 5 years ago

@MaklaCof instantiating a new User object on the heap just to express an update operation seems excessive... But the API could just as well simply allow you to modify properties on the given User object directly.

stevehansen commented 5 years ago

@roji expression trees aren't executed, so the new User() would never be invoked but the expression can be used to see which properties are accessed so that it can be translated to the correct SQL.

roji commented 5 years ago

@stevehansen of course you're right, had a confused moment there. But it still seems a needlessy verbose compared to a simple Update(x => x.IsSoftDeleted = 1) for a single property change, and a block lambda for multiple.

EDIT: Unfortunately assignment isn't currently supported in C# expression trees

thongdoan commented 5 years ago

It look great, but i think it is better if we can do something like:

bool UpdateBar(UpdateBarRequestModel model) 
{
        var smt = ctx.Bars.Where(x.Id == model.Id)
                                .AddChanges(x => new Bar
                                {
                                    Name = model.Name != null ? model.Name : x.Name,
                                    Description = model.Desc != null ? model.Desc : x.Desc,
                                    Notes = model.Notes != null ? model.Notes : x.Notes
                                });
        return smt.ExecuteUpdate() >= 0;
}

void CreateOrUpdateBar(CreateOrUpdateBarRequestModel model)
        {
            var smt = ctx.Bars.Where(x.Id == model.Id)
                              .CreateChangesStatement();

            if(model.Name != null)
                smt.AddChanges(x => x.Name = model.Name);
            if(model.Desc != null)
                smt.AddChanges(x => x.Desc = model.Desc);
            if(model.Notes != null)
                smt.AddChanges(x => x.Notes = model.Notes);

            if (model.Type != null)
            {
                smt.AddChanges(x => x.Type = model.Type);

                switch (model.Type)
                {
                    case BarType.Simple:
                        smt.AddChanges(x => new Bar
                        {
                            OtherField1 = 1,
                            OtherField2 = 2,
                            OtherField3 = 3,
                        });
                        break;
                    case BarType.Complex:
                        smt.AddChanges(x => new Bar
                        {
                            OtherField3 = 3,
                            OtherField4 = 4,
                            OtherField5 = 6,

                        });
                        //It will be great, if we can add new condtion:
                        smt.AddWhere(x => x.OtherField4 == 'Can_Become_Complex')
                        break;

                }

            }
            if(smt.ExecuteUpdate() <= 0) {
                smt.AddChanges(x => x.Id = model.Id);
                smt.ExecuteInsert();
            }
        }
roji commented 5 years ago

@thongdoan don't forget there's DbContext.SaveChanges() for actually executing inserts/updates, so introducing a new Execute doesn't seem to make much sense. It's also not very clear what the added AddChanges()/AddWhere()/Execute*() complexity gets us compared to a simple Update() API.

thongdoan commented 5 years ago

@roji you are right, i just tried to find a flexible way, it look complexity, let skip it :)

WeihanLi commented 5 years ago

Any plan for this feature?

GeraudFabien commented 5 years ago

On EF I used to use dapper for bulk and EF for everything else. Since dapper use the dB connection of the dB context it work well. On EF core I didn't find a way to use the two side by side (and since I didn't have a real need for now I use dirtier solutions ). Why not just make a official package to use the two side by side?

ajcvickers commented 5 years ago

@WeihanLi This issue is in the Backlog milestone. This means that it is not going to happen for the 3.0 release. We will re-assess the backlog following the 3.0 release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources.

@GeraudFabien If there is something specific that we could do that would make using Dapper and EF Core together easier, then can you please file a new issue and include full details.

WeihanLi commented 5 years ago

@ajcvickers thanks for your info, hope it'll release asap

BalassaMarton commented 5 years ago

@ajcvickers The key feature missing is the ability to generate commands that do the actual delete/update. Do you think this can be achieved with extending infrastructure types like UpdateSqlGenerator? How much effort would be to extend these types for mass update and mass delete?

ajcvickers commented 5 years ago

@BalassaMarton The challenging work here is not generating the SQL, but rather defining the API experience and behavioral interactions with EF Core--for example, what happens to tracked entities?

BalassaMarton commented 5 years ago

@ajcvickers that's true from a framework architecture point of view, but to solve the actual problem for the users, exposing an API that can generate the SQL could be sufficient. I wouldn't mind if bulk operations threw an exception when the context already has tracked entities. The problem with this lack of bulk functionality is that even though we are programming against an entity model, we are forced to write raw SQL updates for scenarios where tracking entities just won't scale.

djechelon commented 5 years ago

@ajcvickers I wrote a comment on this topic long time ago (https://github.com/aspnet/EntityFrameworkCore/issues/795#issuecomment-219519060) and I would still like to return to the point.

IMO a feasible solution could be running the Update both to the SQL datasource and into the Entity cache

I assumed that if Context has tracked entities they can be easily identified by their Type in the context cache, and the lambda update statement can be run on that the same way we manipulate a List<T> in memory.

That assumes the changes are simple and "independent". What is challenging is to run a SQL statement that massively updates data into the data source when at least one of the column was already updated in the context. We could start listing some simple and then complex examples and their expected result to find patterns and showstoppers.

Example: what happens if you first change the Salary of a tracked Employee, then bulk change the Bonus of all employees to a percentage (read: "formula") of their salary? What is the final state of the tracked employee?

//Given
Employee bob = context.Employees.byId();
//bob.Seniority=10;

//When
bob.Salary = 25500;
context.Employees.Where(e=>e.Seniority > 5).Update(e=> e.Bonus = e.Salary / 20 ); //5%
await context.SaveChangesAsync();

//Then
//Assert that two statements have been executed
    UPDATE Employees set Salary = 25500 where Id = 'Bob';
    UPDATE Employees set Bonus = Salary / 20 where Seniority > 5;
//Assert Bob bonus has changed in the tracked context, but not propagated to a SQL statement
//Bob.Bonus = 1275

That was a way simple example indeed

Seanxwy commented 5 years ago

is very cool, but has not been achieved, perhaps the real reason is lazy

wizofaus commented 5 years ago
bob.Salary = 25500;
context.Employees.Where(e=>e.Seniority > 5).Update(e=> e.Bonus = e.Salary / 20 );
bob.Salary = 30000;
await context.SaveChangesAsync();

Would this need to produce

 UPDATE Employees set Salary = 25500 where Id = 'Bob';
 UPDATE Employees set Bonus = Salary / 20 where Seniority > 5;
 UPDATE Employees set Salary = 30000 where Id = 'Bob';

?

MeTitus commented 4 years ago

4 years later and this is still open, when can we expect this to be implemented?

Thanks

AndriySvyryd commented 4 years ago

@MeTitus As soon as it gets through our planning process above other features.

GeraudFabien commented 4 years ago

If you have read you know that you can also use Dapper side by side with ef [core]. If you want it mods now

Seanxwy commented 4 years ago

I think linq2db is a good choice, better than dapper

mojtabakaviani commented 4 years ago

please add to milestore 5

ajcvickers commented 4 years ago

See also #19631

BalassaMarton commented 4 years ago

@ajcvickers what's the recommended workaround/solution for these use cases, according to Microsoft? I'm curious why there's no resolution whatsoever after 6 years, while I can't imagine a project with a database but without massive inserts/updates/deletes.

ajcvickers commented 4 years ago

@BalassaMarton The way we decide how to prioritize issues is documented in the release planning process. This is issue currently eighth in the most voted issues list.

AndriySvyryd commented 4 years ago

@BalassaMarton If this is a perf bottleneck in your project you'll have to drop down to raw SQL as a workaround.

AndriySvyryd commented 4 years ago

Related: https://github.com/dotnet/efcore/issues/18990

BalassaMarton commented 4 years ago

@BalassaMarton If this is a perf bottleneck in your project you'll have to drop down to raw SQL as a workaround.

That's what I'm trying to avoid, as it makes refactoring and tracking table/column usage way harder. In hindsight linq2db would have been a better fit for my current project, but in general EF Core seems favorable as a repository/unit of work implementation.

colotiline commented 4 years ago

@BalassaMarton

it makes refactoring and tracking table/column usage way harder

Did you try to use nameof?

BalassaMarton commented 4 years ago

That just makes the inline scripts even harder to read.

colotiline commented 4 years ago

@BalassaMarton use 2 ORMs? EF Core and Linq2Db?

BalassaMarton commented 4 years ago

I considered it but so far haven't had the time to fully test how it handled EF quirks like value converters and query filters.

MaklaCof commented 4 years ago

@BalassaMarton I also have very good experience with ZZZ entity framework extensions (not free!)

mrjrt commented 4 years ago

I found myself in this position a few weeks ago, at to be honest the raw SQL route wasn't too bad at all - few days worth of coding to get a set of generic methods for add/update/delete that aren't hardcoded to a given schema. I interrogate the EF model to grab the properties and annotations and it's easy enough supporting shadow properties, value convertors etc. and fire off giant parameterised single statements to the provider. Gives great performance, just wish EF did it natively.

Seanxwy commented 4 years ago

Can 5.0 be implemented again?

Seanxwy commented 4 years ago

@ajcvickers Will this feature not be implemented?

roji commented 4 years ago

@yuanxiongwei this won't be implemented for 5.0, but it's likely to be a high-priority feature for EF Core 6.0. In the meantime, you can take a look at the 3rd-party extensions mentioned above.

Seanxwy commented 4 years ago

o my god, 6.0 is still a possibility?

vslee commented 4 years ago

o my god, 6.0 is still a possibility?

He means EF Core 6.0, not EF6 (legacy).

Seanxwy commented 4 years ago

I hope so

APIWT commented 3 years ago

If anyone is looking for something that works well for update operations, but also supports insert or update (aka Upsert) I would check this out: https://github.com/artiomchi/FlexLabs.Upsert

It does not update tracked entities, so you may want to force reload them afterwards.

Seanxwy commented 3 years ago

Hopefully these features will be officially supported. Please support in 6.0

John0King commented 3 years ago

suggest:

public partial DbContext
{
    DirectAdd<TEntity>(TEntity entity, Expression<Func<IQuerable<TEntity>,bool>> condition = null);
    DirectRemove<TEntity>(IQueralbe<TEntity> query);
    DirectUpdate<TEntity>(IQueralbe<TEntity> query, Expresssion<Func<TEntity,object>>);
}

usage:


db.DirectAdd(new Blog{ name = "blog" }, b=> !db.Blogs.Any());
// INSERT INTO blogs(x,x,x)  SELECT (x,x,x)  WHERE blogs.Name WHERE not Exists( SELECT * FROM BLOG )

db.DirectRemove( db.Blogs.Where(b=>b.Name == "blog_") );
// DELETE FROM blogs WHERE blogs.Name == 'blog_'

db.DirectUpdate(db.Blogs.Where(b.Name == "blog_")), b=> new {  name = b.name+ "good blog"  });
// UPDATE Blogs SET Name = Name + 'good blog' WHERE Name == 'blog_'