dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.65k stars 3.15k forks source link

Support AddAsync && AddRangeAsync #2291

Closed davidroth closed 8 years ago

davidroth commented 9 years ago

If EF7 is configured with GenerateValueOnAdd(true), sql server sequences are used, and so every time Add(entity) is called, EF7 will send a query to the store so that it can assign a new key to the entity.

I remember I have seen AddAsync && AddRangeAsync overloads in previous Ef7 releases. Why did you remove them?

I found some notes here: https://github.com/aspnet/EntityFramework/wiki/Design-Meeting-Notes:-January-16,-2015 ("Update: we decided to remove AddAsync entirely") but i could not find any reasoning for this change.

divega commented 9 years ago

Good question! Sorry we left the details out of the notes.

The removal of AddAsync() was a compromise and (as usual) one that we are very open to get feedback on.

Our understanding of how to best apply the Task-based async pattern to API designs is still somewhat evolving. Despite how powerful the support for async programming is in C#, synchronous APIs are still easier to use correctly in many circumstances. In particular, it is hard to use async APIs correctly when from synchronous code :smile:

In principle we create async versions of APIs to enable server applications to scale better by saving threads that would otherwise be wasted waiting for I/O operations to be completed.

Following that principle we originally introduced AddAsync()in the very first prototype of EF7 to handle asynchronous implementations of value generators. But when we later on started fleshing out the rest of the API we found additional places where we had to either call into AddAsync() or implement the same behavior as AddAsync(), among others:

Suddenly async had become “viral”: any time you use await the whole call stack has to be async, so we needed/wanted ChangeStateAsync(), DetectChangesAsync() (AFAIR we actually implemented those two), but how do we figure out a way to make setters on navigation properties be async?

It just looked like a slippery slope in which we would end up making our API much more complex in exchange for very little value. (After all, correctly implemented value generators should not be chatty on the network. E.g. we use a default value block size of 10, meaning that only once every ten calls to Add() will actually cause a network round trip. If this isn’t good enough, you can increase the block size or switch to a client-based value generator.) We didn't want to go down the path of adding all async versions of all those methods and hence we decided to remove it entirely. Did we make the best choice? We don't know with certainty.

Another way to approach the problem is shown in the Stream class, in which you can call FlushAsync() before you call Close() or Dispose(). That flushes any pending writes asynchronously and prevents the latter two from triggering an implicit synchronous Flush() blocking the thread. Perhaps we could bring back only AddAsync() with a similar usage and purpose and avoid all the DetectChangesAsync() ChangeStateAsync(), etc. That said, at this point this isn't really a super high priority (especially because as I said, value generators shouldn't be too chatty) so we will wait for customer feedback on it.

divega commented 9 years ago

Note for triage:

After thinking about this again while writing the explanation above I came to the conclusion that it would make sense to bring back AddAsync(), but not DetectChangesAsync() or any other "async by contagion" APIs under DbContext. As I as explained in the last paragraph, customers could use AddAsync() in their code to make sure value generation is performed asynchronously (this would also imply bringing back the async paths for value generation) before they call into other APIs that would implicitly trigger synchronous value generation.

That said, this doesn't seem to be high enough priority to do it in the EF 7.0.0 release so we can either move the issue to the backlog or close it and wait for additional feedback.

davidroth commented 9 years ago

TLDR; Thanks @divega for your detailed explanation!

I understand your concerns about async having "viral" characteristics. I have some feedback though:

To my knowledge there are 2 scenarios were value generation kicks in:

- Calling Add() or AddRange() orDetectChanges()

This is an obvious explicit call to the EF Apis and the developer knows that value generation may take place. I think it would make sense to bring AddAsync() and AddRangeAsync() back here and I also think that DetectChangesAsync() makes sense for consistency. This would be especially useful for rich client scenearios.

- Setting a navigation property to an attached graph

EF immediately runs value generation for the added property if the target entity implements INotifyPropertyChanged

This is a non obvious implicit operation which may not be expected/wanted based on the application type. It may be ok in an web application where it is expected that multiple queries take place in the current request (one logical flow of execution)

But for example in a rich client scenario with direct database access, it is not desired to have implicit db queries running on the main application thread. It is not possible to handle possible network failures and the application may hang some while if the network connection is currently slow or the operation takes a lot of time. Furthermore, if there is some code which assigns multiple new navigation property values to a graph, there may occur N+1 problems, because every property assignment would trigger a database query. This is also something that also affects the web scenario.

It is obviously not possible to provide an async api for setting a navigation property on an entity as the property is not async by nature. So I think EF7 should provide a configuration setting, to disable "implicit" value generation when adding entities to a context. If the developer would decide to disable it, only explicit API calls through Add, AddRange ,DetectChanges and SaveChanges() would trigger value generation.

EF7 already has this non-implicit value generation behavior when adding a child to a collection of an entity attached to a context. For example if you have a look at the following model:

public class Blog
{
     public ObservableCollection<Post> Posts { get; set; }
}

modelBuilder.Entity<Post>()
  .Property(x => x.PostId).GenerateValueOnAdd();

and the following code:

using (var context = new BlogContext())
{
    var blog = new Blog();
    context.Add(blog);  

    var post = new Post();
    blog.Posts.Add(post); // Does not trigger value generation, even though blog.Posts implements INotifyCollectionChanged

    context.SaveChanges(); // Triggers value generation for "post"
}

EF7 will not trigger value generation when the post is added the the Posts collection. However EF7 does trigger value generation if an entity is added to a navigation property when the Blog implements INotifyPropertyChanged:

public class Blog : INotifyPropertyChanged
{
     public event PropertyChangedEventHandler PropertyChanged;
     public User User { get; set; } // implement and raise event.
}

modelBuilder.Entity<User>()
  .Property(x => x.UserId).GenerateValueOnAdd();

and the following code:

using (var context = new BlogContext())
{
  var blog = new Blog();
  context.Add(blog);    

  blog.User = new User(); // Value generation triggered during this call.
}

I am not sure if this is "by design", or just missing in the current "beta 4". But there is no obvious reason of the different behavior here.

Conclusion

I am looking forward to your feedback :)

GSPP commented 9 years ago

Note, that it is not necessary to make everything async from an efficiency standpoint. It is totally OK to use EF mostly synchronously and only make time-consuming queries async. Async does not make operations any faster, it merely saves a thread while the IO is running. The thread saving benefit is proportional to the time that the async operation takes.

Your reasoning is correct: Not all APIs must be async from an efficiency standpoint.

rowanmiller commented 8 years ago

Be sure to update and reenable the test from https://github.com/aspnet/EntityFramework/issues/5254 when this is done.