Allow DbConnection to be set/changed on an existing DbContext

sjb-sjb commented 7 years ago

Here is the situation I find myself in: I do a number of reads, and I want these to be able to overlap with each other i.e. concurrent execution, or at least async / uncoordinated requests for data. So I do the reads with separate contexts each having its own connection. So far so good. Then the user may change some of these records, and these changes are tracked by the various contexts that were used to read them. Also fine. Now I want to write all the changes back in a transaction. But I can't use a transaction because the various contexts do not share the same connection. What do I do?

This question can arise in a situation as simple as a virtualized collection for a list view, where each page is read separately using a separate context. Or it could be in a situation more tied to the application design, such as different parts of the application accessing different parts of the data but sometimes needing to be presented and edited in a single place.

The only alternatives I see are the following, each of which seems to have serious problems. How am I supposed to do it?

Share a connection across the read contexts, and share the same connection to do the transacted write. Problem: reduced performance and the inconvenience of having to serialize the read requests in my code, because of not being able to do multiple concurrent reads on the same connection (? if I understand correctly this is a limitation, if not in the API then in the db).
Detach the entities after reading them and immediately reattach them all to a single context; track the user changes in that context and use that context to write. Problem: takes time to detach and attach, seems unaesthetic since you have all the entities glommed into a single context, also may not be suitable to the application architecture to have a single centralized context, context may be long-lived, etc.
Detach the entities after reading them, modify (some of) them outside of any context, then put the changed entities back into a single context just before writing. Problem: would have to do your own change tracking, which is a lot of work and which would require a lot of data structures essentially redundant with the EF change tracking, identity resolution, entity collections, etc.
Keep the entities in the contexts that were used to read them and track changes there; but before doing the write, change the connections of the various contexts so that they all share a single connection; then use this shared connection to do the transacted write. Problem: doesn't seem to be any way to do this in the API, even though the original connections are all closed / not being used. (Would also have to set the connections back again if you wanted to reuse the contexts for more reads).
Keep the entities in the contexts that were used to read them, with change tracking done in those contexts; before the write, move all of the changed entities along with their tracked changes into a new context and use that single context to write. Problems: not sure how to move the tracked changes from one context to another; also it might not suit the application architecture to have entities moving around from one context to another, so it might be necessary to move them back again after the write (or else copy the entities and then accept changes in the various original contexts).
Use DTC or some other large-grained transaction mechanism. Problem: too heavy and inefficient to use for a single database.
Use an ambient transaction i.e. TransactionScope. Problem: it has been nixed from EF Core, also not obvious that it would work when the contexts all have separate connections created outside the scope of the TransactionScope (which will be the case if the contexts are used for reads in the way that I suggest).

So ... of the above, the most palatable would seem to be option 4. However I believe this would require a change to EF to enable changing the connection in a DbContext (throw if the connection is not closed at the time the change is attempted).

Other than option 4, option 5 would seem to be the next best, and it seems feasible. But if you have to copy all the changed entities into a single context to write them, then what is the point of cross-context transactions in the first place?

It feels like transactions are not doing anything for me on an application-wide / cross-context basis, only within a single DbContext.

Thoughts on how to do this? Am I missing something obvious?

sjb

ajcvickers commented 7 years ago

@sjb-sjb We will look into allowing setting/changing the DbConnection on a context--number 4 above.

Note for implementer: see #8427, since setting the connection string is not dissimilar to setting the connection.

sjb-sjb commented 7 years ago

Thanks, @ajcvickers !

In terms of the design of the enhancement, I would say this is more about transactions than about the connection per se. What we need is an extension method like this :

static public async Task SaveChangesAsync<T>( IEnumerable<T> dbcontexts) where T: DbContext

This method would save the existing connections in the dbcontexts, set a new shared connection, start the transaction, await the SaveChanges async of each context, and finally restore the connections in the contexts. Some thought would have to be given to the error handling but presumably it would be similar to the approach used in the single-context SaveChanges async. An exception should be thrown if any of the contexts have open connections when the method is called.

ajcvickers commented 7 years ago

Consider also #9244 when working on this.

sjb-sjb commented 6 years ago

Just a comment on this. Eventually I threw out the approach I had been using and went to something that is probably more compatible with the "unit of work" concept. Namely, I dispose the contexts after using them to read entities. The entities are then managed on a standalone basis. If it is necessary to write, then I use a new context for all of the changes and attach the entities to that new context. This is simpler and avoids the problem of having multiple contexts in use at the same time. It is a variation on (2) of the original post but I don't reattach the entities until/unless a change to the entity is to be saved. So I no longer am in active need of a way to transact across contexts. The flip side is one must live with the possibility of concurrency conflicts in case two sets of changes from the same UI happen to overlap, however in practice this is not too much of a problem. Also one has to be able to deal with concurrency conflicts anyway.

ajcvickers commented 6 years ago

Removing this from milestone for re-triage. We've seen several requests for reads using one connection and writes using another connection, which would seem to be best utilized by allowing this. Also, the scenario of sharing a connection for transactions would be made a lot easier by enabling this--see #13041.

marcwittke commented 5 years ago

The only thing that I would need, is having the RelationalConnection.ClearTransactions() as public method:

My architecture for a web application is like this:

Connections are opened by the DI-infrastructure (to have guaranteed disposal, and to be able to use one transaction spanning more DbContext instances of different types)
The DbContextOptions instance has a scoped lifetime, and is being created with .UseSqlServer(connection)
A UnitOfWork instance takes care of opening a transaction on the connection, and passing it to the DbContext via .UseTransaction()

A very special request type needs to commit the transaction, do something "long" (Sending SMTP), and use the DbContext again with a new transaction. This is due to increased deadlock probability during the "long" operation, that is designed in a way that do not require locked data.

To prove the concept i tried this:

public T CompleteCurrentTransaction_InvokeFunction_BeginNewTransaction<T>(Func<T> func)
{
    Commit();
    T result = func.Invoke();
    ResetTransactions();
    BeginTransaction();
    return result;
}

private void ResetTransactions()
{
    RelationalConnection relationalConnection = (RelationalConnection)DbContext.Database.GetService<IDbContextTransactionManager>();
    MethodInfo methodInfo = typeof(RelationalConnection).GetMethod("ClearTransactions", BindingFlags.Instance | BindingFlags.NonPublic);
    methodInfo.Invoke(relationalConnection, new object[0]);
}

It seems to work. However, I am not so sure about the practical value and all the assumptions this approach brings, in addition to call a private method via reflection.

dotnet / efcore

Allow DbConnection to be set/changed on an existing DbContext #8494