dotnet / SqlClient

Microsoft.Data.SqlClient provides database connectivity to SQL Server for .NET applications.
MIT License
846 stars 281 forks source link

SqlConnection.CreateCommand() - Auto-Enlist in Pending Local Transaction #28

Open bgribaudo opened 7 years ago

bgribaudo commented 7 years ago

When CreateCommand() is called on a SqlConnection that is in a transaction, the SqlCommand returned by that method is not associated with the connection's transaction. Instead, the command must be explicitly informed of the transaction or an exception similar to the following will be thrown when the command is executed:

MethodInvocationException: Exception calling "ExecuteReader" with "0" argument(s): "ExecuteReader requires the command to have a transaction when the connection assigned to the command is in a pending local transaction. The Transaction property of the command has not been initialized."

Cost of the Manual Wire-Up

Proposal

When CreateCommand() is called on a SqlConnection in a transaction, the SqlCommand it returns should be associated with the connection's transaction, eliminating the need for the explicit wire-up.

[Added: 2019-06-03]: Alternately (perhaps better yet), change command execution so that it automatically uses the transaction associated with the connection when it's executed. With this approach, any command (whether or not created by CreateCommand()) will execute against the proper transaction. Existing logic requires that the connection's transaction is used; the change proposed here is for the command to automatically use this transaction vs. the current behavior of the command throwing an exception when the transaction hasn't been explicitly referenced.

divega commented 5 years ago

As recently announced in the .NET Blog, focus on new SqlClient features an improvements is moving to the new Microsoft.Data.SqlClient package. For this reason, we are moving this issue to the new repo at https://github.com/dotnet/SqlClient. We will still use https://github.com/dotnet/corefx to track issues on other providers like System.Data.Odbc and System.Data.OleDB, and general ADO.NET and .NET data access issues.

bgribaudo commented 5 years ago

After a cursory check, it seems like the core essence of implementing the alternate proposal would be to change a couple lines in SqlCommand from checking whether the command's transaction is set when the connection's transaction is set to setting the command's transaction to match the connection's transaction, if the former is unset and the latter is set.

Change:

if (_activeConnection.HasLocalTransactionFromAPI && (null == _transaction))
  throw ADP.TransactionRequired(method); 

To something like:

if (_activeConnection.HasLocalTransactionFromAPI && (null == _transaction))
  _transaction = // appropriate method to get connection's transaction … maybe something based off of SqlInternalConnectionTds.SqlInternalTransaction
Wraith2 commented 5 years ago

The change is quite simple as you point out but it changes default behaviour which means it isn't backwards compatible. Given the wide usage of this library you can be certain that someone somewhere has (incorrectly) used the behaviour that would change as a certainty and would be broken by changing it.

That said, the move from System.Data.SqlClient to Microsoft.Data.SqlClient isn't drop-in compatible so it does provide an idea opportunity for such a change.

CodeAngry commented 4 years ago

[Added: 2019-06-03]: Alternately (perhaps better yet), change command execution so that it automatically uses the transaction associated with the connection when it's executed. With this approach, any command (whether or not created by CreateCommand()) will execute against the proper transaction. Existing logic requires that the connection's transaction is used; the change proposed here is for the command to automatically use this transaction vs. the current behavior of the command throwing an exception when the transaction hasn't been explicitly referenced.

This would be very useful just having SqlCommand use the SqlConnection's Transaction without any need for the developer to specify. Especially since you can't execute a transaction'less Command on a transaction'ed Connection. This should all be wired internally.

cheenamalhotra commented 4 years ago

cc @saurabh500

CodeAngry commented 4 years ago

This such basic and useful thing (and most likely very easy to implement for anyone slightly involved in developing the SqlClient) is pending since 2016. And now was again moved to a new milestone. Quite amazing!

Wraith2 commented 4 years ago

I see this sort of comment far too often. If it's so easy why don't you contribute a fix for it instead of complaining?

ErikEJ commented 4 years ago

@CodeAngry you should be grateful that it is planned for the next milestone, it could have been Future or "won't fix'

CodeAngry commented 4 years ago

@ericstj The problem is not whether it is or is not planned or if I should be grateful. I coded against the stream on this for years now always having to drag the Transaction around when create commands. The problem is the milestone can keep on moving. It just moved again. It can be 3 more years before anything happens on such small but very useful feature. It's always the Microsoft way. It's all perfect but there's always some quality of life thing missing that baffles anyone that has used MS APIs or Software long enough and also makes you write a lot of pointless extra code.

@Wraith2 If you see this sort of comment far too often... it should make you think Nobody likes to waste time writing comments on Github. I'll take a look at the code during this weekend. But as I said, if I was on that team, already knowing the code, this should be a very quick and logical fix (I would not call it improvement).

I'll also check back in a few years to see how this goes.

CodeAngry commented 4 years ago

I need to ask a stupid question. Looked at the SqlCommand code and as far as I can tell the .Transaction property (the private _transaction) is just a sanity check against the user. If the user does not set it, it throws if it does not match the Connection's Transaction state. But I don't see it meaningfully used anywhere in the SqlCommand. It's checked, it's cleared but not really used.

if (_activeConnection.HasLocalTransactionFromAPI && (null == _transaction))
    throw ADP.TransactionRequired(method);

or

if (null != _transaction && _activeConnection != _transaction.Connection)
    throw ADP.TransactionConnectionMismatch();

Some ways to change this:

Keep in mind this feature hardly affects regular SqlClient users. Library (reusable code) writers would benefit the most from this.

bgribaudo commented 4 years ago

Above, there's a mention of contributing a fix for this...would Microsoft be interested in a pull request that addresses this? In other words, is the general idea conceptually agreeable--just there's a need for a resource to implement it?

ErikEJ commented 4 years ago

@bgribaudo MS absolutely accepts pull requests, I have submitted and gotten accepted several to this repo

David-Engel commented 4 years ago

You have to consider backwards compatibility in proposed changes. If you make new behaviors the default, it is very likely that you will affect the behavior of existing applications. My suggestion is to think about ways to implement this while maintaining the existing behavior by default and have the new behavior opt-in somehow.

CodeAngry commented 4 years ago

If the code was changed so when the SqlCommand.Transaction is null, all checks are dropped and the Connection's Transaction is used, the only code that would fail are Unit Tests checking for this very behavior. As no user can be affected by this. Anyone who set the wrong SqlCommand.Transaction would have thrown so leaving it null will in no way collide with users setting the correct SqlCommand.Transaction value.

From what I can tell, Transaction set by the user is checked anyway to belong to the Connection and also the Connection is checked to see if the Transaction matches. So... why is it required to be specified by the user, I don't know? It's like the library saying: Tell me the transaction, so I can test you. I already know it and I'm still gonna use my own value, but I'll test you and throw if you fail. Funny thing is the library nullifies the internal transaction when it completes so it takes this responsibility away from the user.

This strange redundant design choice leads me to think it's either historical or it's used somewhere else. Because, since this value is apparently useless, what I'm worried is that there's some voodoo functionality that relies on this SqlCommand.Transaction value within the library and I don't see it. Like reflection for example. That's why I wrote the ways this could be fixed, thinking that if someone deeply involved in coding this library sees this, they can quickly know if this is viable or not and also could implement it easily.

I'm gonna spend the new few days changing the code and testing these solutions and seeing if it works. All I need to prove is this value is not used outside of double-checking the user. If it works, even if I can't push it here, at least I'll use it for myself as I write a lot of SqlClient wrapper Apis and I need to stop worrying about the Transaction.

CodeAngry commented 4 years ago

So I fixed this in 2 lines of code. And wrote a simple test with a transaction'less command and it works.

Added this property to SqlConnection:

// if true transaction check is skipped
public bool CommandAutoTransaction { get; set; }

And changed this line in SqlCommand.ValidateCommand from:

 if (_activeConnection.HasLocalTransactionFromAPI && (null == _transaction))

to

if (!Connection.CommandAutoTransaction -- this skips the Transaction check
    && _activeConnection.HasLocalTransactionFromAPI && (null == _transaction))

And my initial test worked. In my SqlConnection I set CommandAutoTransaction = true. I create a command, execute it in a transaction without ever assigning the transaction, the command works, can be rolled back, and also executed outside the transaction without ever touching the SqlCommand.Transaction property.

My test code:

var csb = new SqlConnectionStringBuilder();
csb.DataSource = "localhost";
csb.InitialCatalog = "Test";
csb.IntegratedSecurity = true;
csb.MultipleActiveResultSets = false;
csb.Pooling = false;

using var connection = new SqlConnection(csb.ToString());
connection.CommandAutoTransaction = true;
connection.Open();

using var dropper = connection.CreateCommand();
dropper.CommandText = "drop table if exists [dbo].[Tester];";
dropper.ExecuteNonQuery(); // drop test table

using var creater = connection.CreateCommand();
creater.CommandText = "create table [dbo].[Tester] ("
    + "[TesterId] bigint identity(1,1) not null"
    + ", primary key clustered ([TesterId] asc)"
    + ");";
creater.ExecuteNonQuery(); // create test table

// create commands outside the transaction
using var inserter = connection.CreateCommand();
inserter.CommandText = "insert into [dbo].[Tester] default values;";
using var counter = connection.CreateCommand();
counter.CommandText = "select count(*) from [dbo].[Tester];"; // query number of rows

// begin the transaction
using var transaction = connection.BeginTransaction(IsolationLevel.Serializable);
Assert.IsNull(inserter.Transaction); // ensure no transaction
Assert.AreEqual(connection, inserter.Connection); // validate connection
inserter.ExecuteNonQuery(); // insert once
Assert.AreEqual(1, (int)counter.ExecuteScalar()); // verify row count = 1
transaction.Rollback(); // rollback transaction

Assert.IsNull(inserter.Transaction); // useless but tested
Assert.AreEqual(0, (int)counter.ExecuteScalar()); // verify row count = 0
inserter.ExecuteNonQuery(); // insert again (outside transaction)
Assert.AreEqual(1, (int)counter.ExecuteScalar()); // check for 1 row

Is this too easy?! I'll run some more checks.

cheenamalhotra commented 4 years ago

Hi @CodeAngry

A small bit of info for first time contributors:

CodeAngry commented 4 years ago

Thanks a lot for this. I kept testing and got a more easy solution. I'll follow these steps during the weekend with a proper branch and the new solution.

@cheenamalhotra But, how does one pass all tests? I pass most tests except some strange failures like:

Test TestNullColumnEncryptionAlgorithm fails because expected message is this:

 string expectedMessage = "Internal error. Encryption algorithm cannot be null. Valid algorithms are: 'AES_256_CBC', 'AEAD_AES_256_CBC_HMAC_SHA256'.\r\nParameter name: encryptionAlgorithm";

while my actual exception text is:

 string expectedMessage = "Internal error. Encryption algorithm cannot be null. Valid algorithms are: 'AEAD_AES_256_CBC_HMAC_SHA256', 'AES_256_CBC'.\r\nParameter name: encryptionAlgorithm";

In my exception, these 2 are reversed: 'AEAD_AES_256_CBC_HMAC_SHA256', 'AES_256_CBC' from the expected. But they are properly ordered alphabetically while the expected seems to be ordered by length.

cheenamalhotra commented 4 years ago

This test case TestNullColumnEncryptionAlgorithm is little cracky so you can ignore if it fails.

I would recommend setting up environment first with 'dotnet/master' branch and then running tests over your branch. This way you'd know what tests are flaky and you shouldn't worry about.

Also remember to run tests with Admin Command Prompt to ensure AE tests can install certificates and not fail for you.

saurabh500 commented 4 years ago

@roji Have you seen similar ask in NpgSql driver?

@ajcvickers do you have any context on whether this kind of convenience API was ever discussed earlier?

roji commented 4 years ago

@saurabh500 and others, Npgsql behaves much like the proposed behavior. PostgreSQL only supports a single transaction on a given connection at a given time, and if a transaction has been started then commands must be executed within it. So with Npgsql you don't need to manually assign the Transaction property of DbCommand - commands are always implicitly executed within the transaction.

One potential drawback of doing this regards compatibility with other ADO drivers which do support multiple transactions on the same connection. In those providers it absolutely makes sense to require transactions to be explicitly set (and not setting them may mean to execute outside of any currently-active transaction). Therefore, to promote portability between providers, it may be better to leave the requirement even for providers where it isn't necessary (even though again, Npgsql already works this way). However, this doesn't seem like a very powerful argument (providers already contain various points of incompatibility which are more serious anyway).

If the primary issue here is passing the DbTransaction down the stack (as @bgribaudo wrote above), then an alternative solution is to use System.Transactions which removes this exact pain point - connections/commands enlist implicitly in the ambient TransactionScope. This is one reason why I don't really see much value in making this change in SqlClient.

If it is decided to make this change, then I'm not sure I see the point of making it opt-in with a new public surface flag (CommandAutoTransaction). Unless I've missed something this wouldn't be a breaking change as previously-throwing invocations (transaction in progress but not set on the command) would start working. So if this is desirable, I'd just make the change and be done with it.

PS FWIW I'd avoid using the term "enlistment" here, since that's usually used in the context of System.Transactions (i.e. TransactionScope), where here we're just talking about plain old DbTransaction.

saurabh500 commented 4 years ago

@CodeAngry start a PR. That is a better way to look at code and comment on it if you want comments on ongoing work. Mark it as WIP in case you need time to complete the work.

I haven't looked at the code but few items to considered:

saurabh500 commented 4 years ago

@roji, so the DbCommand.Transaction = set a transaction is no-op for npgsql ?

CodeAngry commented 4 years ago

I'm thinking about starting a new issue with this and have the first message clearly explain everything. @cheenamalhotra should I do that or just drop it all as a post in here when the change is ready? I made another variant last night that does not need the opt-in flag and can be/is fully stealthy. So a null SqlCommand.Transaction works just as well.

I still need to do a search for any reflection use anywhere just for my peace of mind. And pass the passable tests.

@saurabh500 @roji This change would not affect any other implementation for another DB. It's a SqlServer Client only quality of life change.

ajcvickers commented 4 years ago

@saurabh500 We have definitely talked about this experience being something that we want to improve, and we have discussed how the Npgsql approach seems to work well, at least for Postgres. Unfortunately, I don't remember if the discussion went beyond that.

I'm certainly in favor of making this better. Also, after reading @roji's post I'm not trying to remember all the reasons we recommend people not to use System.Transacations. This is where I really miss Diego!

CodeAngry commented 4 years ago

Finalized the changes. The branch is: https://github.com/CodeAngry/SqlClient/tree/fix-issue-28

Given that I'm new to contributing, I'll need to be advised on how to proceed.

Purpose

Remove any awareness or dependency for SqlCommand and SqlBulkCopy to the Transaction they will be used on.

Preamble

Before I started to code this I tested to confirm parallel transactions are not supported. And you can't begin transaction client-side twice on a connection before you end the previous transaction. This means each SqlConnection has one SqlTransaction. Each SqlCommand has one SqlConnection and one (redundant) SqlTransaction.

So each connection has a transaction, each command has a connection, so why does a command need to know its transaction?

Changes Description

SqlCommand and SqlBulkCopy no longer have transaction awareness and all the responsibility is on the connection. Even if this behavior is the same as before, the transaction awareness for these 2 classes was just used for validation. Now it no longer exists.

Moved the validation into assignments, constructors and execution calls. But if the user never touches the .Transaction, that's the best use case. If the user assigns the Transaction, now validation is slightly more aggressive as completed (zombied) transactions now also throw. So if you assign a transaction to a command after a final Commit or Rollback, now it throws ADP.TransactionZombied. This make it easy to notice if a user tries to use a transaction after it's done.

Tests pass but I could not test Azure side of tests. I really hope these changes get checked by the developer that knows this library. I'm talking about the dev that understands how all moving parts in this library work together and can easily figure out if these changes touch anything I did not notice or if my assumptions are correct.

The SqlCommand and SqlBulkCopy were the main classes I wanted to forget about the Transactions. But other classes might need some attention. Not thinking they were affected by the changes, but thinking they could benefit from the same treatment.

I'll detail the changes added here:

Library Changes

Tests Changes

Fixed some failing tests not affected by the changes.

And also modified all tests (about 3) that checked to ensure the behavior that was changed in this branch. So all tests testing assigning other connection's transaction and null transaction behavior no longer apply.

Still need to write some tests and publish them. I have written tests but I made a new project for them and use MSUnitTest in them. I'll add them to the current tests, or should I just add my test project so they are separate and easy to check upfront?

Backward compatibility

Users should not be affected by the changes (on sane coding practices). Reading the SqlCommand.Transaction works and writing it validates the input. Even if a user should not touch the command's transaction anymore.

The only difference in backward compatibility is this (the assert tests say the story):

[TestMethod]
public void FailZombieTransactionSetterTest()
{
    using var connection = new SqlConnection("Data Source=localhost;"
        +"Integrated Security=true;");
    connection.Open();
    Assert.IsNull(connection.CurrentTransaction);

    using var command = connection.CreateCommand();
    command.CommandText = "select getutcdate();";
    Assert.IsNull(command.Transaction);

    using var transaction = connection.BeginTransaction();
    Assert.IsNotNull(command.Transaction);
    command.ExecuteNonQuery();
    Assert.IsNotNull(connection.CurrentTransaction);
    transaction.Rollback();

    // this should not be allowed and now it's not allowed
    Assert.ThrowsException<InvalidOperationException>(delegate
    {
        command.Transaction = transaction; // zombie assignment
    });

    Assert.IsNull(transaction.Connection);
    Assert.IsNull(connection.CurrentTransaction);
    Assert.IsNull(command.Transaction);
}

Conclusion

I don't think more changes are needed to the code. Someone should still double-check my validation patterns and make sure they are consistent with overall practices.

I also have a bunch of comments on the library.

And, finally, even if these changes do not get published, I'm using them from now on. :) My work when I write SqlClient backend apis (without EF I write mostly sproc-only) is much easier not having to drag the transaction around.

saurabh500 commented 4 years ago

@roji and @ajcvickers Thanks for the inputs.

@CodeAngry

, finally, even if these changes do not get published, I'm using them from now on. :) My work when I write SqlClient backend apis (without EF I write mostly sproc-only) is much easier not having to drag the transaction around.

Do you intend to send a Pull Request? Let use know if you need any help with it.

saurabh500 commented 4 years ago

Also, if you are modifying both SqlCommand and SqlBulkCopy, I recommend sending two different PRs so that the changes are easier to review.

CodeAngry commented 4 years ago

@saurabh500 I'll send it tomorrow. I'll also get some tests ready. But I have all the changes in one single branch. Not sure how I can PR partially. Changes are quite small in functions touched. You can see it all here and advise me on how to proceed:

https://github.com/CodeAngry/SqlClient/tree/fix-issue-28 Line counts are 214 additions and 156 deletions. It's not much.

So there's changes on:

Problem is it's all quite linked so it's better to go as a whole. One other question is: should I add more comments in the code? I tried to explain things while not writing novels. I'll comment much more in the tests, anyway.

roji commented 4 years ago

@roji, so the DbCommand.Transaction = set a transaction is no-op for npgsql ?

Yeah, it is.

cheenamalhotra commented 4 years ago

Some tests changed to fit the changes

@CodeAngry Would that mean that this is bringing breaking changes for existing applications or the tests would also pass without any test source modifications?

CodeAngry commented 4 years ago

@cheenamalhotra So, let's take this test:

using var connection = new SqlConnection();
connection.Open();
using var command = connection.CreateCommand();
command.CommandText = "select getutcdate();";
using var transaction = connection.BeginTransaction();
// Command.Transaction = Transaction; // missing assignment is fail condition
Assert.Throws(()=> command.ExecuteNonQuery()); // throws as command lacks transaction

^ This test checks that a command with no transaction belonging to a connection with an open transaction fails when executed. Such test existed. This test no longer applies because it can no longer fail. SqlCommand lost transaction awareness and it's all on the connection. So a couple tests testing for the very behavior can no longer work. Same goes for similar SqlBulkCopy tests.

I changed all of them and added a whole bunch of new tests. I'm in the process of porting them from my quick MSUnit project to the internal xunit one and from my C# 8.0 syntax to 7.3 and using() { blocks }. When I'm done I'm pushing it all. Most likely, early tomorrow as I'm EU. Then you can look over and judge/test everything.

@roji, so the DbCommand.Transaction = set a transaction is no-op for npgsql ?

Yeah, it is.

^ This is nice!

saurabh500 commented 4 years ago

@CodeAngry Why should the test change? (unless the only test that is changing is that Command Execution succeeds without any additional transaction assignment to the command)

What should change is that if the transaction is not explicitly assigned, then the command inherits from the connection.

Also I see new exceptions being surfaced and some exceptions being removed from SqlBulkCopy constructor. That is another concern.

I don't think there should be any additional changes apart from the fact that it is more convenient to use SqlCommand and SqlBulkCopy without any additional transaction assignments. Am I missing something?

saurabh500 commented 4 years ago

Also can we make this change just about changing SqlCommand to be more convenient to use. I see new surface area being exposed like public SqlTransaction CurrentTransaction. Why ? What are you going to do with this transaction. There is too much going on in your changes, which is going to introduce surface area, which I am not sure if is going to be used.

CodeAngry commented 4 years ago

@saurabh500 You can see the test changes right now in the commit: https://github.com/CodeAngry/SqlClient/compare/fix-issue-28

Only changes are in the following tests:

Maybe I did not express myself clearly. No test is removed. Some lines from them are changed or removed. You can see the diffs. No test removed, just a few lines dropped or edited as they didn't apply anymore.

roji commented 4 years ago

@CodeAngry in general it is better to submit a pull request rather than pointing at a commit - github has great code review functionality on PRs. I also agree with @saurabh500 that changes should be very minimal and one step at a time, i.e. not introduce CurrentTransaction since that isn't directly related to what's being discussed here.

cheenamalhotra commented 4 years ago

@roji I created PR #330 from @CodeAngry 's active branch for issue, for better reviewing.

jhudsoncedaron commented 4 years ago

@Wraith2 : Because as you can see by now PRs are usually not taken.

Wraith2 commented 4 years ago

The majority of my PR's have been merged and where they haven't I understand why. I would like it to take less time to get them merged but they get there. Where is the PR that covers this issue? Perhaps it just needs clarification on why it isn't being accepted.

jhudsoncedaron commented 4 years ago

@Wraith2 : Because it's a breaking change. I'm just waiting for my team to say "enough" and start forking for things that are clearly broken.

CodeAngry commented 4 years ago

Anyone who wants a fix for this is to never use MARS, SqlTransaction, Scoped Transaction and always use SET TRANSACTION ISOLATION LEVEL + BEGIN TRANSACTION + COMMIT/ROLLBACK directly in the SqlConnection. This helps you never have to worry about dragging this useless SqlCommand.Transaction variable around when creating SqlCommands.

If you want just create a IDisposable class that implements the above steps. Makes life much easier since Microsoft doesn't want to, or might do it in 2 decades give or take!

jhudsoncedaron commented 4 years ago

@CodeAngry : Beware; this will haunt you unless you disable connection pooling: https://github.com/dotnet/SqlClient/issues/84

CodeAngry commented 4 years ago

@jhudsoncedaron I always do. MARS too. Thanks!

And... look, a 2016 bug still OPEN. I'm dying here with laughter. Literally measuring time in years/half decades for library bug fixes. It must be nice working at Microsoft. At my work, I actually need to get things done within REASONABLE timeframes.

jhudsoncedaron commented 4 years ago

That bug is actually two decades old. Framework has another viable workaround for it though. But the workaround doesn't seem to work in Core.

roji commented 3 years ago

One important note about the interaction between transactions and MARS. I'm no expert, but setting SqlCommand.Transaction is an explicit opt into allowing two commands to participate in the same transaction when MARS is enabled; it accepts that a rollback may undo both commands. Compare the situation with savepoints - where there is no possible opt-in - described in this article.

I haven't fully explored this so I'm not saying anything definitive; but before removing the requirement to set SqlCommand.Transaction, it's a good idea to get a full understanding of the situation.