Microsoft.Data.SqlClient provides database connectivity to SQL Server for .NET applications.

SqlClient: Implement optimized version of the new ADO.NET batching API #19

Closed GSPP closed 10 months ago

GSPP commented 6 years ago

Edited by @divega on 2/6/2019

We want to add public and provider independent batching APIs to the base classes in ADO.NET. This has been described in detail at https://github.com/dotnet/corefx/issues/35135.

We plan for the batching API to include a "fallback implementation" that should work for most existing providers that simply support the DbCommand functionality. But the fallback implementation will be sub-optimal (it will just concatenate the contents of each command's CommandText into a single string and execute it).

From now on, we are going to use this customer-reported issue to track implementing a SQL Server-optimized version of the batching APIs.

As this issue originally described, SqlClient already contains the general functionality necessary in System.Data.SqlClient.SqlCommandSet, but there is no public API to use it directly, and the only way to leverage it is through DataSet and SqlDataAdapter.

Original issue contents

Command sets are a SQL Server / TDS facility that lets database clients send multiple batches in one round trip, which can be very beneficial for performance. This facility is currently only used by the SqlDataAdapter infrastructure, which is no longer used by modern apps and exposes only very limited functionality (not arbitrary commands).

Command sets can help with this:

  1. Send multiple batches in one round trip
  2. Receive results independently
  3. Much enhanced performance for executing many small commands
  4. A clean API

SqlCommandSet should be cleaned up and made public. There is great community interest and value in doing this (link to a well-known post by Ayende that includes benchmarks, a second post, and Ayende's hack to pry the class open).
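
For illustration, here is a minimal sketch of what such a public API could look like from user code. This uses the DbBatch/SqlBatch shape the proposal linked above converged on; the connection string and Orders table are hypothetical.

    using Microsoft.Data.SqlClient;

    var connectionString = "Server=localhost;Integrated Security=true;TrustServerCertificate=true";
    using var connection = new SqlConnection(connectionString);
    connection.Open();

    // One round trip, several independent commands, independently readable results.
    using var batch = new SqlBatch(connection);
    batch.BatchCommands.Add(new SqlBatchCommand("UPDATE Orders SET Shipped = 1 WHERE Id = 1"));
    batch.BatchCommands.Add(new SqlBatchCommand("UPDATE Orders SET Shipped = 1 WHERE Id = 2"));
    batch.BatchCommands.Add(new SqlBatchCommand("SELECT COUNT(*) FROM Orders WHERE Shipped = 1"));

    using var reader = batch.ExecuteReader();
    // reader.NextResult() moves between the result sets of the individual commands.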


Currently, people often concatenate their SQL commands into one batch (see the sketch after the list below). This is problematic:

  1. It can defeat plan caching
  2. It's slower
  3. It's tedious and error prone: parameter names must be unique across the whole batch, errors non-deterministically abort the batch or don't, and you can't easily tell which command failed
  4. It is subject to a total limit of 2100 parameters (added by @roji, 2020-06-22)
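
A hedged sketch of that hand-rolled concatenation, to make points 3 and 4 concrete (the Items table and items array are hypothetical):

    using System.Text;
    using Microsoft.Data.SqlClient;

    var items = new[] { (Id: 1, Name: "a"), (Id: 2, Name: "b") };
    var connectionString = "Server=localhost;Integrated Security=true;TrustServerCertificate=true";
    using var connection = new SqlConnection(connectionString);
    connection.Open();

    var sql = new StringBuilder();
    using var cmd = connection.CreateCommand();
    for (var i = 0; i < items.Length; i++)
    {
        // Every parameter needs a batch-unique name (point 3)...
        sql.Append($"UPDATE Items SET Name = @name{i} WHERE Id = @id{i};");
        cmd.Parameters.AddWithValue($"@name{i}", items[i].Name);
        cmd.Parameters.AddWithValue($"@id{i}", items[i].Id);
        // ...and the whole batch shares the 2100-parameter limit (point 4).
    }
    cmd.CommandText = sql.ToString();
    cmd.ExecuteNonQuery(); // if this throws, which statement failed? (point 3 again)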

I believe that command sets have great value. It looks like cleaning them up would not be too hard since they were written in a style that looks like they were meant to be exposed.

I also believe that the .NET community is not sufficiently aware that there is this super fast and clean way of submitting many small statements. Many applications and scenarios could benefit from this. When this is made public there should be a blog post on a Microsoft blog about it. It really is a secret ninja trick at this point. "Do this one special trick and get much better performance... DB consultants hate it."

Entity Framework could send updates like that. According to measurements by Ayende (and my own), this would be a very nice performance boost. I believe EF sometimes sends multiple sub-queries for one query in the case of includes. This could make use of command sets as well, cutting one round trip per query.

In the age of the cloud, network latencies are typically higher, so it becomes especially valuable to be able to cut out round trips.

divega commented 6 years ago

@GSPP thanks for the detailed post. FYI, https://github.com/dotnet/corefx/issues/3688 is about adding command batching APIs in the ADO.NET provider model. I see this issue as a good complement for SQL Server support of whatever API we come up with.

Another relevant article about how command batching works in SqlClient: https://blogs.msdn.microsoft.com/dataaccess/2005/05/19/does-ado-net-update-batching-really-do-something/.

GSPP commented 5 years ago

I'm pulling over some thoughts from another issue:

Regarding error handling: it must be possible for providers to keep executing after an error and to keep incurring errors. SQL Server will happily continue in case of many errors. All those errors must be reported to the API caller somehow, and it must be possible to tell which command incurred which errors (multiple per command are possible). SqlException could be extended with new information.

SQL Server commands can also generate info messages. These must be surfaced as well. The existing SqlConnection.InfoMessage event should handle that, but this should be tested to confirm it works.
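
For reference, a minimal sketch of wiring up that existing event (nothing here is new API; connection string is a placeholder):

    using System;
    using Microsoft.Data.SqlClient;

    var connectionString = "Server=localhost;Integrated Security=true;TrustServerCertificate=true";
    using var connection = new SqlConnection(connectionString);
    connection.InfoMessage += (sender, e) =>
    {
        // Fires for low-severity messages, e.g. PRINT or RAISERROR with severity <= 10.
        foreach (SqlError error in e.Errors)
            Console.WriteLine($"Info {error.Number}: {error.Message}");
    };
    connection.Open();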

SQL Server can respond to errors in various ways. It can continue executing the statement, abort the statement but continue the batch, abort the batch, or even kill the connection. These must be accounted for and tested. I'm not saying there's a specific problem here; I just want to raise awareness of these different cases so that deliberate choices can be made about them.

roji commented 5 years ago

@GSPP it's great to have the various possibilities documented. Just to say that it's not obvious to me that the (default) behavior should be to continue the batch - I'm personally more comfortable with aborting it. But that's indeed a discussion and it may even make sense to allow the user to make a decision here (by exposing some enum on SqlCommandBatch).

GSPP commented 5 years ago

For what it's worth, I think the SQL Server error model is an atrocity. The fact that SQL Server tends to keep running the batch while aborting the transaction means that subsequent statements execute outside of any transaction, causing random damage. But other things can happen as well because, as stated, sometimes the batch is aborted. Sometimes the transaction stays open, sometimes not. It's very random.

The model is horribly broken in many ways. It seems best to keep command sets as consistent with the single-command model as possible. That's the only sanity we can hope to maintain.

roji commented 5 years ago

The fact that SQL Server tends to keep running the batch while aborting the transaction means that subsequent statements execute outside of any transaction, causing random damage.

That's... nuts...... This will all definitely need to be considered carefully when implementing for SqlClient. By the way, is this what currently happens when batching is invoked via the internal SqlCommandSet, i.e. through SqlDataAdapter?

GSPP commented 5 years ago

I don't know what happens there. My wishlist for this feature includes having these cases tested and documented. When we release the first version of the API, any unintended consequences are locked in. There is potential here to lock in really nasty and destabilizing behavior.

My hunch is that it is best to expose exactly the behavior that SQL Server naturally gives. Any artificial corrections are not going to help much and will just eliminate possible use cases and optimizations. In my view this is a power-user API that is supposed to expose maximum performance and flexibility.

Regarding random damage: SQL Server introduced the XACT_ABORT setting, which reliably aborts both the batch and the transaction. It is off by default, but it's a great way to obtain a reliable failure mode. There is also TRY-CATCH in T-SQL. I'm sure the team understands that the existing model is bad, but of course it is locked in for all time for compatibility.
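
Opting into that failure mode from client code is a one-liner; a sketch, assuming an already-open connection:

    using Microsoft.Data.SqlClient;

    // With XACT_ABORT ON, most run-time errors abort both the batch and the transaction.
    using var cmd = new SqlCommand("SET XACT_ABORT ON;", connection); // connection: an open SqlConnection
    cmd.ExecuteNonQuery();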

I personally avoid transaction and error handling in T-SQL whenever possible; C# is perfectly equipped to do that. But some people are forced to write stored procedures, and they have no choice but to include a lot of boilerplate code in every procedure to obtain sane error semantics. It really is a lot of code duplicated each time.

roji commented 5 years ago

At the point where the new ADO.NET batching API is merged and confirmed, and there's question of implementing it in SqlClient, these questions would have to be revisited in detail...

My hunch is that it is best to expose exactly the behavior that SQL Server naturally gives. Any artificial corrections are not going to help much and will just eliminate possible use cases and optimizations. In my view this is a power-user API that is supposed to expose maximum performance and flexibility.

I have no idea what is natural here. One thing I do believe is that it makes no sense to implement batching in a way that is inherently unreliable (from a transaction and/or error-handling point of view). We also have two batching options that are already implemented by SqlClient today: the concatenation-based one in DbCommand, and the internal SqlCommandSet exposed via SqlDataAdapter. The investigation of possible behaviors and guarantees for the new API should probably start there.

GSPP commented 5 years ago

By "naturally" I mean this new API should expose whatever the TDS protocol feature for command sets delivers. Likely, we have no choice over the behavior anyway.

Concatenation-based approaches are impossible to make reliable in the general case (for example, CREATE VIEW must be the only statement in its batch). This option can be discarded immediately; there is little value in it because client code can already do that.
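
A small illustration of that restriction (hypothetical view name; the server rejects the concatenated batch outright):

    using Microsoft.Data.SqlClient;

    // Fails with: "'CREATE VIEW' must be the first statement in a query batch."
    using var cmd = new SqlCommand(
        "SELECT 1; CREATE VIEW dbo.V1 AS SELECT 1 AS X;",
        connection); // connection: an open SqlConnection
    cmd.ExecuteNonQuery();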

divega commented 5 years ago

As recently announced on the .NET Blog, focus on new SqlClient features and improvements is moving to the new Microsoft.Data.SqlClient package. For this reason, we are moving this issue to the new repo at https://github.com/dotnet/SqlClient. We will still use https://github.com/dotnet/corefx to track issues on other providers like System.Data.Odbc and System.Data.OleDb, and general ADO.NET and .NET data access issues.

bgribaudo commented 5 years ago

@roji [from https://github.com/dotnet/corefx/issues/35135#issuecomment-493700728]

It is indeed the client's responsibility to drain and process responses coming from the server. This is similar to the PostgreSQL case: when responses aren't drained quickly enough the TCP buffer gets full and the server simply blocks. However, this doesn't really seem specific to batching in any way (can easily occur with a single command where row processing is very slow) - or am I mistaken?

I believe you are correct--I'm unaware of any way in which this would be batching specific.

roji commented 5 years ago

Answering @Wraith2's comment in https://github.com/dotnet/corefx/issues/35135#issuecomment-519721600.

Sounds like things are somehow being built not against the newest versions? If you want to point me to a PR/branch I can try to take a look.

In any case, we know that the new batching stuff won't go into .NET Core 3.0 or .NET Standard 2.1. That means it may be OK to just wait for MS.Data.SqlClient to go open-source and work on setting up a benchmarking suite there - I don't think it makes much sense to do this in corefx, where System.Data.SqlClient is getting pretty much frozen. It will also probably be easier to work in that repo rather than within corefx, build-wise.

Wraith2 commented 5 years ago

From a clean corefx master, have a corefx command prompt open at the repo root and create a new branch. Run build src\System.Data.SqlClient and build src\System.Data.Common; they will succeed. If they don't, there are other problems.

Open System.Data.Common project and add in this code to the ref assembly:

    public abstract class DbBatch : IDisposable, IAsyncDisposable
    {
        public void Dispose() { throw null; }
        protected virtual void Dispose(bool disposing) { throw null; }
        public virtual Threading.Tasks.ValueTask DisposeAsync() { throw null; }
    }

Then at the command prompt run build src\System.Data.Common again and it will succeed. You'll get ApiCompat errors, but those are just because we didn't add an implementation; they are not compile errors. The compile has succeeded for everything, so it's all good to continue development at that point.

Open SqlClient project, add this code into the ref assembly:

    namespace System.Data.Common
    {
        public abstract class DbBatch : IDisposable, IAsyncDisposable
        {
            public void Dispose() { throw null; }
            protected virtual void Dispose(bool disposing) { throw null; }
            public virtual Threading.Tasks.ValueTask DisposeAsync() { throw null; }
        }
    }

Then at the command prompt run build src\System.Data.SqlClient again, and see the following failures:

    System.Data.SqlClient.cs(246,50): error CS0246: The type or namespace name 'IAsyncDisposable' could not be found (are you missing a using directive or an assembly reference?) [E:\Programming\csharp7\corefx\src\System.Data.SqlClient\ref\System.Data.SqlClient.csproj]
    System.Data.SqlClient.cs(250,40): error CS0234: The type or namespace name 'ValueTask' does not exist in the namespace 'System.Threading.Tasks' (are you missing an assembly reference?) [E:\Programming\csharp7\corefx\src\System.Data.SqlClient\ref\System.Data.SqlClient.csproj]

ValueTask and IAsyncDisposable are both part of core, they're surfaced through System.Runtime type forwards. Both ref projects contain project references to the same project in the src tree for this, and yet one works and the other does not. If you check the .csproj files for both solutions they're almost the same except sqlclient has netfx and netstandard targets, neither of which is the default in the build command (netcoreapp is).

/pokes @ViktorHofer because you usually help me work out how to build things and @jnm2 because we were discussing this in gitter.

divega commented 5 years ago

@Wraith2, just a shot in the dark, but have you tried to add the DbBatch class in System.Data.SqlClient.NetCoreApp.cs as opposed to System.Data.SqlClient.cs? Looking at ref/System.Data.SqlClient.csproj, I believe that System.Data.SqlClient.cs is probably compiled for all targets and in some of them those types won't exist.

ViktorHofer commented 5 years ago

I would try @divega's suggestion and if that doesn't work I will take a closer look.

Wraith2 commented 5 years ago

The prototype I have in corefx requires the RPC changes that had already been merged there. In this version of the library those changes involve Always Encrypted, which means it isn't a simple port and will require some careful review and testing. The changes are made and in PR https://github.com/dotnet/SqlClient/pull/209. Once those changes are integrated I can bring over the batching API.

Wraith2 commented 4 years ago

The RPC changes I needed for this are in the netcore version of this library now. I still have my old corefx branch with the original version of the feature and I can port it to this library now. There are a couple of questions.

roji commented 4 years ago

@Wraith2 thanks for this update, I'm very happy to hear this is still on your mind. I'm admittedly taken up by other things, but pushing the batching API through is definitely one of my goals this year.

Just to be sure I understand - your RPC changes are what's required under the hood, but the batching surface API (i.e. DbBatch) isn't yet exposed in any way, right? Would you estimate this would be a large effort given that the RPC changes are there?

does Always Encrypted need to be fully supported in the batching api? if so does this require new or SqlClient specific surface area?

I think you're better-placed than I to answer that :) Are you asking because the two features conflict in any way? Why do you think new surface may be required?

does the feature need to be added to both netfx and netcore versions? I've got a netcore version but we haven't progressed with the merging of codebases and getting those changes ported to netfx might be tricky.

That again is more a question for @David-Engel and @cheenamalhotra. From the ADO.NET perspective, the batching API is definitely not going to go into .NET Framework (in general no new BCL APIs will be introduced). It's still possible to introduce the API into the SqlClient netfx version, but I'm not sure it's extremely valuable (unless you see this as an easy and relatively risk-free proposition). Perf-minded people (and adopters of new APIs) in general should definitely be on core, so I'd imagine the value here would be more having general parity between the two versions of SqlClient rather than specific user need for this.

do you want this for the 2.0 timeframe or is ok for a later version, it introduces new api surface area but doesn't break any previous behaviours so it can be a 2.x or later I think.

I think post-2.0 should be fine - especially if you're convinced there's not going to be any breakage. Realistically I'm not going to get around to focusing on this in the next month (possibly two), and the main deadline from my perspective is .NET 5.0.

Wraith2 commented 4 years ago

When I wrote the batching implementation in System.Data.SqlClient, it took advantage of some improvements I'd made to the low-level RPC mechanism inside the library, which made it a lot easier. This repository was forked before my changes were merged, so I needed to port over those RPC changes before I could port the batch feature. The netfx version of M.D.SqlClient won't have those changes, which would make porting tricky; not impossible, but more difficult.

The Always Encrypted feature is implemented largely at the RPC level, so it adds a layer of complication to my existing code and presents testing and validation problems because I don't have an AE environment set up. I rely on the CI to validate those behaviours, and that isn't suitable as an interactive development environment; if AE is needed, I'm going to need help from the MS team. It will also introduce some extended properties beyond the shared API, because it'll require setting AE keys etc. on the batches. I don't see any technical problem with it, just resourcing concerns.

your RPC changes are what's required under the hood, but the batching surface API (i.e. DbBatch) isn't yet exposed in any way, right? Would you estimate this would be a large effort given that the RPC changes are there?

Correct. In terms of effort it's a few hours worth of porting from the corefx branch to get the core feature in place. Extra time will be needed on top for AE. My tests are a bit basic so we could expand on those.

We could put the DbBatch and other types into this library and ifdef them depending on the framework target: include our local versions for 3.1 and exclude them for 5.0, where they will be in framework System.Data.Common. This would allow us to ship the feature from this repo before 5.0 is ready and allow users to try the API.
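
A rough sketch of that ifdef approach (the symbol name is illustrative; the real targets and symbols would follow the project's conventions):

    // Ship local copies of the base types only where the runtime doesn't provide them.
    #if !NET5_0_OR_GREATER
    namespace System.Data.Common
    {
        public abstract class DbBatch : System.IDisposable
        {
            public abstract void Dispose();
            // ... the rest of the proposed DbBatch surface
        }
    }
    #endif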

roji commented 4 years ago

Thanks for the explanations @Wraith2, great to have you on this. I don't think it's terrible if SqlClient initially supports batching only without AE, although it would limit some usage possibilities: for example, the EF Core SQL Server provider wouldn't be able to use the batching API, since it wouldn't always work. Regarding the technical/resourcing side of this, I really don't have much to contribute - it's up to you and the SqlClient people.

We could put the DbBatch and other types into this library and ifdef them depending on the framework target: include our local versions for 3.1 and exclude them for 5.0, where they will be in framework System.Data.Common. This would allow us to ship the feature from this repo before 5.0 is ready and allow users to try the API.

That's certainly possible - I'm definitely intending to expose DbBatch and friends from Npgsql for users targeting pre-5.0 TFMs as well, SqlClient could very well do the same. One advantage of this is to actually implement the API and run into any problems early, hopefully leaving us enough time to tweak the specs before they get locked down for 5.0. The only thing to be careful with is that if the specs do change, that may be a breaking change for users doing this - but I think it really shouldn't be a huge issue (it could be defined as a preview API, and possibly even exposed only in preview versions; to be seen with @David-Engel and @cheenamalhotra obviously).

One thing that would be great to see, though, is some perf numbers on how the API fares and to what extent it improves SqlClient performance (compared to "semicolon batching" in the same command). @bgrainger has done this work for MySqlConnector as you know (https://github.com/mysql-net/MySqlConnector/issues/650), it would be really great to see those numbers for SqlClient as well. I'm hoping it wouldn't be much work to adapt Bradley's code to SqlClient, I'd obviously be happy to help as well if needed.

bgrainger commented 4 years ago

The simple read-only DbBatch benchmarking code for MySQL/MariaDB is at https://gist.github.com/bgrainger/0504a52065eeade9e4b30e88b6363f2c.
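
Adapting that to SqlClient should be straightforward; a hedged sketch using BenchmarkDotNet (connection string and batch size are placeholders, and SqlBatch/SqlBatchCommand are the proposed surface, not shipped API):

    using System.Linq;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;
    using Microsoft.Data.SqlClient;

    public class BatchVsConcatenation
    {
        private SqlConnection _connection = null!;

        [GlobalSetup]
        public void Setup()
        {
            _connection = new SqlConnection("Server=localhost;Integrated Security=true;TrustServerCertificate=true");
            _connection.Open();
        }

        [Benchmark(Baseline = true)] // "semicolon batching": one concatenated command
        public void ConcatenatedCommand()
        {
            using var cmd = new SqlCommand(string.Concat(Enumerable.Repeat("SELECT 1;", 20)), _connection);
            using var reader = cmd.ExecuteReader();
            do { while (reader.Read()) { } } while (reader.NextResult());
        }

        [Benchmark] // proposed API: 20 commands, one round trip
        public void Batch()
        {
            using var batch = new SqlBatch(_connection);
            for (var i = 0; i < 20; i++)
                batch.BatchCommands.Add(new SqlBatchCommand("SELECT 1"));
            using var reader = batch.ExecuteReader();
            do { while (reader.Read()) { } } while (reader.NextResult());
        }

        [GlobalCleanup]
        public void Cleanup() => _connection.Dispose();
    }

    public static class Program
    {
        public static void Main() => BenchmarkRunner.Run<BatchVsConcatenation>();
    }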

Wraith2 commented 4 years ago

I've got the code together and tests passing, but I'm fighting a losing battle against the build system, and as long as I have errors I can't get a NuGet package to perf test with. Can I get some help figuring out what's going wrong, @cheenamalhotra or anyone else? The branch is at https://github.com/Wraith2/SqlClient/tree/api-batching and from the netstandard 2.0 build (I think) I'm getting stuff like this:

    Common\Data\Common\DbBatchCommand.cs(8,27): error CS0260: Missing partial modifier on declaration of type 'DbBatchCommand'; another partial declaration of this type exists [E:\Programming\csharp7\SqlClient\src\Microsoft.Data.SqlClient\netcore\src\Microsoft.Data.SqlClient.csproj]
    Common\Data\Common\DbBatch.cs(15,25): error CS0260: Missing partial modifier on declaration of type 'DbBatchCommandList'; another partial declaration of this type exists [E:\Programming\csharp7\SqlClient\src\Microsoft.Data.SqlClient\netcore\src\Microsoft.Data.SqlClient.csproj]
    E:\Programming\csharp7\SqlClient\obj\Release.AnyCPU\Microsoft.Data.SqlClient\netcore\netstandard2.0\Microsoft.Data.SqlClient.notsupported.cs(1161,32): error CS0102: The type 'DbBatchCommand' already contains a definition for 'CommandText' [E:\Programming\csharp7\SqlClient\src\Microsoft.Data.SqlClient\netcore\src\Microsoft.Data.SqlClient.csproj]

which says there are two definitions of DbBatchCommand (there aren't) and references a file, Microsoft.Data.SqlClient\netcore\netstandard2.0\Microsoft.Data.SqlClient.notsupported.cs, which isn't real; I can't see how it's being generated if it is. Compiling in Visual Studio works fine and shows no errors for any config. I've got various conditions in the project files to stub in the provider base definitions for <net5.0 because they're needed as base classes.

Wraith2 commented 4 years ago
| Method | Mean | Error | StdDev |
| --- | --- | --- | --- |
| Command | 106.08 us | 1.507 us | 1.409 us |
| Commands | 201.79 us | 2.500 us | 2.216 us |
| PreparedCommand | 92.99 us | 0.452 us | 0.401 us |
| PreparedCommands | 106.42 us | 1.364 us | 1.276 us |
| BatchCommand | 112.90 us | 1.907 us | 1.783 us |
| BatchCommands | 137.51 us | 1.737 us | 2.197 us |
| PreparedBatchCommand | 110.35 us | 1.707 us | 1.597 us |
| PreparedBatchCommands | 137.70 us | 1.752 us | 1.554 us |
Wraith2 commented 4 years ago

Those numbers are from the tests @bgrainger provided. For SqlClient, prepare isn't implemented (it could be, but it would only help with homogeneous commands), and it's only comparing 2 commands against a low-latency local server. I've removed the prepare tests and upped the number of queries in the batch to 20 so you can see the perf differences more clearly.

| Method | Mean | Error | StdDev |
| --- | --- | --- | --- |
| Command | 217.7 us | 3.20 us | 3.00 us |
| Commands | 1,933.6 us | 12.40 us | 11.60 us |
| BatchCommand | 227.4 us | 2.10 us | 1.86 us |
| BatchCommands | 685.9 us | 7.79 us | 6.08 us |

So the important figures are Commands and BatchCommands, which take a list of 20 statements and execute them serially, counting the results. Batching is roughly 2.8 times faster even at this low repetition count.

These tests don't take latency into account, and as I mentioned, this is all done against a local instance. With a remote server, or worse an Azure server, we should see that gulf open even wider.

cheenamalhotra commented 4 years ago

@Wraith2

Can you paste SQL Server Profiler traces here, so we can see how commands are sent to the server?

roji commented 4 years ago

@Wraith2 if I'm reading this correctly, batching using the new API (BatchCommands) is 3 times slower than batching using today's method, i.e. concatenation (Command)? Can you confirm that the two correspond to the techniques described in this article?

cheenamalhotra commented 4 years ago

To answer your earlier questions, as discussed internally:

does Always Encrypted need to be fully supported in the batching api? if so does this require new or SqlClient specific surface area?

Yes. AE and Enclaves would need to be supported now or later, since it's one of the most important features for stakeholders. I'm not sure I understand "surface area"? The driver should be able to work with encrypted/non-encrypted columns seamlessly, as it does for regular commands. New APIs only for encrypted columns should be avoided, and the user experience for encryption-enabled columns should be similar to the rest of the driver.

does the feature need to be added to both netfx and netcore versions? I've got a netcore version but we haven't progressed with the merging of codebases and getting those changes ported to netfx might be tricky.

Yes. We want to reduce effort later on maintaining and porting over changes to netfx. Our goal is to continue providing new features/fixes in both NetFx and NetCore where they apply, to unify the code-bases, and to avoid API differences between the two drivers as much as possible. However, it can be done after the design in netcore is finalized and approved.

cheenamalhotra commented 4 years ago

For what it's worth, I think the SQL Server error model is an atrocity. The fact that SQL Server tends to keep running the batch while aborting the transaction means that subsequent statements execute outside of any transaction, causing random damage. But other things can happen as well because, as stated, sometimes the batch is aborted. Sometimes the transaction stays open, sometimes not. It's very random.

The model is horribly broken in many ways. It seems best to keep command sets as consistent with the single-command model as possible. That's the only sanity we can hope to maintain.

@GSPP Maybe an example would help determine where this is observed, because working with transactions and exceptions is definitely tricky and is going to be one of the key things when batching is implemented.

Wraith2 commented 4 years ago

Can you paste SQL Server Profiler traces here, so we can see how commands are sent to the server?

Basic traces show it's standard mode vs exec sp_execute, but that's down to how the existing driver executes batching, not any choice of mine. Batching was already present in limited form; I just surfaced it through the new public surface and tweaked a few things. traces.zip If you want specific trace flags or events, just let me know which.

AE support is entirely possible, but it comes with performance limitations. AE does a server round trip to get the encrypted parameter descriptions for each statement you want to execute, so if you want to execute a batch you'll have to get a description for each statement in it, which at best doubles round-trip time. If we put in some logic to detect homogeneous batch commands, e.g. "INSERT INTO table(value) VALUES (@value)" repeated 20 times with different parameters, we could do a single describe round trip and use the encryption data for the whole batch. That's all SqlClient-internal detail, though.

Wraith2 commented 4 years ago

does the feature need to be added to both netfx and netcore versions?

Yes. We want to reduce effort later on maintaining and porting over changes to netfx

The branch I got these numbers from implemented the SqlClient and System.Data.Common parts that would need to be in the runtime, so it works in netcore 2.1 and 3 builds, and it's just project changes for 5.0 if this API gets in.

The internal changes mostly touch basic RPC, which is slightly different in netcore due to perf changes I've made, but those changes have been live for quite a while now in corefx and this library, so porting them to netfx shouldn't be too difficult.

Wraith2 commented 4 years ago

@Wraith2 if I'm reading this correctly, batching using the new API (BatchCommands) is 3 times slower than batching using today's method, i.e. concatenation (Command)? Can you confirm that the two correspond to the techniques described in this article?

Yup, that's what it looks like. Concatenation is fine in small cases like this, but in any moderately loaded environment it's going to seriously screw with your query plan generation and lifetime, as well as parameter sniffing and time spent in plan generation. Concat is fine for small stuff, but larger scale is where this feature is useful.

roji commented 4 years ago

Absolutely agree that query plan pollution is important here, thanks.

cheenamalhotra commented 4 years ago

If we put in some logic to detect homogeneous batch commands, e.g. "INSERT INTO table(value) VALUES (@value)" repeated 20 times with different parameters, we could do a single describe round trip and use the encryption data for the whole batch.

This might not be possible. The JDBC driver supports ExecuteBatch() for parameterized queries, and it's only possible with one query at a time: you can add data to the batch, but internally it's still executed one query at a time using sp_executesql. And for AE, metadata differs based on the data, so if you are passing "32.5" and "12345.123" in two different queries in a batch, their metadata is different when encryption is performed.

cheenamalhotra commented 4 years ago

On another note, parameter metadata caching is beneficial for some scenarios: metadata is re-used when applicable on batched queries, which SqlClient also supports implicitly.

cheenamalhotra commented 4 years ago

@Wraith2

I compared your traces to parameterized queries, and they're exactly the same (using sp_executesql). So I'd like to know how batching will improve on parameterized execution (from the SqlClient perspective), because we cannot concatenate queries; they will always be executed sequentially.

Secondly, since Prepare is not supported, I think that's a major gap. We're totally missing a big block of prepared statement support in the driver. I will investigate it more, but let me know if you have any details.

IMO it should be supported to improve query execution performance for regular statements as well as parameterized queries. When preparing and executing queries, the driver would make RPC calls in the following order for n rounds of execution: (1) sp_executesql > (1) sp_prepexec > (n-2) sp_execute. [sp_execute is the most efficient for repetitive execution, and sp_executesql saves prepare time on the server side if the query is executed only once.]

That's how JDBC works, and it gives a good performance impact.

If we only want the benefit of sp_executesql, prepared statements are what we should be looking into first.
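
For context, a sketch of what driver-level prepare looks like from user code today (SqlClient requires explicitly typed parameters before Prepare; the table and data are hypothetical):

    using System.Data;
    using Microsoft.Data.SqlClient;

    using var cmd = new SqlCommand(
        "UPDATE Items SET Name = @name WHERE Id = @id",
        connection); // connection: an open SqlConnection
    cmd.Parameters.Add("@id", SqlDbType.Int);
    cmd.Parameters.Add("@name", SqlDbType.NVarChar, 64);
    cmd.Prepare(); // prepared once...

    foreach (var (id, name) in new[] { (1, "a"), (2, "b") })
    {
        cmd.Parameters["@id"].Value = id;
        cmd.Parameters["@name"].Value = name;
        cmd.ExecuteNonQuery(); // ...then re-executed via the prepared handle (sp_execute)
    }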

Wraith2 commented 4 years ago

It can improve performance by sending multiple RPCs at the same time instead of waiting on the first round trip to complete; it helps in high-latency situations. I haven't implemented prepare because it's a proof of concept, not production-ready code. Given that there was no sign-off on the feature, I wasn't trying to make it complete, just to see whether it was possible, and it is.

cheenamalhotra commented 4 years ago

It can improve performance by sending multiple RPCs at the same time instead of waiting on the first round trip to complete.

We'd still need to parse the results for each statement, as exceptions also need to be caught and thrown immediately instead of letting the driver read all results. It could lead to unnecessary server processing if thousands of statements are batched and the first one throws an error; we can't wait for the server to process thousands of queries and then read the first error. SQL Server will keep pushing results for all queries onto the stream and will not stop processing unless you request that. It may also mean unwanted query executions if the first one threw an error, leading to unexpected operations.

Wraith2 commented 4 years ago

On the first exception processed, I would assume you send a cancel and then wait for the OK response while discarding contents. I also regard exceptions as something which should be exceptional, and I don't try to make them fast, only correct. If a user screws up, it's on them to deal with the outcome; I only promise I won't make it worse.

roji commented 4 years ago

Note that @GSPP seems to suggest above there may be an option to make SQL Server abort a batch on the first error (https://github.com/dotnet/SqlClient/issues/19#issuecomment-461451286), though I'm not sure if he meant the concatenated batch or our new proposed RPC batch. FWIW that's exactly the Npgsql/PostgreSQL behavior - the first failed command will automatically cause PG to skip the remaining ones, and only the first exception will be reported.

Wraith2 commented 4 years ago

Probably SET XACT_ABORT ON or something like that, but the driver can't rely on the user doing that, or force them to, because it's a behaviour-changing setting. The batch command has been added to the exception as part of the new API surface, so users can identify the batch entry that failed. We're not really talking at that level here, though; @cheenamalhotra was talking about the internal behaviour of the RPCs issued to the server and how their results are handled. It sounds like the Java driver does something that might be useful here, so let's see if it helps.
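
A sketch of what that could look like to the caller, assuming the DbException.BatchCommand property from the batching proposal is populated by SqlException:

    using System;
    using Microsoft.Data.SqlClient;

    try
    {
        batch.ExecuteNonQuery(); // batch: a populated SqlBatch
    }
    catch (SqlException ex) when (ex.BatchCommand is not null)
    {
        // Identify the failing batch entry without guessing from the error text.
        Console.WriteLine($"Failed command: {ex.BatchCommand.CommandText}");
    }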

roji commented 4 years ago

It may be OK for the driver itself to change a setting like that - as this is a totally new API there's no backwards compat or anything (as long as things are set back properly at the end etc.).

GSPP commented 4 years ago

XACT_ABORT has strong semantic consequences. The driver should not change that. For both values of that setting, there's code "in the wild" relying on it. In particular, changing the default just for the new batch API would add to the confusion around the error model by introducing another special case.

Likely, the batch API should just behave as if those commands were executed one after the other. I think this is the naturally occurring behavior and it introduces the least additional confusion.


I do not have test cases for the various error cases (aborting batch or transaction) at hand. This seems to provide some pointers: https://dba.stackexchange.com/questions/126467/list-of-batch-aborting-errors-in-sql-server. As described in that link, sometimes a batch keeps limping along and sometimes it is terminated. Reading that list, it seems that batches are aborted only under fairly unusual circumstances. But there are exceptions:

As I suspected, there are exceptions, of course, as noted in the comments. Conversion failure is severity 16 but aborts the batch

This means that a SQL command can produce a stream of errors and other info messages interleaved with results. I believe this is currently fully supported through the InfoMessage event. I suspect that SQL Server Management Studio simply uses this API to support that as well.

I find statements on the web claiming that errors do not roll back the transaction but keep executing. I don't trust those statements, because I have definitely seen a transaction rolled back while the batch continued. Needless to say, this damaged the database by partially applying changes in a fairly random fashion.


Command sets are nice because they preserve the existing semantics while improving performance. Other solutions such as concatenating SQL lack that clean abstraction. For that reason, command sets should behave as closely to normal commands as possible.

This is important for library authors as well. For example, the popular Dapper library might use command sets (somehow) in the future. Dapper has no control over the SQL that is being passed to it.

roji commented 4 years ago

Sorry for the long text below...

XACT_ABORT has strong semantic consequences. The driver should not change that. For both values of that setting, there's code "in the wild" relying on it. In particular, changing the default just for the new batch API would add to the confusion around the error model by introducing another special case. [...] command sets should behave as closely to normal commands as possible.

It does seem valid to me for a (new) batching API to have its own error semantics which differ from normal use. The reason is that when commands are executed in separate round trips, the application has a chance to react to errors, e.g. refrain from sending later commands and/or go down a different code path.

There is also a performance aspect here which I think we're overlooking. @Wraith2 wrote:

I also regard exceptions as something which should be exceptional, and I don't try to make them fast, only correct.

Database exceptions are sometimes a normal part of program flow. For example, imagine doing optimistic concurrency (some update) as your first command in a batch - do you really want to have to execute and consume the results of all subsequent commands when it fails? Granted, the whole idea of optimistic concurrency is that concurrent updates to the same row are rare, and that it's acceptable to pay a small perf price when they happen (e.g. the extra roundtrips to redo the update) - but the price of executing later batched commands can easily become prohibitive if the batch is big.

However, I'm not sure setting XACT_ABORT to true helps - it fails the transaction, but doesn't necessarily prevent later commands from being evaluated (and their results returned). This would have to be checked.

To conclude, I personally find the SQL Server default behavior (XACT_ABORT=false) quite bad, but that may be my PostgreSQL background. However, @GSPP I do see the point that this is the SQL Server (default) behavior, and that we should be careful when considering changing it (although again, the context of batching may justify it). It seems that more research on exactly how SQL Server behaves would be good here.

PS1: Just for comparison, in PostgreSQL, the moment a command fails, the entire transaction is put into a "failed" state. It's not possible to continue executing commands (they will immediately fail), and the only thing you can do is roll the transaction back. This is completely unrelated to batching.

PS2: At the wire protocol level (e.g. like TDS), once an error occurs, PostgreSQL will skip all subsequent protocol messages until a special Sync message is seen. Npgsql uses this feature and packs batch commands together within a single Sync, so later commands are automatically skipped (not executed, or even parsed). This isn't related to transactionality.

Wraith2 commented 4 years ago

I did a quick test. I changed the list of commands so that the 3rd was one which would cause a common error. Specifically I tried to convert a string into an integer that was never going to work. Profile trace is as follows.

(screenshot: SQL Server Profiler trace of the batch execution)

So it looks to me as if, regardless of whether the following 17 RPCs were sent, they weren't executed, and I got a SqlException in the calling application. Does that help?

As for the SQL Server error model being confusing and complicated: I agree, but there's absolutely no way it's going to change, because of the massive amount of work and the errors it would cause for existing users. It's something that has to be lived with and understood, for compatibility. It's like VB6: I don't like it, but I do have to be aware of it.

cheenamalhotra commented 4 years ago

@Wraith2 currently SqlCommandSet's behaviour does not send them in a single request, so that's a safe implementation. I don't think we have any implementation in the driver that sends multiple commands at once and reads all responses together. The only visible difference from a regular SqlCommand is executing queries with an RPC call, which boosts performance.

This performance can be further improved if we supported Prepare, which prepares queries and re-executes them using the prepare handle.

What I meant to say is that making this class public doesn't let us do anything extra beyond what proper Prepare support would; this class would rather be an overhead to maintain once Prepare is supported. We can support the same behaviour in SqlCommand for executing with RPC (as done by JDBC, mentioned above) if the user calls sqlCmd.Prepare() before executing.

But "sending all requests together and reading all responses" is not going to work for SQL server.

Also, regarding the XACT_ABORT discussion: we don't change server-side properties implicitly outside the scope of a single statement, as changing anything on the database/connection without the customer requesting it is against the integrity of the driver.

Wraith2 commented 4 years ago

currently SqlCommandSet's behaviour does not send them in a single request

It doesn't? The code at https://github.com/dotnet/SqlClient/blob/5e943028d702cdc948263f96d60a5677860577f4/src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParser.cs#L8943 loops over the entire RPC batch, sending it in multiple packets if required, without running the receive loop as far as I can see.

cheenamalhotra commented 4 years ago

Ok I see that's an RPC batch being sent. Thanks for sharing, I didn't scan through the code completely, but was comparing traces. (I've just recently engaged in this proposal 🙂)

RPC batching is good! Let's also evaluate transaction processing in this flow and if we see any unexpected behaviors. We would definitely need a lot of testing.

roji commented 4 years ago

Just as a reminder: the most urgent thing from my perspective is less to work out all the implementation issues now, and more to confirm that the general ADO.NET batching proposal in #28633 is good.

(As an aside, @Wraith2 I hope you've received my email)

Wraith2 commented 4 years ago

I did and responded, ready to talk it through with you all.

Wraith2 commented 4 years ago

The branch is here: https://github.com/Wraith2/SqlClient/tree/api-batching. The implementation is mostly moving the internal LocalCommand object out to SqlBatchCommand and plumbing exceptions and SqlBatch through the call hierarchies. Work will need to be done to get encrypted scenarios working, and additional work could be done for prepared commands if there were time and interest.

Wraith2 commented 3 years ago

Looks like this is going to get into NET6. I've still got my branch with the implementation, so I can rebase it onto current and get it posted as a draft.

Work that I know needs to be done:

  1. NET6 build target needs to be added
  2. Encryption scenarios need supporting and tests writing
  3. Decisions on backporting to <NET6 versions (possible, but is it desired?)