Closed Chrriis closed 2 years ago
Hi @Chrriis, Thanks for creating the issue. We will investigate and get back to you.
Hi @Chrriis,
I wanted to share a few details in case they are helpful for your investigation. My initial testing shows that the behavior is query specific. I ran the exact same test with a simpler update query on numeric columns (update numerictable set column2=? where column1 = ?) and both drivers took almost exactly the same amount of time to execute.
I also modified your test code a bit and am seeing some strange behavior. When I hardcode the ID in the query:
UPDATE T_DbPerfTest SET Col29 = x.Col29 FROM ( SELECT Col1, LEAD(Col28) OVER (PARTITION BY Col5, Col2, Col4, Col3 ORDER BY Col28, Col1) AS Col29 FROM T_DbPerfTest WHERE Col29 IS NULL AND (Col27 = 1 OR Col28 = ?) AND Col2 = 240254 ) x WHERE T_DbPerfTest.Col1 = x.Col1 AND T_DbPerfTest.Col29 IS NULL AND x.Col29 = ? AND T_DbPerfTest.Col27 = 1
and update the Java code accordingly:
public static void executePreparedStatements(Connection connection, PreparedStatement preparedStatement) throws Exception {
    long[] ids = {
        240254, 684796, 684786, 240102, 240274, 240303, 240332,
        240244, 685302, 685331, 685196, 685389, 685228, 685360,
    };
    for (long id : ids) {
        // id is intentionally unused here: the ID is hardcoded in the SQL,
        // so the loop only controls how many times the statement is batched.
        preparedStatement.setLong(1, 27586214);
        preparedStatement.setLong(2, 27586214);
        preparedStatement.addBatch();
    }
    preparedStatement.executeBatch();
}
the jTDS driver becomes slower too, and both drivers perform the same. So parametrizing IDs makes jTDS faster and I am currently not sure why. The only difference between the drivers is that jTDS calls sp_prepare whereas the Microsoft JDBC driver calls sp_prepexec for PreparedStatements. I will try to analyze this behavior sometime next week; please share your findings too.
Thank you very much for the update!
Maybe I should tell more about this schema. This schema contains a primary key (Col1), a conceptual key (Col2, Col3, Col4, Col5) with a versioning column (Col28). A version can contain many rows of different conceptual keys. Primary keys are (generally) an increasing number, and version column is an increasing number. Rows were added in the order of the versioning column, using statements that declare the keys in the same order. The purpose of the test statement is to set the value of Col29 to the value of Col28 of the next version (per its conceptual key).
I am not a specialist in database statistics computation, but it is possible that the physical layout after insertion (only the most recent versions are of interest, i.e. the last inserted rows per conceptual key / PK) helps the update statement run as executed by jTDS.
You say that the only difference is in using sp_prepexec instead of sp_prepare. But in the test case, we clearly prepare the statement before executing it multiple times. Do you mean that sp_prepexec is invoked at the first batch execution (and sp_execute after that), or that it is used at each batch execution (what would it mean to have a prepare step every time)? Have you tried modifying the MS code to use sp_prepare like jTDS, to see if you get the same result as jTDS (so we are sure there is nothing else at play here, like a subtle difference in the use of the TDS protocol, data types, etc.)?
I will do a few experiments tomorrow, I will let you know if I find anything of interest.
I tried another variation of the statement which is less performant but uses more basic SQL (no window function).
UPDATE T_DbPerfTest SET Col29 = ? WHERE Col2 = ? AND Col27 = 1 AND Col29 IS NULL AND EXISTS ( SELECT 1 FROM T_DbPerfTest b WHERE T_DbPerfTest.Col5 = b.Col5 AND T_DbPerfTest.Col2 = b.Col2 AND T_DbPerfTest.Col4 = b.Col4 AND T_DbPerfTest.Col3 = b.Col3 AND ( T_DbPerfTest.Col28 < b.Col28 OR T_DbPerfTest.Col28 = b.Col28 AND T_DbPerfTest.Col1 < b.Col1 ))
You can try it yourself: just replace the statement in the test case and comment out the "setLong(3, xxx)" line.
Instead of times like "6020 (MS) vs 1512 (JTDS)", I get "14013 (MS) vs 5667 (JTDS)". Indeed a slower statement, but a similar slowdown ratio between MS and jTDS.
I tried changing various settings in the connection string of SQL Server, but no improvement:
responseBuffering=full
enablePrepareOnFirstPreparedStatementCall=true
sendTimeAsDatetime=false
disableStatementPooling=false
statementPoolingCacheSize=10
serverPreparedStatementDiscardThreshold=0
I also tried changing jTDS parameters to see how the driver behaves. It is still fast when changing:
TDS=7.0
maxStatements=0
JTDS became as slow as the MS driver when changing:
prepareSQL=2
I was not surprised by that last result. The jTDS documentation says:
2: sp_executesql is used (fast)
3: sp_prepare and sp_cursorprepare are used in conjunction with sp_execute and sp_cursorexecute (faster, SQL Server only)
The test case uses a batch, but I found this is irrelevant. If you replace preparedStatement.addBatch(); with preparedStatement.execute(); and comment out preparedStatement.executeBatch();, you get the same speed problem.
> So parametrizing IDs makes jTDS faster and I am currently not sure why.
The funny thing is that if you hardcode the 1st and 3rd parameters in the query (only leaving the ID parameterized), then the query is still fast for JTDS and slow for MS.
Hi @Chrriis ,
> Do you mean that sp_prepexec is invoked at the first batch execution (and sp_execute after that) or used at each batch execution?
sp_prepexec is executed only once when statementPoolingCacheSize=10;disableStatementPooling=false;enablePrepareOnFirstPreparedStatementCall=true;. See the SQL Profiler outputs below (I am hardcoding the 1st and 3rd parameters):
MS JDBC
declare @p1 int
set @p1=1
exec sp_prepexec @p1 output,N'@P0 bigint',N'UPDATE T_DbPerfTest SET Col29 = x.Col29 FROM ( SELECT Col1, LEAD(Col28) OVER (PARTITION BY Col5, Col2, Col4, Col3 ORDER BY Col28, Col1) AS Col29 FROM T_DbPerfTest WHERE Col29 IS NULL AND (Col27 = 1 OR Col28 = 27586214) AND Col2 = @P0 ) x WHERE T_DbPerfTest.Col1 = x.Col1 AND T_DbPerfTest.Col29 IS NULL AND x.Col29 = 27586214 AND T_DbPerfTest.Col27 = 1 ',240254
select @p1
exec sp_execute 1,684796
exec sp_execute 1,684786
exec sp_execute 1,240102
exec sp_execute 1,240274
exec sp_execute 1,240303
exec sp_execute 1,240332
exec sp_execute 1,240244
exec sp_execute 1,685302
exec sp_execute 1,685331
exec sp_execute 1,685196
exec sp_execute 1,685389
exec sp_execute 1,685228
exec sp_execute 1,685360
exec sp_execute 1,240254
exec sp_execute 1,684796
exec sp_execute 1,684786
exec sp_execute 1,240102
exec sp_execute 1,240274
exec sp_execute 1,240303
exec sp_execute 1,240332
exec sp_execute 1,240244
exec sp_execute 1,685302
exec sp_execute 1,685331
exec sp_execute 1,685196
exec sp_execute 1,685389
exec sp_execute 1,685228
exec sp_execute 1,685360
... [the same 14 sp_execute calls repeat for each of the remaining batches]
jTDS
declare @p1 int
set @p1=1
exec sp_prepare @p1 output,N'@P0 bigint',N'UPDATE T_DbPerfTest SET Col29 = x.Col29 FROM ( SELECT Col1, LEAD(Col28) OVER (PARTITION BY Col5, Col2, Col4, Col3 ORDER BY Col28, Col1) AS Col29 FROM T_DbPerfTest WHERE Col29 IS NULL AND (Col27 = 1 OR Col28 = 27586214) AND Col2 = @P0 ) x WHERE T_DbPerfTest.Col1 = x.Col1 AND T_DbPerfTest.Col29 IS NULL AND x.Col29 = 27586214 AND T_DbPerfTest.Col27 = 1',1
select @p1
exec sp_execute 1,240254
exec sp_execute 1,684796
exec sp_execute 1,684786
exec sp_execute 1,240102
exec sp_execute 1,240274
exec sp_execute 1,240303
exec sp_execute 1,240332
exec sp_execute 1,240244
exec sp_execute 1,685302
exec sp_execute 1,685331
exec sp_execute 1,685196
exec sp_execute 1,685389
exec sp_execute 1,685228
exec sp_execute 1,685360
... [the same 14 sp_execute calls repeat for each of the remaining batches]
@ulvii I looked at the code and figured that out :)
That being said, I disagree that it is the only thing that differs. I simplified the test case and added logging of the bytes that are sent for a given execution of the statement. I see fewer bytes sent with jTDS, although we should be under the same conditions in both cases: the statement was prepared a long time ago, there were the same number of executions, etc.
Of course it is possible that I made a mistake in my test case, so please do cross-check my findings!
The zip MSvsJTDS2.zip contains:
It seems that the JTDS prepared statement executions do not have the header (size 22). Maybe only the preparation packet needs it and adding it to executions creates problems?
Thanks a lot for your help!
I think I succeeded in modifying the code so that the sp_prepexec call is just an sp_prepare instead. That way we can compare the same scenario between MS and jTDS. Well, unless I made a mistake, the total time has not changed: MS is still 3x slower... So, if the above is correct (please confirm!), it means the bad performance could be due to the way the connection is established (maybe a different setting?) or to the version of the protocol. @ulvii What is your take?
I've done some profiling that may or may not be interesting for this case (I'm attaching it.)
Here's what I see.
It looks like the number of requests is roughly the same between the two drivers, yet the MS driver's requests appear to take a much longer time.
The MS driver is spending all of its time here:
void SqlServerJdbcDriverSlowdownTest.main(String[]):52 1
void SqlServerJdbcDriverSlowdownTest.executePreparedStatements(Connection, PreparedStatement):102 1
int[] com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeBatch():2076 1
void com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(TDSCommand):223 1
void com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(TDSCommand):248 1
boolean com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(TDSCommand):2979 1
boolean com.microsoft.sqlserver.jdbc.TDSCommand.execute(TDSWriter, TDSReader):7194 1
boolean com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtBatchExecCmd.doExecute():2675 1
void com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatementBatch(SQLServerPreparedStatement$PrepStmtBatchExecCmd):2786 1
TDSReader com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(boolean):7579 1
boolean com.microsoft.sqlserver.jdbc.TDSReader.readPacket():6418 1
int com.microsoft.sqlserver.jdbc.TDSChannel.read(byte[], int, int):2023 1
int java.net.Socket$SocketInputStream.read(byte[], int, int):68 1
On the flip side, here is the jTDS path where most of its time is spent:
void SqlServerJdbcDriverSlowdownTest.main(String[]):68 8
void SqlServerJdbcDriverSlowdownTest.executePreparedStatements(Connection, PreparedStatement):102 8
int[] net.sourceforge.jtds.jdbc.JtdsStatement.executeBatch():1051 8
SQLException net.sourceforge.jtds.jdbc.JtdsPreparedStatement.executeMSBatch(int, int, ArrayList):463 8
void net.sourceforge.jtds.jdbc.TdsCore.executeSQL(String, String, ParamInfo[], boolean, int, int, int, boolean):1086 8
void net.sourceforge.jtds.jdbc.TdsCore.wait(int):4127 8
int net.sourceforge.jtds.jdbc.ResponseStream.peek():99 8
int net.sourceforge.jtds.jdbc.ResponseStream.read():114 8
void net.sourceforge.jtds.jdbc.ResponseStream.getPacket():477 8
byte[] net.sourceforge.jtds.jdbc.SharedSocket.getNetPacket(SharedSocket$VirtualSocket, byte[]):731 8
byte[] net.sourceforge.jtds.jdbc.SharedSocket.readPacket(byte[]):850 8
void java.io.DataInputStream.readFully(byte[]):170 8
void java.io.DataInputStream.readFully(byte[], int, int):200 8
int java.net.Socket$SocketInputStream.read(byte[], int, int):68 8
This was using jTDS 1.3.1 and MS 7.4.1. I ran SQL Server locally in a Docker container.
As far as I can tell, it looks like both are doing essentially the same thing. Some protocol or connection setting seems to be the likely culprit here, IMO.
Interestingly enough, in both profiling sessions I did (one with jTDS first, one with it second, to make sure that DB caching wasn't messing with things), it looks like the first 2 batches for the MS driver ran faster than jTDS (~80ms). It was the subsequent requests that ended up taking ~1 second.
jTDS consistently spent around 160ms per request.
Is there some "expected batch size" or "look ahead" parameter that the MS driver isn't sending? sql_server_performance.zip
@Chrriis Given your request information and the TDS docs
It looks to me (and I could be wrong) like the MS driver is setting the "SQLBatch" packet data section and jTDS is not.
Thanks @cogman. I analyzed the bytes that were sent against the TDS spec and suspected they were the MARS section, so thank you for the confirmation. I think jTDS uses an older version of the protocol that does not have it.
About profiling, I too found that time was spent reading, which is why I started analyzing the sent messages in the first place, but they seemed to be the same.
So I think you are right that there is a protocol or connection setting difference, but I fail to find which one could be causing this huge gap. Any help is welcome there! :)
@ulvii I found the issue! The root cause is that the statement that is passed to the server is slightly different between MS and jTDS: the MS driver adds spaces at the end of the statement when replacing the ? parameters with @Px ones. As a quick test, you can simply add .trim() at the end of the SQLServerConnection.replaceParameterMarkers(xxx) method, and MS becomes as fast as jTDS with my test cases.
Of course, the fix must be more subtle than this: the statement that is passed to the driver should be left untouched when replacing the parameters. What I mean is that
<space><space>UPDATE FOO SET x=? WHERE y=?<space>
should become
<space><space>UPDATE FOO SET x=@P0 WHERE y=@P1<space>
and not, as it currently does,
<space><space>UPDATE FOO SET x=@P0 WHERE y=@P1<space><space><space><space><space><space><space>
With that fix, the MS driver would do as is told (passing command as is) and for us it would become as fast as JTDS (at least for this case).
Please let us know what you think! :)
The fix in SQLServerConnection.replaceParameterMarkers(xxx)
is simply to:
return new String(sqlDst, 0, dstBegin);
That way, the statement is untouched, spaces that were part of the original statement are preserved and no new spaces are added.
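To make the mechanism concrete, here is a hypothetical, self-contained sketch (not the driver's actual code) of why writing into an over-allocated buffer and then constructing a String from the whole array produces trailing spaces, and why bounding the constructor by the number of characters actually written fixes it:

```java
import java.util.Arrays;

public class TrailingSpaceSketch {
    // Simplified stand-in for replaceParameterMarkers: rewrite each '?'
    // marker as @P0, @P1, ... into an over-allocated char buffer.
    static String replace(String sql, boolean bounded) {
        // Over-allocate for the worst case, like the real method does;
        // the exact slack size here is arbitrary for this sketch.
        char[] sqlDst = new char[sql.length() + 80];
        Arrays.fill(sqlDst, ' '); // the unused tail stays as spaces
        int dstBegin = 0, param = 0;
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '?') {
                String name = "@P" + param++;
                name.getChars(0, name.length(), sqlDst, dstBegin);
                dstBegin += name.length();
            } else {
                sqlDst[dstBegin++] = c;
            }
        }
        // Bug: new String(sqlDst) keeps the padded tail of spaces.
        // Fix: bound the String by the chars actually written.
        return bounded ? new String(sqlDst, 0, dstBegin) : new String(sqlDst);
    }

    public static void main(String[] args) {
        String sql = "UPDATE FOO SET x=? WHERE y=?";
        System.out.println("[" + replace(sql, false) + "]"); // trailing spaces
        System.out.println("[" + replace(sql, true) + "]");  // no trailing spaces
    }
}
```

Running this shows the unbounded variant ending in dozens of spaces while the bounded variant ends exactly at the rewritten statement.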
TBH, this method is a fair bit more complex than it needs to be. (It looks like a C-to-Java translation.)
Make it a StringBuilder and use strings/Integer.toString, and it will be both more idiomatic and more readable. The performance hit will be negligible, potentially non-existent due to JVM optimizations.
For example, makeParameterName could be simplified to "@P" + nParam
Also, weird that spaces have that big of an impact on performance from the server. I'd never have guessed that.
Same method, more idiomatic
static String replaceParamMarkers(String sqlSrc, Parameter[] params, boolean isReturnValueSyntax) {
    if (params == null || params.length == 0)
        return sqlSrc;
    char[] chars = sqlSrc.toCharArray();
    StringBuilder results = new StringBuilder(sqlSrc.length() + (6 * params.length));
    int param = 0;
    for (char c : chars) {
        if (c == '?') {
            results.append("@P").append(param);
            if (!isReturnValueSyntax && params[param].isOutput()) {
                results.append(" OUT");
            }
            ++param;
        } else {
            results.append(c);
        }
    }
    return results.toString();
}
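One likely reason the real method is more involved than a plain character scan: a '?' inside a string literal or comment must not be rewritten as a parameter marker. A minimal demonstration of the hazard (hypothetical example code, not the driver's):

```java
public class MarkerInLiteralDemo {
    // Naive replacement that treats every '?' as a parameter marker,
    // including ones inside string literals (which is wrong).
    static String naiveReplace(String sql) {
        StringBuilder sb = new StringBuilder();
        int n = 0;
        for (char c : sql.toCharArray())
            sb.append(c == '?' ? "@P" + (n++) : String.valueOf(c));
        return sb.toString();
    }

    public static void main(String[] args) {
        // The '?' inside the literal is data, not a marker:
        String sql = "UPDATE FOO SET note='why?' WHERE id=?";
        System.out.println(naiveReplace(sql));
        // → UPDATE FOO SET note='why@P0' WHERE id=@P1  (the literal is corrupted)
    }
}
```

Any rewrite of the method would need to keep whatever literal/comment tracking the current implementation does.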
Hi @Chrriis, @cogman, I tried your suggestions; there is some performance improvement (2 seconds), but in my environment the performance of the two drivers is still not comparable for this test case. Total time (ms): 7328 (MS), Total time (ms): 1105 (JTDS). I will continue testing.
Hi @ulvii,
I am not sure what you did, because on our systems, the problem and the fix are consistent.
Here are our results:
Original statement (batch, 3 parameters)
Original statement with inlined 1st and 3rd parameters (batch)
More basic but slower statement (batch, 2 parameters)
Simpler test case (no batch, 1 parameter)
The only change between the two is that:
    while (dstBegin < sqlDst.length) {
        sqlDst[dstBegin++] = ' ';
    }
    return new String(sqlDst);
is replaced with:
    return new String(sqlDst, 0, dstBegin);
I downloaded your test case and updated the driver with your suggestion. I am still willing to create a PR if it resolves the issue for you. Would you implement @cogman's change and let me know if you are still seeing the performance improvement?
Thanks @ulvii, I would greatly appreciate it if you could update the driver with the changes I suggested. It does make a huge difference!
Of course, what would be great for the future is if SQL Server itself were fixed to treat statements with and without trailing spaces in the same manner. If you know anyone at Microsoft who can do something in the server itself, you would make the world a better place :) Here are some posts which might be related: https://stackoverflow.com/questions/56334818/sql-server-performance-slow-when-query-text-has-a-particular-number-of-character https://stackoverflow.com/questions/12019589/last-4-whitespaces-of-sql-would-hurt-sql-server-performance
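A plausible mechanism behind this (my reading of the linked posts, not something Microsoft has confirmed): SQL Server keys its plan cache on the exact statement text, so two statements differing only in trailing whitespace are distinct cache entries and can end up with different plans after a recompile. In Java terms, the two texts the drivers send are simply different strings:

```java
public class StatementTextDemo {
    public static void main(String[] args) {
        String trimmed = "UPDATE FOO SET x=@P0 WHERE y=@P1";
        String padded  = trimmed + "       "; // what the unpatched driver sends
        // Semantically identical SQL, but distinct texts as far as any
        // text-keyed cache is concerned:
        System.out.println(trimmed.equals(padded));        // false
        System.out.println(trimmed.equals(padded.trim())); // true
    }
}
```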
About the changes @cogman proposed, I would have to try them when time allows. That being said, I think the immediate need is a fix that is the least intrusive, and improving the code as suggested deserves its own bug report / PR to better evaluate whether it is equivalent and does not break anything (for example, I am not sure how makeParamName(xx) currently works in detail).
@ulvii, @cogman, thanks a lot for your help!
@Chrriis I've run the test a few times with a few permutations of the driver changes (mine, yours, part of yours, part of mine). I didn't see any performance difference between the variants, even though the statement text produced at the end was the same:
UPDATE T_DbPerfTest SET Col29 = x.Col29 FROM ( SELECT Col1, LEAD(Col28) OVER (PARTITION BY Col5, Col2, Col4, Col3 ORDER BY Col28, Col1) AS Col29 FROM T_DbPerfTest WHERE Col29 IS NULL AND (Col27 = 1 OR Col28 = @P0) AND Col2 = @P1 ) x WHERE T_DbPerfTest.Col1 = x.Col1 AND T_DbPerfTest.Col29 IS NULL AND x.Col29 = @P2 AND T_DbPerfTest.Col27 = 1
I even tried running against SQL Server 2017 vs 2019, just to see if it was something about the database version. I threw print statements into the driver to make sure I wasn't somehow getting an older version of the driver.
I ran this change with 8.1.1 built on my box.
Edit: Just for fun, I also added a bit of logic to strip out repeated whitespace in the query. No difference.
Testing Prepared Statements...
Total time (ms): 3828 (JTDS)
"UPDATE T_DbPerfTest SET Col29 = x.Col29 FROM ( SELECT Col1, LEAD(Col28) OVER (PARTITION BY Col5, Col2, Col4, Col3 ORDER BY Col28, Col1) AS Col29 FROM T_DbPerfTest WHERE Col29 IS NULL AND (Col27 = 1 OR Col28 = @P0) AND Col2 = @P1 ) x WHERE T_DbPerfTest.Col1 = x.Col1 AND T_DbPerfTest.Col29 IS NULL AND x.Col29 = @P2 AND T_DbPerfTest.Col27 = 1"
Total time (ms): 19405 (MS)
@ulvii I looked at @cogman's code. I changed a few things, e.g. to not allocate an array of chars just to get the length and to take into account the OUT parameter length. I truly think that rewriting this method should be a different issue/PR focused on improving the code, with its own dedicated thread of comments (let's not hijack this one, which is about a performance bug).
By the way, I did not see the bug today, but I restored the database backup and could reproduce it. I probably had background processes altering statistics. It is important to restore the backup before measuring again.
So, I would be grateful if you could fix the issue by replacing:
    while (dstBegin < sqlDst.length) {
        sqlDst[dstBegin++] = ' ';
    }
    return new String(sqlDst);
with:
return new String(sqlDst, 0, dstBegin);
I hope this can make it into a release soon so we can continue our performance investigations between the jTDS and MS drivers.
@Chrriis Did you miss where I ran your code and did not see performance improvements?
Specifically, the statement that got pushed out was UPDATE T_DbPerfTest SET Col29 = x.Col29 FROM ( SELECT Col1, LEAD(Col28) OVER (PARTITION BY Col5, Col2, Col4, Col3 ORDER BY Col28, Col1) AS Col29 FROM T_DbPerfTest WHERE Col29 IS NULL AND (Col27 = 1 OR Col28 = @P0) AND Col2 = @P1 ) x WHERE T_DbPerfTest.Col1 = x.Col1 AND T_DbPerfTest.Col29 IS NULL AND x.Col29 = @P2 AND T_DbPerfTest.Col27 = 1
I didn't just test my code change. Neither change on 8.1.1 resulted in a difference in performance.
I can cook up a test docker compose to demonstrate if you like.
@cogman This bug is related to SQL Server's caching of query plans, parameter sniffing, statistics computation, etc. It is known that trailing spaces, sometimes a magical number of them, confuse SQL Server and prevent it from reusing an optimal plan. I spent too much time on this already. I gave a test case, I gave links to posts from people with the same problem, I showed that the driver was adding useless spaces which can lead to this problem, and I gave metrics with and without these spaces.
Do you mean that you are not able to reproduce the issue? If so, this kind of issue is like a concurrency issue: just because you do not see it does not mean it does not exist. In any case, the statement does have useless spaces, and they are known to cause problems.
@ulvii, @cogman, I have very interesting findings. I ported the whole thing to SQL Server Express 2012 (our earlier investigations were on SQL Server 2017). And these are the times we get (without fix): Total time (ms): 969 (JTDS) Total time (ms): 10252 (MS)
So, on that server, the performance penalty is 10x!
But that is not all. If I apply the fix I suggested, the performance penalty remains. This might explain what @cogman is experiencing. So, I applied another difference I found between JTDS and MS JDBC: spaces around parameters. Depending on the test case, it does or does not help. Total time (ms): 969 (JTDS) Total time (ms): 985 (MS)
I was wondering why JTDS was adding these spaces, but maybe they (sometimes?) help parameter sniffing and query plan reuse in older versions of SQL Server, who knows. I would not urge to add the spaces around parameters like JTDS does though, this needs much more analysis.
So: the trailing spaces at the end of the statement are definitely a problem and this fix should be applied. But, removing them does not fix all of the performance penalties of MS driver compared to JTDS.
Here is the backup for SQL Express 2012: MSvsJTDS_SQLExpress2012.zip
I found how to fix the remaining performance penalty on SQL Server 2012: I changed the MS driver code (temporary hack) to do a prepare instead of a prepexec. Then it becomes as fast...
So, we can infer that:
(of course, do check on your side if you come to the same conclusions).
Hi @Chrriis, @cogman, Thanks for the investigations, interesting results. I created #1215 for now, since we all agree that trailing spaces should be removed. I will get back to your investigations when time permits, and perhaps get the SQL Server team involved to figure out the root cause of the performance penalty.
Thanks @ulvii, I will try the next official release that contains this first fix. Unfortunately, we need to support SQL Server 2012, so we will stick with jTDS in production for the time being. That being said, if you need help testing code changes or carrying out some analyses, do contact me.
About the PrepExec vs Prepare + Execute, JTDS has the notion of modes using the prepareSQL connection property. Here is its documentation:
prepareSQL (default - 3 for SQL Server, 1 for Sybase) This parameter specifies the mechanism used for Prepared Statements.
| Value | Description |
|---|---|
| 0 | SQL is sent to the server each time without any preparation, literals are inserted in the SQL (slower) |
| 1 | Temporary stored procedures are created for each unique SQL statement and parameter combination (faster) |
| 2 | sp_executesql is used (fast) |
| 3 | sp_prepare and sp_cursorprepare are used in conjunction with sp_execute and sp_cursorexecute (faster, SQL Server only) |
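For reference, switching jTDS between these modes is just a connection-URL property; a sketch (the host, port, and database below are placeholders for this issue's test setup):

```java
// jTDS URL syntax: jdbc:jtds:sqlserver://<host>[:<port>][/<database>][;<property>=<value>...]
// Forcing sp_executesql mode (the "2: fast" row above):
String url = "jdbc:jtds:sqlserver://localhost:1433/TC_PerfTest;prepareSQL=2";
// Connection connection = DriverManager.getConnection(url, user, password);
```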
While I don't think the MS driver should have these modes, maybe you could add some modes at least to switch between PrepExec and Prepare + Execute.
Hi @Chrriis ,
I am investigating the performance issue against 2k12. Would you attach the code that uses sp_prepare instead of sp_prepexec in the driver?
Hi @ulvii ,
The code to use sp_prepare is a hack: the first call will only do a prepare and no execution (but the test statement is a no-op, so it is not a problem), and I generally add one more iteration in the tests to compensate. The change is to replace the content of the method SQLServerPreparedStatement.buildPrepExecParams(TDSWriter) with:
if (getStatementLogger().isLoggable(java.util.logging.Level.FINE))
    getStatementLogger().fine(toString() + ": calling sp_prepare: PreparedHandle:"
            + getPreparedStatementHandle() + ", SQL:" + preparedSQL);

expectPrepStmtHandle = true;
executedSqlDirectly = false;
expectCursorOutParams = false;
outParamIndexAdjustment = 3;

tdsWriter.writeShort((short) 0xFFFF); // procedure name length -> use ProcIDs
tdsWriter.writeShort(TDS.PROCID_SP_PREPARE);
tdsWriter.writeByte((byte) 0); // RPC procedure option 1
tdsWriter.writeByte((byte) 0); // RPC procedure option 2
tdsWriter.sendEnclavePackage(preparedSQL, enclaveCEKs);

// <prepared handle>
// IN (reprepare): Old handle to unprepare before repreparing
// OUT: The newly prepared handle
tdsWriter.writeRPCInt(null, getPreparedStatementHandle(), true);
resetPrepStmtHandle(false);

// <formal parameter defn> IN
tdsWriter.writeRPCStringUnicode((preparedTypeDefinitions.length() > 0) ? preparedTypeDefinitions : null);

// <stmt> IN
tdsWriter.writeRPCStringUnicode(preparedSQL);

// <options> IN (sp_prepare takes an extra options parameter that sp_prepexec does not)
tdsWriter.writeRPCInt(null, 1, false);
Hi @Chrriis, I think I managed to make both drivers faster for your query, with the Microsoft JDBC driver being faster than jTDS most of the time against both SQL Server 2012 and 2017. Try creating the following index on your database:
USE [TC_PerfTest]
GO
CREATE NONCLUSTERED INDEX [PerfIndex]
ON [dbo].[T_DbPerfTest] ([Col2],[Col29])
INCLUDE ([Col1],[Col3],[Col4],[Col5],[Col27],[Col28])
GO
Below are the steps I took to find out how to improve performance:
set @p1=1
). Hope this helps.
Hi @ulvii,
Thanks for your investigations. Unfortunately, this does not help...
The examples I gave you are simplified examples. Our real application has the problem in 21 tables, where each contains millions of rows (gigabytes of data). We did spend a lot of time on index creation, and the actual tables happen to have some on the relevant columns.
If we were to add another index to work around the problem, then performance would not be as good because of the cost of maintaining this index. Moreover, adding an index would use more disk space, which is not negligible considering the amount of data in the tables. We also support multiple RDBMSs and they mostly share the same schema and queries (with type mapping). For example, jTDS and Postgres use the same code here, and Postgres happens to be slightly faster but not by much. Adding an index would decrease performance for everyone. Note that other queries against other tables may be affected too and we have yet to discover them. In any case, suggesting that we rewrite our queries or change our schema to target a single database is hard to justify.
I don't see many options here:
Considering that a number of users will try migrating from jTDS to mssql-jdbc, wouldn't it be wise to have this prepare+exec mode for those who hit the same problem? How hard would it be to implement?
Get Microsoft to add a mode in the mssql-jdbc driver so that we can switch to prepare+exec instead of prepexec
IMHO the only sensible approach.
Do we know if and which SQL Server versions apart from 2012 suffer from the prepexec performance penalty?
It's a shame that we still have to rely on an unmaintained third-party driver in order to achieve acceptable performance.
@ulvii one driver yields good performance, the other doesn't. So it's clearly not a problem caused by missing indices.
Using the SQL Express 2012 backup that I already shared: MSvsJTDS_SQLExpress2012.zip
And the 2 test cases that I think are the same versions that I already shared: 20200211 - SqlServerJdbcDriverSlowdownTests.zip
I get the following results (times in milliseconds, sample taken out of multiple runs):
Test 1:

Driver | Expr. 2012 | 2014 | 2016 | 2017 | 2019 |
---|---|---|---|---|---|
MS | 13129 | 21483 | 5542 | 2391 | 10368 |
jTDS | 1079 | 2069 | 1486 | 890 | 2421 |
Slowdown | 12x | 10x | 4x | 3x | 4x |
Test 2:

Driver | Expr. 2012 | 2014 | 2016 | 2017 | 2019 |
---|---|---|---|---|---|
MS | 9860 | 15751 | 4196 | 2066 | 6828 |
jTDS | 1016 | 1971 | 1419 | 817 | 2031 |
Slowdown | 10x | 8x | 3x | 3x | 3x |
Do not compare numbers between different versions, because these runs were done on very different hardware (HDD vs SSD, slow/fast CPU/RAM, etc.).
To sum it up:
Assuming that Microsoft's parameter sniffing for prepexec execution plans gradually improved, the figures seem plausible to me (but still horrible).
I think that the difference in slowdown slope between tests 1 and 2 is due to the lower number of parameters and is not really related to the use of batching?
I just updated the numbers above, and I correct my comment. I had a mistake in the way I was running Test2 which I resolved. Now results are consistent.
Could you run the tests using your patched mssql driver which uses sp_prepare?
With the hack to do a prepare instead of a prepexec, here are the results for the same tests:
Test 1:

Driver | Expr. 2012 | 2014 | 2016 | 2017 | 2019 |
---|---|---|---|---|---|
MS | 979 | 2073 | 1487 | 886 | 2209 |
jTDS | 968 | 2205 | 1466 | 871 | 2210 |
Slowdown | 1x | 1x | 1x | 1x | 1x |
Test 2:

Driver | Expr. 2012 | 2014 | 2016 | 2017 | 2019 |
---|---|---|---|---|---|
MS | 960 | 1930 | 1341 | 809 | 2092 |
jTDS | 921 | 1955 | 1335 | 802 | 2050 |
Slowdown | 1x | 1x | 1x | 1x | 1x |
These results are pretty clear. The question is how and when Microsoft will finally act.
jTDS is not JDBC4 compliant and will cause problems in the very near future. So people are forced to migrate to mssql-jdbc, as this is the only remaining driver without license fees that can be used to connect to a SQL Server database.
And this transition is already starting: e.g. Atlassian Bamboo.
You don't have to be a clairvoyant to predict that many companies could switch to other database products if the performance was severely affected by this forced migration.
Hi @Chrriis , Thanks for the details. We are currently exploring our options and I started a discussion with SQL Server team to understand the performance difference between sp_prepare and sp_prepexec. I will keep you updated, please stay tuned.
Hi @Chrriis ,
We looked into the performance differences between sp_prepare and sp_prepexec with the SQL Server team, and the following is the conclusion we came to. sp_prepexec is picking a parallel plan which scans the heap table and then joins with a different index. The sp_prepare plan from jTDS is picking a serial plan and filtering on Col2 first. The plan difference arises because in the sp_prepare case there is no value for @P0 (it is basically NULL), and based on index statistics on Col2 the server determines that a serial plan is best. However, in the sp_prepexec case @P0 has a value specified for the initial call, which influences the plan. For this data set and these indexes you are getting better performance with the serial plan, but there can be other cases where the parallel plan will perform better.
That being said, you can change your query to make sp_prepexec use a serial plan by appending OPTION (OPTIMIZE FOR UNKNOWN), which instructs the Query Optimizer to use statistical data instead of the initial values for all local variables when the query is compiled and optimized. Your query would become:
UPDATE T_DbPerfTest SET Col29 = x.Col29
FROM (
    SELECT Col1, LEAD(Col28) OVER (PARTITION BY Col5, Col2, Col4, Col3 ORDER BY Col28, Col1) AS Col29
    FROM T_DbPerfTest
    WHERE Col29 IS NULL AND (Col27 = 1 OR Col28 = ?) AND Col2 = ?
) x
WHERE T_DbPerfTest.Col1 = x.Col1 AND T_DbPerfTest.Col29 IS NULL AND x.Col29 = ? AND T_DbPerfTest.Col27 = 1
OPTION ( OPTIMIZE FOR UNKNOWN );
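Where the SQL text is assembled in code rather than hand-written, the hint can be appended programmatically. A minimal sketch; the class and helper names are made up for illustration, and it only handles single-statement queries that do not already carry an OPTION clause:

```java
public class OptimizeForUnknown {
    // Hypothetical helper: appends the query hint suggested above so the
    // optimizer plans from column statistics instead of the first
    // parameter values it sees. Strips a trailing semicolon first so the
    // OPTION clause lands inside the statement.
    static String withOptimizeForUnknown(String sql) {
        String trimmed = sql.trim();
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        return trimmed + " OPTION (OPTIMIZE FOR UNKNOWN);";
    }
}
```

The resulting string is then passed to Connection#prepareStatement as usual.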
Hi @ulvii, thanks a lot for the explanation. I suspected that SQL Server was picking a plan based on the values passed as part of the first prepexec call.
Now, the big question is: is it how it should behave from the driver's point of view?
If I were to code the SQL Server calls directly (e.g. in a stored procedure):
I don't think that a statement created using the Java API of the driver should be created optimized for the first values that were passed in. Yes, I understand that it could be faster under some conditions, but I also acknowledge that it is not possible to know what future values will be passed in. It is by nature an "Optimize for unknown" situation. Thus, I think it is wrong for the driver to use sp_prepexec internally, unless you can execute it with "optimize for unknown" by default for all the calls.
As for adding "OPTION (OPTIMIZE FOR UNKNOWN)" to all our statements, people using an ORM cannot always do so. This can help some users but I feel it is a workaround.
Please share your thoughts, and of course do correct me if I am wrong!
I have to agree with @Chrriis
Most users of the driver rely on ORM frameworks and can't use that workaround.
@Chrriis, the more I think about it, the more I agree with you. The semantics of JDBC prepared statements carry an expectation that prepare and execute run at different times. This is particularly true when using the JDBC batching capabilities.
I predict there will be fallout because, as you correctly point out, some things will get slower. However, from a semantic-purity point of view, this seems like the right way to go.
This might also lead into how statement caching works. The driver should, per connection, cache the results of sp_prepare. sp_prepexec should be reserved for methods like Statement#executeQuery.
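The per-connection caching idea could be sketched roughly as follows; the integer handle and all names here are hypothetical stand-ins for whatever sp_prepare actually returns:

```java
import java.util.HashMap;
import java.util.Map;

// Rough sketch, not the driver's implementation: the first time a SQL
// string is executed, the driver would call sp_prepare and remember the
// returned handle; subsequent executions reuse the handle via sp_execute.
public class PreparedHandleCache {
    private final Map<String, Integer> handlesBySql = new HashMap<>();
    private int nextHandle = 1; // stands in for the handle sp_prepare returns

    /** Returns the cached handle, "preparing" (here: simulating) on first use. */
    public int handleFor(String sql) {
        return handlesBySql.computeIfAbsent(sql, s -> nextHandle++);
    }

    public boolean isCached(String sql) {
        return handlesBySql.containsKey(sql);
    }
}
```

The cache would have to be invalidated when the connection closes, since prepared handles are scoped to the connection on the server side.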
Now, the big question is: is it how it should behave from the driver's point of view?
Yes, I understand that it could be faster under some conditions, but I also acknowledge that it is not possible to know what future values will be passed in.
I wanted to clarify the conditions first. It is not about the different parameters being passed in, it is about how many rows are queried for each parameter value. Please note that OPTION (OPTIMIZE FOR UNKNOWN) will perform better when the values queried in a table are not evenly distributed, for example when one of your parameters updates 10 rows and another one updates 100000 rows. In scenarios where the numbers of queried rows are similar, you will want to create execution plans based on the initial value. This is more of a SQL Server tuning scenario than a JDBC driver issue, and I don't agree that OPTION (OPTIMIZE FOR UNKNOWN) is usually the better option or the option the driver is supposed to use by default.
As for adding "OPTION (OPTIMIZE FOR UNKNOWN)" to all our statements, people using an ORM cannot always do so. This can help some users but I feel it is a workaround.
You can disable parameter sniffing at the database level and achieve similar performance results. Starting from SQL Server 2016 (13.x) SP1 you can use the ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = OFF; command; for older servers the command is DBCC TRACEON(4136,-1).
Please let me know what you think.
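The version-dependent choice between the two commands above can be wrapped in a small helper. A sketch, assuming the server's major version number is available (SQL Server 2016 reports major version 13); the class and method names are hypothetical, and the returned command would be run with a plain Statement:

```java
public class ParameterSniffingToggle {
    // Picks the server-side command from the suggestion above based on the
    // server's major version. The version-based dispatch is an assumption
    // for this sketch; verify edition/SP requirements before relying on it.
    static String disableSniffingCommand(int serverMajorVersion) {
        return serverMajorVersion >= 13
                ? "ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = OFF;"
                : "DBCC TRACEON(4136,-1);";
    }
}
```

The major version can be obtained at runtime via DatabaseMetaData#getDatabaseMajorVersion().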
OPTION (OPTIMIZE FOR UNKNOWN) will perform better when the values queried in a table are not evenly distributed.
Isn't evenly distributed data the exception rather than the norm? Real life is more about chaos than order.
If I create a social application, some users will have a couple of friends and posts, some will have thousands. If I create an online shop, some sellers/buyers will have a couple of items in their inventory/purchase history, some will have thousands. If I create a financial application, some instruments will have few data points (e.g. monthly data), some will have millions (traded every second).
Performance is critical for the application where this problem was identified. There are:
Parameter sniffing is important for plain statements, so this feature should not be deactivated. Prepared statements are used when working on multiple data sets which have no guarantee of having data distributed evenly.
My take is that the driver optimizing prepared statements based on the first values is an arbitrary decision, which seemed like a good idea but should be reconsidered. Such optimization should be in the hands of the users of the driver, using hints or properties of the statement. We should not have to tweak our statements to opt out, or at least we should be able to opt out using a global driver setting (hence my original suggestion of having modes: prepare+exec vs prepexec).
@ulvii is it really that hard to add an option to the driver and let the user switch the execution mode?
Hi everyone,
I discussed the issue with the team and we agree that providing an option to switch between sp_prepare and sp_prepexec has merit. We are backlogging the issue for now and will work on it when time permits. In the meantime, please make use of the suggestions above.
@ulvii Thanks a lot for the information. I think it is good news as it will allow more people to switch to your driver.
In the meantime, please make use of the suggestions above.
The suggestions are not practical for us, so we have no choice but to stick with jTDS until this is implemented...
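To my knowledge, a switch along these lines later shipped as a prepareMethod connection property (values prepexec and prepare) in mssql-jdbc 9.2.0 and later; treat the exact property name and version as something to verify against the driver documentation. A sketch of a connection URL using it, with placeholder host and database names:

```java
public class PrepareMethodUrlExample {
    // Assembles a connection URL using the prepareMethod property that,
    // reportedly, later driver releases added for exactly this switch
    // (sp_prepare vs. sp_prepexec). Host and database names are placeholders.
    static String urlWithSpPrepare(String host, String database) {
        return "jdbc:sqlserver://" + host
                + ";databaseName=" + database
                + ";prepareMethod=prepare";
    }
}
```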
I have a related performance issue:
I have a Java process that inserts two date fields (High_School_From and High_School_To) into a table in batches of 10,000 using prepared statements and the executeBatch() routine, which takes about 3 minutes to complete. I have used both Microsoft JDBC driver 8.2 and 7.4 and get the same poor performance.
Of the 10,000 records, 9,130 (91%) had null values for High_School_From and 5,893 (58%) had null values for High_School_To.
I ran the same process against an on-prem MS SQL Server database with both date columns and did not experience that performance issue: the executeBatch() completed in less than 5 seconds. I do expect the on-prem server to respond faster because of lower network latency, but the difference was far beyond that: 5 seconds versus 3 minutes for the batch of 10,000. @@VERSION on the on-prem database returns Microsoft SQL Server 2012 (SP1).
I tried the same process against a SQL Server 2016 instance hosted on one of our Azure servers, and it also took 3 minutes to complete the executeBatch().
I switched from the Microsoft JDBC driver to the open-source jTDS driver, and the executeBatch() completed in less than 30 seconds against all the databases mentioned: the Azure database, the on-prem SQL Server 2012, and the SQL Server 2016 running on the hosted Azure server.
I do not experience the performance issue on Azure with the MS JDBC driver when the dates are not null. My opinion is therefore that there is an issue with how the MS JDBC driver handles null dates in executeBatch().
I can't use the open-source jTDS driver in production. Do you have any suggestion on what could be causing the issue?
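One thing that may be worth checking, as a guess rather than a confirmed diagnosis: how the null dates are bound. An untyped null can change the parameter type definition between rows of the batch, which may cost extra server-side work; binding nulls with an explicit SQL type keeps the definition stable. A sketch, with hypothetical column indexes:

```java
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

public class NullDateBatchSketch {
    // Hedged sketch, not a confirmed fix: bind a possibly-null date with an
    // explicit SQL type so every row of the batch declares the parameter
    // identically, instead of passing an untyped null via setObject.
    static void bindDate(PreparedStatement ps, int index, Date value) throws SQLException {
        if (value == null) {
            ps.setNull(index, Types.DATE); // typed null
        } else {
            ps.setDate(index, value);
        }
    }
}
```

Comparing a run that uses setNull(index, Types.DATE) for the null dates against the current code would show whether this is a factor.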
Driver version
7.4.1.jre12
SQL Server version
Microsoft SQL Server 2017 (RTM-GDR) (KB4505224) - 14.0.2027.2 (X64) Jun 15 2019 00:26:19 Copyright (C) 2017 Microsoft Corporation Enterprise Edition (64-bit) on Windows 10 Enterprise 10.0 (Build 18362: )
Client Operating System
Windows 10 Pro (1909)
JAVA/JVM version
OpenJDK 12.0.2+10
Table schema
A simple table with a few columns and one index
Problem description
We would like to switch from JTDS to MS SQL JDBC driver. The problem is that during long data crunching, the MS SQL driver is much slower. Our data crunching, which includes various other things, takes 77 minutes with JTDS and 104 minutes with MS SQL.
Reproduction code
I narrowed down the test case to a relatively small backup (14MB, 170MB unzipped) and a Java class.
MSvsJTDS_bak.zip SqlServerJdbcDriverSlowdownTest.zip
We got these results:

Try | MS (ms) | jTDS (ms) |
---|---|---|
1st | 2879 | 929 |
2nd | 2885 | 888 |
3rd | 2798 | 993 |
4th | 3057 | 947 |

On a different client and host machine, we got 5681 ms (MS) vs. 1516 ms (jTDS).