Open svengeance opened 4 years ago
Please share the modified query (in a PR for example)
Have you tried updating statistics ?
but I understand we're not alone here.
How so?
This is nice - I was going to make an issue on the EF Core Power Tools repo, but I figured I'd track the issue all the way down. Good to see you!
Have you tried updating statistics ?
I did try updating statistics all on sys.* tables, unfortunately to no avail.
but I understand we're not alone here.
How so?
That was an obtuse way of saying "Despite having a large database we're probably not the only ones who do, and others may experience this issue" - my bad.
Please share the modified query (in a PR for example)
I will have this up shortly.
Are you 100% sure you did the stats update: https://github.com/dotnet/ef6/issues/4#issuecomment-239590161 - and you may have to flush the plan cache after doing this!
You could also try this: https://github.com/dotnet/ef6/issues/4#issuecomment-581897617
Reran index updates on sys.* tables from the first comment, cleared the cache as advised, toggled on/off the cardinality estimation from the second comment - all said and done I was able to reduce the index query down to roughly 40 seconds from 120 seconds. Likewise, the number of rows read from bottleneck portion of the execution plan dropped from 125million to about 50 million, indicating a linear scaling at that bottleneck.
As I said in the PR I have had success generating a table/index script from MSSQL's Database->Tasks->Generate Script..
feature, loading it into a new database (on the same server, mind), and scaffolding from there, where the query runs instantly. I can only believe it has to be a combination of data/indexes at this point.
Thanks for the additional informaton, sadly 40 seconds is still above the default of 30 seconds.
Fixing / conributing to this will benefit a much wider auience: https://github.com/dotnet/SqlClient/issues/614
I would be delighted, would just have to get acquainted with the codebase, whereas modifying the scaffolder I feel like I have an overall grasp on the pattern.
Agree, the issue is just that you would have to sprinkle the codebase with:
command.CommandTimeout = customTimeout;
It may be simpler if my understanding is correct. There are three operations types a command can do - Migrations, DbContext, and Database. The only operation in DatabaseOperations is Scaffold, and it is the only Operations type that does not give the user any chance to set a custom timeout (IDesignTimeDbContextFactory is injected for the other 2 for context resolution).
Of course I'd rather fix this kind of thing at the source. I'm really surprised timeouts in the connection string aren't supported already to be honest
Multiple SqlCommands are created in the scaffolder, and the timeout must be set for each. Are you confusing Connection timeout with Command timeout?
Ah sure - no I wasn't, I just wanted to make sure we were both thinking of the same thing with respect to setting the timeout on multiple commands. I thought you were referring to multiple commands as in { migrate, scaffold, update, add-migration }. Then yes definitely, at the end of the day inside of the DatabaseModelFactory I need to either aggregate command resolution to one method and set the timeout there, or set the timeout in multiple places.
If you are using EF Core Power Tools, there is a couple of things you can try:
Filter Schemas before getting the full table list.
Try the legacy build, available from the Release Notes page. This version uses a different approach to getting the table list - I consider reverting to that for SQL Server.
I've gotten your nightly build and have used schema filters successfully on small schemas (<50 tables), however dbo was (and remains) an insurmountable beast.
I will try an older build and see how that goes.
For my direct issue there's a handful of solutions - create a fresh schema-only database that's not a dacpac restore, resolve my PR & wait for it's deploy, add configuration into Power Tools to inject my own Model Factory, add command timeouts to SQL Connection Strings, add command timeouts into dotnet-ef
. I know you're taking point on SQL connection string timeouts - is there any desire for help on the other ones, or should I sit tight?
I'm running into the same issue with latest version of EF core power tools and a fairly large SQL Server database with hundreds of tables. Seems to timeout on the call to GetIndexes.
Tried version 2.4.51 from the notes section and it worked fine.
@LynxWall If you have the time, would you be able to confirm that the PR I have open would fix this? The following link contains the current query used to retrieve indexes, as well as the proposed query. The two should have identical results, but hopefully the second has a better execution time. Would you be able to run them and report back?
If neither of the above queries are the source of your issue, I have a snapshot of ALL queries executed in database scaffolding at the following link. Again, time willing, would you mind tracking down which is particularly slow?
https://hatebin.com/iuqgoceqmu
I have not been able to replicate this artificially, so it would be nice to see someone else for whom the PR would solve an issue.
Hi @svengeance I ran both queries against our database and there was a significant difference in performance.
Old:
New:
As you can see from the screenshots it went from 1:34 to 0:02. Both results returned 7,182 rows and with quick scan they looked identical. I also ran both queries back-to-back and there was no difference in performance between the two. Old query stayed consistent at 1:34 while the new query was consistent at 0:01 when executed multiple times.
Thanks... hopefully this PR gets approved and incorporated!
@LynxWall did you try the stats update?
I have bitten the bullet, and have added a fix for this to the latest daily build of EF Core Power Tools, see https://github.com/ErikEJ/EFCorePowerTools/pull/495
@ErikEJ I did not try the stats update... was working with a development database that's refreshed weekly from production and shared by a lot of developers.
Will pull the latest daily build and test... Thanks!
Tried the latest daily build and am still getting a timeout, although it is different this time.
System.Exception: Table list error:
Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action
1 wrapCloseInAction)
at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at Microsoft.Data.SqlClient.SqlDataReader.TrySetMetaData(_SqlMetaDataSet metaData, Boolean moreInfo)
at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at Microsoft.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
at Microsoft.Data.SqlClient.SqlDataReader.get_MetaData()
at Microsoft.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted)
at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean isAsync, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String method) at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method) at Microsoft.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior) at Microsoft.Data.SqlClient.SqlCommand.ExecuteDbDataReader(CommandBehavior behavior) at Microsoft.EntityFrameworkCore.SqlServer.Scaffolding.Internal.SqlServerDatabaseModelFactory.GetIndexes(DbConnection connection, IReadOnlyList
1 tables, String tableFilter)
at Microsoft.EntityFrameworkCore.SqlServer.Scaffolding.Internal.SqlServerDatabaseModelFactory.GetTables(DbConnection connection, Func3 tableFilter, IReadOnlyDictionary
2 typeAliases)
at Microsoft.EntityFrameworkCore.SqlServer.Scaffolding.Internal.SqlServerDatabaseModelFactory.Create(DbConnection connection, DatabaseModelFactoryOptions options)
at Microsoft.EntityFrameworkCore.SqlServer.Scaffolding.Internal.SqlServerDatabaseModelFactory.Create(String connectionString, DatabaseModelFactoryOptions options)
at ReverseEngineer20.TableListBuilder.GetTableDefinitions() in C:\Code\EFCorePowerTools\src\GUI\RevEng.Core\TableListBuilder.cs:line 39
at efreveng.Program.Main(String[] args) in C:\Code\EFCorePowerTools\src\GUI\efreveng\Program.cs:line 33
ClientConnectionId:c0aa0b09-bbc7-4cbc-89e1-c2318de91d0a
Error Number:-2,State:0,Class:11
at EFCorePowerTools.Handlers.ReverseEngineer.ResultDeserializer.BuildTableResult(String output) in C:\projects\efcorepowertools\src\GUI\EFCorePowerTools\Handlers\ReverseEngineer\ResultDeserializer.cs:line 47
at ReverseEngineer20.ReverseEngineer.EfRevEngLauncher.GetTablesInternal(String arguments) in C:\projects\efcorepowertools\src\GUI\EFCorePowerTools\Handlers\ReverseEngineer\EfRevEngLauncher.cs:line 64
at ReverseEngineer20.ReverseEngineer.EfRevEngLauncher.GetTables(String connectionString, DatabaseType databaseType, SchemaInfo[] schemas) in C:\projects\efcorepowertools\src\GUI\EFCorePowerTools\Handlers\ReverseEngineer\EfRevEngLauncher.cs:line 40
at EFCorePowerTools.Handlers.ReverseEngineer.TableListBuilder.GetTableDefinitions(Boolean useEFCore5) in C:\projects\efcorepowertools\src\GUI\EFCorePowerTools\Handlers\ReverseEngineer\TableListBuilder.cs:line 39
at EFCorePowerTools.Handlers.ReverseEngineerHandler.GetTables(String connectionString, DatabaseType databaseType, Boolean useEFCore5, SchemaInfo[] schemas) in C:\projects\efcorepowertools\src\GUI\EFCorePowerTools\Handlers\ReverseEngineerHandler.cs:line 351
at EFCorePowerTools.Handlers.ReverseEngineerHandler.
@LynxWall Are you using EF Core 5 (I have only implemented this for EF Core3) ?
@LynxWall "although it is different this time." - how is it different, and from what?
GetTables is timing out rather than index.
I am using EF Core 3... it looked like it was timing out on a different call but will run the older version to verify.
Compared error from version 2.4.188 with 2.4.190 and they're the same. The difference was in formatting. The output from the 190 release wasn't breaking on all of the at... entries. The last call in Scaffolding is GetIndexes()
It does not look like you are using the new code - I will investigate.
You should be using SqlServerDatabaseFasterModelFactory !
Was wondering if that was the case... I had pulled latest daily build that looked like it was updated after your merge. Version was 2.4.190
^ Confirming with Lynx that 2.4.190 has a stacktrace emanating from SqlServerDatabaseModelFactory.
Microsoft.EntityFrameworkCore.SqlServer.Scaffolding.Internal.SqlServerDatabaseModelFactory.GetIndexes(DbConnection connection, IReadOnlyList`1 tables, String tableFilter)
Newer release available, would be grateful if you could give it a try! (build 192)
Just tried ... verified that I had 2.4.192 installed and am still seeing the SqqlServerDatabaseModelFactory call:
at Microsoft.EntityFrameworkCore.SqlServer.Scaffolding.Internal.SqlServerDatabaseModelFactory.GetIndexes(DbConnection connection, IReadOnlyList`1 tables, String tableFilter)
Let's continue our discussion here: https://github.com/ErikEJ/EFCorePowerTools/issues/498
@smitpatel It is actually GetIndexes that time out...
I have a better idea, why not just have timeout be 0 so then it works for all cases (the default for sql server)?
Reverse-engineering a sufficiently large database utilizes an inefficient query to retrieve indexes which causes the reveng to fail. The query is emitted here
Please note that I have a resolution ready to PR if this issue is deemed valid and reasonable.
Steps to reproduce
- Obtain a sufficiently large database (note: not the easiest) - Attempt to scaffold the context - ??? - Timeout
Further technical details
EF Core version: 3.1.7 Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: Net Core 3.1 Operating system: Win10-x64 IDE: VS 2019 16.7.1
The good stuff
Straight up, our database is a monolith, but I understand we're not alone here. Here is a snapshot of where we stand -
The query to get Indexes, in all its glory, takes over a minute. Anything over 30 seconds is unacceptable given that's the error condition.
By locating the SQL ran by
SqlServerDatabaseModelFactory.GetIndexes()
through MS SQL Profiler and running it myself, the following time is taken:Inspecting the execution plan leads to:
And the top-right step of the plan reveals
Over 120 million records read for the tight loop here.
Resolution
In my mind I see three resolution paths.
Easiest
Extend the timeout of the DbCommand(s) created such that scaffolding is given a reasonable amount of time to complete (how long is reasonable?)
Easy
Refactor the query. I have already done so, with the following result, having compared the before/after result for equality.
Medium
Refactor Microsoft.EntityFrameworkCore.Design.Internal.DatabaseOperations to use DbContextOperations to retrieve the IDesignTimeDbContextFactory from the user.* This allows the user to set arbitrary timeouts by utilizing the DbConnection created by their context, and allow users to specify startup assemblies instead of a connection string.
Note, I have tried to grock through this code within a reasonable amount of time, but to the best of my understanding it does not already take into account the user's design-time factory (source), while MigrationsOperations does here.
If you guys feel like this is a valid enough issue, I can submit a PR that modifies the query issued by GetIndexes(), which should speed up this process immensely for larger databases while leaving smaller databases unaffected.