dotnet / ef6

This is the codebase for Entity Framework 6 (previously maintained at https://entityframework.codeplex.com). Entity Framework Core is maintained at https://github.com/dotnet/efcore.
https://docs.microsoft.com/ef/ef6
MIT License
1.43k stars 545 forks source link

Creating a model from an existing database takes a few minutes with recent versions of SQL Server #4

Closed divega closed 6 years ago

divega commented 8 years ago

This was originally reported by @julielerman at https://github.com/Microsoft/sql-server-samples/issues/57 while trying the OLTP version of the new SQL Server 2016 sample databases, WorldWideImporters (https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0).

I have created this issue on our side to follow up with the SQL Server team.

Repro steps:

  1. Create new class library project.
  2. Add new item: Entity Framework Data Model.
  3. Select connection to World Wide Importers (not DW) on SQL Server 2016 instance.
  4. Select all tables.
  5. Go (which will install EF6 package and reverse engineer tables into classes).

Observed result: It takes about 3 minutes.

Expected result: It should take about 10 seconds max.

Cause: What we know so far is that this is a regression caused by differences in SQL Server 2016's cardinality estimator (CE). There have been similar issues in SQL Server 2014, which we tried to workaround by introducing OPTION (QUERYTRACEON 9481) on our reverse engineering queries but that didn't work for customers that didn't have enough permissions to use QUERYTRACEON.

Workaround: It should be possible to temporarily downgrade the cardinality estimator on a specific database to use previous behaviors:

  1. Execute ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION=ON on the database
  2. Perform reverse engineering using the EF Wizard
  3. Execute ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION=OFF on the database

Follow up actions: We are in contact with the SQL Server team regarding this issue and the follow up action is to work with them to see if the CE can be tweaked to eliminate the regression or if there is anything we can do on the EF side.

cc @jodebrui

ErikEJ commented 8 years ago

If this part is removed from the TableColumns query, the offending query runs as expected (and returns correct number of rows)

      INNER JOIN
      INFORMATION_SCHEMA.TABLES t ON
      c.TABLE_CATALOG = t.TABLE_CATALOG AND
      c.TABLE_SCHEMA = t.TABLE_SCHEMA   AND
      c.TABLE_NAME = t.TABLE_NAME
             AND
      t.TABLE_TYPE = 'BASE TABLE'
wimr commented 8 years ago

It would be really useful if the change would make sure that the performance is fixed for all suported SQL Server versions. To give an example, we have a model with 459 tables.

Loading metadata from the database took 00:00:00.8966611. Generating the model took 00:00:14.9819927.

  • using the 120 CE on SQL Server 2014, we get the following output:

Loading metadata from the database took 00:00:00.8354508. Generating the model took 00:09:28.9413070.

If there is anything I can do to test the possible result, you can always contact me

divega commented 8 years ago

Another workaround mentioned by the SQL Server folks looking into this issue, in case it helps anyone:

update statistics sys.syscolpars
update statistics sys.sysschobjs

It seems that in fact the perf issue can be caused by stale statistics on system tables. Still investigating why that happens.

ErikEJ commented 8 years ago

@divega Update statistics had no effect for me, on a brand new WWI as per the @julielerman repro

divega commented 8 years ago

@ErikEJ apparently updating statistics can help with the "full" version of the sample database but not the "standard" version. Still pending investigation by the SQL Server folks.

divega commented 8 years ago

For the "standard" sample database the following has been reported to work:

update statistics sys.syscolpars
update statistics sys.sysschobjs
update statistics sys.syssingleobjrefs
update statistics sys.sysiscols
ErikEJ commented 8 years ago

@divega Running all 4 update stats statements has no effect on my test Server (2016 CU1) - removing the lines mentioned abovce does....

jodebrui commented 8 years ago

I updated the sample DBs at the download location: https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0

On my test server reverse engineering performs well. Could you download the updated DB and try again?

divega commented 8 years ago

@ErikEJ another thing you may want to try before you download the new version of the sample database is to clear the query cache after updating statistics with dbcc freeproccache.

ErikEJ commented 8 years ago

I built my test database from scripts! - will try the backup files... (and freeproccache)

ErikEJ commented 8 years ago

Adding the actual slow query... EF6_Slow.sql.txt

ErikEJ commented 8 years ago

Tried freeproccache on my test db - no luck.

Restored WWI-Std database backup - ran if 4 secs! šŸ˜„

@jodebrui : What secret "updates" did you do to the sample database ?

jodebrui commented 8 years ago

I ran the following statements:

UPDATE STATISTICS sys.syscolpars UPDATE STATISTICS sys.sysschobjs UPDATE STATISTICS sys.syssingleobjrefs UPDATE STATISTICS sys.sysiscols GO

I incorporated them also in the source files: https://github.com/Microsoft/sql-server-samples/blob/master/samples/databases/wide-world-importers/wwi-database-scripts/4-wwi-configure-required-database-objects.sql

So if you recreate the DB from the latest source files, reverse engineering should perform well.

Thanks, Jos

From: Erik Ejlskov Jensen [mailto:notifications@github.com] Sent: Tuesday, August 16, 2016 4:44 AM To: aspnet/EntityFramework6 EntityFramework6@noreply.github.com Cc: Jos de Bruijn jodebrui@microsoft.com; Mention mention@noreply.github.com Subject: Re: [aspnet/EntityFramework6] Reverse egineering takes a few minutes on some databases using SQL Server 2016 (and probably previous versions) (#4)

Tried freeproccache on my test db - no luck.

Restored WWI-Std database backup - ran if 4 secs! šŸ˜„

@jodebruihttps://github.com/jodebrui : What secret "updates" did you do to the sample database ?

ā€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/aspnet/EntityFramework6/issues/4#issuecomment-240078458, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AMuA9km8-JVwOUkq9E6uQ7LtZwvi-Netks5qgaJogaJpZM4JYanI.

divega commented 8 years ago

I will close this issue as there doesn't seem to be anything actionable left on the EF side. However I suspect there maybe something to follow up on in the SQL Server side: why the statistics become stale on system tables? @jodebrui what do you think?

jodebrui commented 8 years ago

Makes sense to close this issue from the EF side. We are tracking the issue of stale stats in our internal bug database. If you would like public visibility into the progress of that item, you can file a Connect item against SQL Server.

ErikEJ commented 8 years ago

FWIW, I am still stuck with a "slow" (test) WWI database - stats update has no effect (only switching to LEGACY_CARDINALITY_ESTIMATION)

rjk commented 8 years ago

Changing the compatibility level to an earlier version is also reported to work, as noted at http://stackoverflow.com/questions/32700540/update-wizard-not-responding. That SO question is about the problem with SQL 2014 but the compatibility level fix worked for us on SQL 2016.

timabell commented 7 years ago

Any update from the SQL Server team on this? Do they have anywhere public to see progress?

Just tripped over this and lost half a day. Is there really no way EF can fix this without a change in SQL Server? I'm not always going to be able to get clients to update their SQL Server installs, and some projects don't work without full SQL Server.


I'm running:

Microsoft SQL Server 2016 (RTM) - 13.0.1601.5 (X64) Apr 29 2016 23:23:58 Copyright (c) Microsoft Corporation Developer Edition (64-bit) on Windows 10 Enterprise N 6.3 (Build 10586: ) (Hypervisor)

with

Entity Framework 6.1.3 Tools for Visual Studio 2015 Update 1

Updating to

Microsoft SQL Server 2016 (RTM-CU2) (KB3182270) - 13.0.2164.0 (X64) Sep 9 2016 20:13:26 Copyright (c) Microsoft Corporation Developer Edition (64-bit) on Windows 10 Enterprise N 6.3 (Build 10586: ) (Hypervisor)

did not fix the problem and there's currently nothing newer available for either.

This version of SQL Express that I was using before I installed full SQL Server works fine:

Microsoft SQL Server 2014 - 12.0.4213.0 (X64) Jun 9 2015 12:06:16 Copyright (c) Microsoft Corporation Express Edition (64-bit) on Windows NT 6.3 (Build 10586: ) (Hypervisor)

Normal "Entity Data Model" output (sql express)

Generated model file: MediumSizedModel.edmx.
Loading metadata from the database took 00:00:02.6912250.
Generating the model took 00:00:47.6834016.

Slow running under full SQL Server:

Generated model file: MediumSizedModel.edmx.
Loading metadata from the database took 00:00:03.0146046.
Generating the model took 01:05:28.3140290.

Version info from select @@version for SQL Server and "Programs and Features" for the EF tools.

Workaround

Thanks @ErikEJ for the suggestion on stackoverflow to use EntityFramework Reverse POCO Code First Generator instead, this could potentially be a great way of entirely avoiding the problem.

divega commented 7 years ago

@timabell Not sure if @jodebrui will have the time to provide an update here. Consider opening a Microsoft Connect issue as suggested in https://github.com/aspnet/EntityFramework6/issues/4#issuecomment-241534404.

timabell commented 7 years ago

https://connect.microsoft.com/SQLServer/feedback/details/3110039

not that I have any faith in connect achieving anything from past experience. (and omg what an awful website)

jodebrui commented 7 years ago

Thanks for filing the Connect item. That always helps us with visibility and prioritization of issues. Iā€™m following up with the team to get this addressed.

Jos

From: Tim Abell [mailto:notifications@github.com] Sent: Tuesday, November 1, 2016 12:00 PM To: aspnet/EntityFramework6 EntityFramework6@noreply.github.com Cc: Jos de Bruijn jodebrui@microsoft.com; Mention mention@noreply.github.com Subject: Re: [aspnet/EntityFramework6] Reverse egineering takes a few minutes on some databases using SQL Server 2016 (and probably previous versions) (#4)

https://connect.microsoft.com/SQLServer/feedback/details/3110039https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconnect.microsoft.com%2FSQLServer%2Ffeedback%2Fdetails%2F3110039&data=02%7C01%7Cjodebrui%40microsoft.com%7Cbfcd6fdb773046df713c08d4028948bd%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636136236032724623&sdata=MH2Jj32sJzbtgzXt4TdB2adLwzO2yh456B5MJ2CFiUQ%3D&reserved=0

not that I have any faith in connect achieving anything from past experience.

ā€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Faspnet%2FEntityFramework6%2Fissues%2F4%23issuecomment-257661130&data=02%7C01%7Cjodebrui%40microsoft.com%7Cbfcd6fdb773046df713c08d4028948bd%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636136236032724623&sdata=d6UE4ot7Fq2L39U5B7yqibFnGddjJrr0mNN8veGnWUw%3D&reserved=0, or mute the threadhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAMuA9jt92HuD9TR2tlmSYh3F4NepXwx4ks5q54wwgaJpZM4JYanI&data=02%7C01%7Cjodebrui%40microsoft.com%7Cbfcd6fdb773046df713c08d4028948bd%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636136236032724623&sdata=UdZd%2BOc7TrpUIrg5AMeTgoDR4csrmb0WxiaMRD0uz0U%3D&reserved=0.

ltippmann commented 7 years ago

I know this is a closed issue but I've recently run into it in a big way with a project I'm working on. Suddenly it has started taking hours to update my model. I believe I have found the foundation of a fix.

After profiling my server while updating the model, I found it's the query that pulls the schema info that is causing problem. It's running for about 5 hours in my case. I focused in on that query and the plan it generates. This query has a couple sub-queries that are duplicated verbatim. It's the join of these of these sub-queries that appears to be the at the heart of the issue with cardinality estimator. The sub-queries, when run by themselves, run very quickly.

What I did was to run the sub-queries up front and throw the results into two temp tables. The cardinality of these table variables is estimated exactly correctly every time (it's just the number of rows in the temp table) and the joins are lightening fast. The full query now runs practically instantly.

I don't know if there's a reason temp tables cannot be used in this context. From my cursory look through code it appears this query is being built dynamically, so obviously what I've come up with isn't a drop in replacement, but if there's not a good reason the query can't be done this way, it seems like this could be good solution.

Here is the rejiggered query I came up with:

DECLARE 
    @p0 nvarchar(4000) = N'TargetDatabase'
,   @p1 nvarchar(4000) = N'dbo' 
,   @p2 nvarchar(4000) = N'__RefactorLog'

-----------------------------------------------------------------------------------------
-- Table Columns
SELECT
    quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME) + quotename(c.COLUMN_NAME) Id
,   quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME) ParentId
,   c.COLUMN_NAME   Name
,   c.ORDINAL_POSITION Ordinal
,   CAST( CASE c.IS_NULLABLE WHEN 'YES' THEN 1 WHEN 'NO' THEN 0 ELSE 0 END as bit) IsNullable
,   CASE
        WHEN c.DATA_TYPE in ('varchar', 'nvarchar', 'varbinary') 
            AND c.CHARACTER_MAXIMUM_LENGTH = -1 
            THEN c.DATA_TYPE + '(max)'
        ELSE c.DATA_TYPE
    END as TypeName
,   c.CHARACTER_MAXIMUM_LENGTH MaxLength
,   CAST(c.NUMERIC_PRECISION as integer) Precision
,   CAST(c.DATETIME_PRECISION as integer) DateTimePrecision
,   CAST(c.NUMERIC_SCALE as integer) Scale
,   c.COLLATION_CATALOG CollationCatalog
,   c.COLLATION_SCHEMA CollationSchema
,   c.COLLATION_NAME CollationName
,   c.CHARACTER_SET_CATALOG CharacterSetCatalog
,   c.CHARACTER_SET_SCHEMA CharacterSetSchema
,   c.CHARACTER_SET_NAME CharacterSetName
,   CAST(0 as bit) as IsMultiSet
,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsIdentity' ) as bit) as IsIdentity
,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsComputed' ) | CASE WHEN c.DATA_TYPE = 'timestamp' THEN 1 ELSE 0 END as bit) as IsStoreGenerated
,   c.COLUMN_DEFAULT as [Default]
INTO #TableColumns
FROM
    INFORMATION_SCHEMA.COLUMNS  c
        INNER JOIN
    INFORMATION_SCHEMA.TABLES       t   ON  c.TABLE_CATALOG = t.TABLE_CATALOG 
                                        AND c.TABLE_SCHEMA = t.TABLE_SCHEMA
                                        AND c.TABLE_NAME = t.TABLE_NAME
                                        AND t.TABLE_TYPE = 'BASE TABLE'

-----------------------------------------------------------------------------------------
-- View Columns
SELECT
    quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME) + quotename(c.COLUMN_NAME) Id
,   quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME) ParentId
,   c.COLUMN_NAME Name
,   c.ORDINAL_POSITION Ordinal
,   CAST(CASE c.IS_NULLABLE WHEN 'YES' THEN 1 WHEN 'NO' THEN 0 ELSE 0 END as bit) IsNullable
,   CASE
        WHEN c.DATA_TYPE in ('varchar', 'nvarchar', 'varbinary') 
            AND c.CHARACTER_MAXIMUM_LENGTH = -1 
            THEN c.DATA_TYPE + '(max)'
        ELSE c.DATA_TYPE
    END AS TypeName
,   c.CHARACTER_MAXIMUM_LENGTH MaxLength
,   CAST(c.NUMERIC_PRECISION as integer) Precision
,   CAST(c.DATETIME_PRECISION as integer) as DateTimePrecision
,   CAST(c.NUMERIC_SCALE as integer) Scale
,   c.COLLATION_CATALOG CollationCatalog
,   c.COLLATION_SCHEMA CollationSchema
,   c.COLLATION_NAME CollationName
,   c.CHARACTER_SET_CATALOG CharacterSetCatalog
,   c.CHARACTER_SET_SCHEMA CharacterSetSchema
,   c.CHARACTER_SET_NAME CharacterSetName
,   CAST(0 as bit) as IsMultiSet
,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsIdentity' ) as bit) as IsIdentity
,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsComputed' ) | CASE WHEN c.DATA_TYPE = 'timestamp' THEN 1 ELSE 0 END as bit) as IsStoreGenerated
,   c.COLUMN_DEFAULT [Default]
INTO #ViewColumns
FROM
    INFORMATION_SCHEMA.COLUMNS      c
        INNER JOIN
    INFORMATION_SCHEMA.VIEWS        v   ON  c.TABLE_CATALOG = v.TABLE_CATALOG 
                                            AND c.TABLE_SCHEMA = v.TABLE_SCHEMA 
                                            AND c.TABLE_NAME = v.TABLE_NAME
WHERE
    NOT (
        v.TABLE_SCHEMA = 'dbo'
        AND v.TABLE_NAME in('syssegments', 'sysconstraints')
        AND SUBSTRING(CAST(SERVERPROPERTY('productversion') as varchar(20)),1,1) = 8)

-----------------------------------------------------------------------------------------
SELECT 
    TableAndViewColumns.Ordinal AS C1
,   Tables.CatalogName AS CatalogName
,   Tables.SchemaName AS SchemaName
,   Tables.Name AS Name
,   TableAndViewColumns.Name AS C2
,   TableAndViewColumns.IsNullable AS C3
,   TableAndViewColumns.TypeName AS C4
,   TableAndViewColumns.MaxLength AS C5
,   TableAndViewColumns.Precision AS C6
,   TableAndViewColumns.DateTimePrecision AS C7
,   TableAndViewColumns.Scale AS C8
,   TableAndViewColumns.IsIdentity AS C9
,   TableAndViewColumns.IsStoreGenerated AS C10
,   CASE WHEN (Project5.C2 IS NULL) THEN cast(0 as bit) ELSE Project5.C2 END AS C11
FROM   
    (   SELECT
            quotename(TABLE_SCHEMA) + quotename(TABLE_NAME) Id
        ,   TABLE_CATALOG CatalogName
        ,   TABLE_SCHEMA SchemaName
        ,   TABLE_NAME    Name
        FROM
            INFORMATION_SCHEMA.TABLES
        WHERE
            TABLE_TYPE = 'BASE TABLE'
    ) Tables
     INNER JOIN  
    (   SELECT 
            TableColumns.Id AS Id
        ,   TableColumns.Name AS Name
        ,   TableColumns.Ordinal AS Ordinal
        ,   TableColumns.IsNullable AS IsNullable
        ,   TableColumns.TypeName AS TypeName
        ,   TableColumns.MaxLength AS MaxLength
        ,   TableColumns.Precision AS Precision
        ,   TableColumns.DateTimePrecision AS DateTimePrecision
        ,   TableColumns.Scale AS Scale
        ,   TableColumns.IsIdentity AS IsIdentity
        ,   TableColumns.IsStoreGenerated AS IsStoreGenerated
        ,   0 AS C1
        ,   TableColumns.ParentId AS ParentId
        FROM 
            #TableColumns TableColumns
            UNION ALL
            SELECT 
                ViewColumns.Id AS Id
            ,   ViewColumns.Name AS Name
            ,   ViewColumns.Ordinal AS Ordinal
            ,   ViewColumns.IsNullable AS IsNullable
            ,   ViewColumns.TypeName AS TypeName
            ,   ViewColumns.MaxLength AS MaxLength
            ,   ViewColumns.Precision AS Precision
            ,   ViewColumns.DateTimePrecision AS DateTimePrecision
            ,   ViewColumns.Scale AS Scale
            ,   ViewColumns.IsIdentity AS IsIdentity
            ,   ViewColumns.IsStoreGenerated AS IsStoreGenerated
            ,   6 AS C1
            ,   ViewColumns.ParentId AS ParentId
            FROM 
                #ViewColumns AS ViewColumns
    ) TableAndViewColumns ON 0 = TableAndViewColumns.C1
                             AND Tables.Id = TableAndViewColumns.ParentId
     LEFT OUTER JOIN  
    (   SELECT 
            UnionAll2.Id AS C1
        ,   CAST(1 as bit) AS C2
        FROM 
            (
                SELECT
                    quotename(tc.CONSTRAINT_SCHEMA) + quotename(tc.CONSTRAINT_NAME) Id
                ,   quotename(tc.TABLE_SCHEMA) + quotename(tc.TABLE_NAME) ParentId
                ,   tc.CONSTRAINT_NAME Name
                ,   tc.CONSTRAINT_TYPE ConstraintType
                ,   CAST(CASE tc.IS_DEFERRABLE WHEN 'NO' THEN 0 ELSE 1 END as bit) IsDeferrable
                ,   CAST(CASE tc.INITIALLY_DEFERRED WHEN 'NO' THEN 0 ELSE 1 END as bit) IsInitiallyDeferred
                FROM
                    INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
                WHERE 
                    tc.TABLE_NAME IS NOT NULL
            ) KeyConstraints
             INNER JOIN  
            (
                SELECT 
                    7 AS C1
                ,   KeyConstraintColumns.ConstraintId AS ConstraintId
                ,   TableColumns.Id AS Id
                FROM  
                    (
                        SELECT
                            quotename(CONSTRAINT_SCHEMA) + quotename(CONSTRAINT_NAME) ConstraintId
                        ,   quotename(TABLE_SCHEMA) + quotename(TABLE_NAME) + quotename(COLUMN_NAME) ColumnId
                        FROM
                            INFORMATION_SCHEMA.KEY_COLUMN_USAGE
                    ) KeyConstraintColumns
                     INNER JOIN 
                    #TableColumns TableColumns ON TableColumns.Id = KeyConstraintColumns.ColumnId
                UNION ALL
                SELECT 
                    11 AS C1
                ,   Extent7.ConstraintId AS ConstraintId
                ,   ViewColumns2.Id AS Id
                FROM  
                    (   SELECT
                            CAST(NULL as nvarchar(1)) ConstraintId
                        ,   CAST(NULL as nvarchar(max)) ColumnId  
                        WHERE 1=2
                    ) AS Extent7
                     INNER JOIN 
                    #ViewColumns ViewColumns2 ON ViewColumns2.Id = Extent7.ColumnId
            ) UnionAll2 ON  (7 = UnionAll2.C1) 
                            AND (KeyConstraints.Id = UnionAll2.ConstraintId)
        WHERE KeyConstraints.ConstraintType = N'PRIMARY KEY' 
    ) Project5 ON TableAndViewColumns.Id = Project5.C1
WHERE  
    NOT (
        (Tables.CatalogName LIKE @p0) 
        AND (Tables.SchemaName LIKE @p1) 
        AND (Tables.Name LIKE @p2)
    )
MikeYeager commented 7 years ago

It appears the Connect issue for the SQL Server team has never been addressed. I up-voted it, but I'm not expecting much. I would suggest that the EF team re-open this issue since I believe they can now fix it (as of SQL Server 2016 SP1) with the new query hint OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION')).

ajcvickers commented 7 years ago

Re-opening so we can discuss this.

divega commented 7 years ago

@MikeYeager do you happen to be able to repro this with recent builds of SQL Server and share the repro so we can pass it to the query optimizer team?

FWIW, in the past we tried to address this with a QUERYTRACEON option that had similar effects. The advantage of that was that previous versions of SQL Server that did not understand the option passed just ignored it. The disadvantage was that it required specific privileges on the database (see http://entityframework.codeplex.com/workitem/2590 for more details). I

know that the new USE HINT approach does suffer from the privilege requirements, but I am not sure it is backwards compatible. Have you tested this?

MikeYeager commented 7 years ago

@divega Diego, I have many databases I can reproduce this on, however they all belong to clients. I'm sure I could create just about any database and encounter this problem. It seems to be an issue on every single one I have, but I have not spent any time creating a sample.

I have tried running this hint against SQL Server 2014 databases (compatibility level 120) and 2012 databases (compatibility level 110) from SSMS 2016. The queries ignore the hint and run. As far as I can tell, no specific privileges are required for this hint. At least I can't find any.

divega commented 7 years ago

Thanks @MikeYeager that sounds promising.

On the other hand, since you can currently repro the issue with several databases, would you be able to try the workarounds described here? https://support.microsoft.com/en-us/help/3189675/join-containment-assumption-in-the-new-cardinality-estimator-degrades-query-performance-in-sql-server-2014-and-later.

The SQL Server team believes they may have been able to narrow down the issue and are looking for confirmation that setting traceflag 9476 helps.

divega commented 7 years ago

I just tried appending OPTION (MERGE JOIN) on one of these queries and it seems to help.

@MikeYeager @tranceporter @ErikEJ @lukeatron could you confirm?

divega commented 7 years ago

These results show total elapsed seconds against WideWorldImporters with SQL Server 2016 SP1 on my box.

Option First row Last row (#589)
Original query 21 256
OPTION (MERGE JOIN) 0 0
OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION')) 0 0
OPTION (USE HINT('ASSUME_JOIN_PREDICATE_DEPENDS_ON_FILTERS')) 0 7

We should test with older versions of SQL Server to verify that it does not cause regressions, but OPTION(MERGE JOIN) is looking promising.

MikeYeager commented 7 years ago

Nice! That looks like a winner!

sjh37 commented 7 years ago

In the EF reverse poco generator I have a flag which users can set to true:

// If SqlServer 2014 appears frozen / take a long time when this file is saved,
// try setting this to true (you will also need elevated privileges).
IncludeQueryTraceOn9481Flag = false;

This appends the following SQL to the queries to obtain the tables/columns/views:

OPTION (QUERYTRACEON 9481)

You should also take a look at the SQL I have in this generator. It's been hand-crafted, peer reviewed, and optimised.

ErikEJ commented 7 years ago

@sjh37 FYI: Diego's suggestion is based on the community effort in the EF Reverse POCO project!

tranceporter commented 7 years ago

@divega Here is the thread where we discussed and tried a fair number of options including MERGE JOIN. The optimised query Simon pointed to above, was derived from the this discussion.

https://github.com/sjh37/EntityFramework-Reverse-POCO-Code-First-Generator/issues/262#issuecomment-298693275

divega commented 7 years ago

@tranceporter @sjh37 like @ErikEJ said, I have seen that thread (Erik pointed me to it in a tweet) and I noticed that you have handcrafted a query for the reverse POCO generator, which is great. What I am trying to solve here, is how to workaround the SQL Server performance regression with minimal changes to our current schema discovery queries. We actually attempted this long time ago with QUERYTRACEON 9481 but had to revert because QUERYTRACEON requires admin permissions on the database server. We are now evaluating doing it with the options above because they don't have the permissions requirements.

I also learned in your discussion that a simple switch to a MERGE JOIN helped, which is why I decided to try with OPTIONS (MERGE JOIN).

divega commented 7 years ago

Clearing up milestone to discuss taking this earlier.

RichardMooreClear commented 6 years ago

Using VS 2017 Ver 15.5.7, Microsoft SQL Server 2017 (RTM) - 14.0.1000.169 (X64) Aug 22 2017 17:04:49 Copyright (C) 2017 Microsoft Corporation Developer Edition (64-bit) on Windows 10 Pro 10.0 (Build 16299: ) , and EF 6.2.0 I had similar issues where it would just hangs and not create the model. Tried a couple if different methods and they all did the same thing. Change DB compatibility level to 110 and it worked Generated model file: STPDataModel.edmx. Loading metadata from the database took 00:00:00.7313588. Generating the model took 00:00:01.8303913. Added the connection string to the App.Config file. Writing the .edmx file took 00:00:00.0091903.

lajones commented 6 years ago

Notes for self:

  1. Use the WideWorldImporter_Full.bak file from the link.
  2. Can repro in SQL Server 2016 SP1
  3. The query that is sent to the database is:
    SELECT 
    [UnionAll1].[Ordinal] AS [C1], 
    [Extent1].[CatalogName] AS [CatalogName], 
    [Extent1].[SchemaName] AS [SchemaName], 
    [Extent1].[Name] AS [Name], 
    [UnionAll1].[Name] AS [C2], 
    [UnionAll1].[IsNullable] AS [C3], 
    [UnionAll1].[TypeName] AS [C4], 
    [UnionAll1].[MaxLength] AS [C5], 
    [UnionAll1].[Precision] AS [C6], 
    [UnionAll1].[DateTimePrecision] AS [C7], 
    [UnionAll1].[Scale] AS [C8], 
    [UnionAll1].[IsIdentity] AS [C9], 
    [UnionAll1].[IsStoreGenerated] AS [C10], 
    CASE WHEN ([Project5].[C2] IS NULL) THEN cast(0 as bit) ELSE [Project5].[C2] END AS [C11]
    FROM   (
        SELECT
        quotename(TABLE_SCHEMA) + quotename(TABLE_NAME) [Id]
        ,   TABLE_CATALOG [CatalogName]
        ,   TABLE_SCHEMA [SchemaName]
        ,   TABLE_NAME    [Name]
        FROM
        INFORMATION_SCHEMA.TABLES
        WHERE
        TABLE_TYPE = 'BASE TABLE'
      ) AS [Extent1]
    INNER JOIN  (SELECT 
        [Extent2].[Id] AS [Id], 
        [Extent2].[Name] AS [Name], 
        [Extent2].[Ordinal] AS [Ordinal], 
        [Extent2].[IsNullable] AS [IsNullable], 
        [Extent2].[TypeName] AS [TypeName], 
        [Extent2].[MaxLength] AS [MaxLength], 
        [Extent2].[Precision] AS [Precision], 
        [Extent2].[DateTimePrecision] AS [DateTimePrecision], 
        [Extent2].[Scale] AS [Scale], 
        [Extent2].[IsIdentity] AS [IsIdentity], 
        [Extent2].[IsStoreGenerated] AS [IsStoreGenerated], 
        0 AS [C1], 
        [Extent2].[ParentId] AS [ParentId]
        FROM (
          SELECT
          quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME) + quotename(c.COLUMN_NAME) [Id]
          ,   quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME)                             [ParentId]
          ,   c.COLUMN_NAME   [Name]
          ,   c.ORDINAL_POSITION [Ordinal]
          ,   CAST( CASE c.IS_NULLABLE WHEN 'YES' THEN 1 WHEN 'NO' THEN 0 ELSE 0 END as bit) [IsNullable]
          ,   CASE
          WHEN c.DATA_TYPE in ('varchar', 'nvarchar', 'varbinary') and
          c.CHARACTER_MAXIMUM_LENGTH = -1 THEN
          c.DATA_TYPE + '(max)'
          ELSE
          c.DATA_TYPE
          END
          as [TypeName]
          ,   c.CHARACTER_MAXIMUM_LENGTH [MaxLength]
          ,   CAST(c.NUMERIC_PRECISION as integer) [Precision]
          ,   CAST(c.DATETIME_PRECISION as integer)[DateTimePrecision]
          ,   CAST(c.NUMERIC_SCALE as integer) [Scale]
          ,   c.COLLATION_CATALOG [CollationCatalog]
          ,   c.COLLATION_SCHEMA [CollationSchema]
          ,   c.COLLATION_NAME [CollationName]
          ,   c.CHARACTER_SET_CATALOG [CharacterSetCatalog]
          ,   c.CHARACTER_SET_SCHEMA [CharacterSetSchema]
          ,   c.CHARACTER_SET_NAME [CharacterSetName]
          ,   CAST(0 as bit) as [IsMultiSet]
          ,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsIdentity' ) as bit) as [IsIdentity]
          ,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsComputed' ) | CASE WHEN c.DATA_TYPE = 'timestamp' THEN 1 ELSE 0 END as bit) as [IsStoreGenerated]
          , c.COLUMN_DEFAULT as [Default]
          FROM
          INFORMATION_SCHEMA.COLUMNS c
          INNER JOIN
          INFORMATION_SCHEMA.TABLES t ON
          c.TABLE_CATALOG = t.TABLE_CATALOG AND
          c.TABLE_SCHEMA = t.TABLE_SCHEMA   AND
          c.TABLE_NAME = t.TABLE_NAME       AND
          t.TABLE_TYPE = 'BASE TABLE'
      ) AS [Extent2]
    UNION ALL
        SELECT 
        [Extent3].[Id] AS [Id], 
        [Extent3].[Name] AS [Name], 
        [Extent3].[Ordinal] AS [Ordinal], 
        [Extent3].[IsNullable] AS [IsNullable], 
        [Extent3].[TypeName] AS [TypeName], 
        [Extent3].[MaxLength] AS [MaxLength], 
        [Extent3].[Precision] AS [Precision], 
        [Extent3].[DateTimePrecision] AS [DateTimePrecision], 
        [Extent3].[Scale] AS [Scale], 
        [Extent3].[IsIdentity] AS [IsIdentity], 
        [Extent3].[IsStoreGenerated] AS [IsStoreGenerated], 
        6 AS [C1], 
        [Extent3].[ParentId] AS [ParentId]
        FROM (
          SELECT
          quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME) + quotename(c.COLUMN_NAME) [Id]
          ,   quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME)                             [ParentId]
          ,   c.COLUMN_NAME   [Name]
          ,   c.ORDINAL_POSITION [Ordinal]
          ,   CAST( CASE c.IS_NULLABLE WHEN 'YES' THEN 1 WHEN 'NO' THEN 0 ELSE 0 END as bit) [IsNullable]
          ,   CASE
          WHEN c.DATA_TYPE in ('varchar', 'nvarchar', 'varbinary') and
          c.CHARACTER_MAXIMUM_LENGTH = -1 THEN
          c.DATA_TYPE + '(max)'
          ELSE
          c.DATA_TYPE
          END
          as [TypeName]
          ,   c.CHARACTER_MAXIMUM_LENGTH [MaxLength]
          ,   CAST(c.NUMERIC_PRECISION as integer) [Precision]
          ,   CAST(c.DATETIME_PRECISION as integer) as [DateTimePrecision]
          ,   CAST(c.NUMERIC_SCALE as integer) [Scale]
          ,   c.COLLATION_CATALOG [CollationCatalog]
          ,   c.COLLATION_SCHEMA [CollationSchema]
          ,   c.COLLATION_NAME [CollationName]
          ,   c.CHARACTER_SET_CATALOG [CharacterSetCatalog]
          ,   c.CHARACTER_SET_SCHEMA [CharacterSetSchema]
          ,   c.CHARACTER_SET_NAME [CharacterSetName]
          ,   CAST(0 as bit) as [IsMultiSet]
          ,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsIdentity' ) as bit) as [IsIdentity]
          ,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsComputed' ) | CASE WHEN c.DATA_TYPE = 'timestamp' THEN 1 ELSE 0 END as bit) as [IsStoreGenerated]
          ,   c.COLUMN_DEFAULT [Default]
          FROM
          INFORMATION_SCHEMA.COLUMNS c
          INNER JOIN
          INFORMATION_SCHEMA.VIEWS v ON
          c.TABLE_CATALOG = v.TABLE_CATALOG AND
          c.TABLE_SCHEMA = v.TABLE_SCHEMA AND
          c.TABLE_NAME = v.TABLE_NAME
          WHERE
          NOT (v.TABLE_SCHEMA = 'dbo'
          AND v.TABLE_NAME in('syssegments', 'sysconstraints')
          AND SUBSTRING(CAST(SERVERPROPERTY('productversion') as varchar(20)),1,1) = 8)
      ) AS [Extent3]) AS [UnionAll1] ON (0 = [UnionAll1].[C1]) AND ([Extent1].[Id] = [UnionAll1].[ParentId])
    LEFT OUTER JOIN  (SELECT 
        [UnionAll2].[Id] AS [C1], 
        cast(1 as bit) AS [C2]
        FROM  (
        SELECT
        quotename(tc.CONSTRAINT_SCHEMA) + quotename(tc.CONSTRAINT_NAME) [Id]
        , quotename(tc.TABLE_SCHEMA) + quotename(tc.TABLE_NAME) [ParentId]
        ,   tc.CONSTRAINT_NAME [Name]
        ,   tc.CONSTRAINT_TYPE [ConstraintType]
        ,   CAST(CASE tc.IS_DEFERRABLE WHEN 'NO' THEN 0 ELSE 1 END as bit) [IsDeferrable]
        ,   CAST(CASE tc.INITIALLY_DEFERRED WHEN 'NO' THEN 0 ELSE 1 END as bit) [IsInitiallyDeferred]
        FROM
        INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
        WHERE tc.TABLE_NAME IS NOT NULL
      ) AS [Extent4]
        INNER JOIN  (SELECT 
            7 AS [C1], 
            [Extent5].[ConstraintId] AS [ConstraintId], 
            [Extent6].[Id] AS [Id]
            FROM  (
        SELECT
        quotename(CONSTRAINT_SCHEMA) + quotename(CONSTRAINT_NAME) [ConstraintId]
        ,   quotename(TABLE_SCHEMA) + quotename(TABLE_NAME) + quotename(COLUMN_NAME) [ColumnId]
        FROM
        INFORMATION_SCHEMA.KEY_COLUMN_USAGE
      ) AS [Extent5]
            INNER JOIN (
          SELECT
          quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME) + quotename(c.COLUMN_NAME) [Id]
          ,   quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME)                             [ParentId]
          ,   c.COLUMN_NAME   [Name]
          ,   c.ORDINAL_POSITION [Ordinal]
          ,   CAST( CASE c.IS_NULLABLE WHEN 'YES' THEN 1 WHEN 'NO' THEN 0 ELSE 0 END as bit) [IsNullable]
          ,   CASE
          WHEN c.DATA_TYPE in ('varchar', 'nvarchar', 'varbinary') and
          c.CHARACTER_MAXIMUM_LENGTH = -1 THEN
          c.DATA_TYPE + '(max)'
          ELSE
          c.DATA_TYPE
          END
          as [TypeName]
          ,   c.CHARACTER_MAXIMUM_LENGTH [MaxLength]
          ,   CAST(c.NUMERIC_PRECISION as integer) [Precision]
          ,   CAST(c.DATETIME_PRECISION as integer)[DateTimePrecision]
          ,   CAST(c.NUMERIC_SCALE as integer) [Scale]
          ,   c.COLLATION_CATALOG [CollationCatalog]
          ,   c.COLLATION_SCHEMA [CollationSchema]
          ,   c.COLLATION_NAME [CollationName]
          ,   c.CHARACTER_SET_CATALOG [CharacterSetCatalog]
          ,   c.CHARACTER_SET_SCHEMA [CharacterSetSchema]
          ,   c.CHARACTER_SET_NAME [CharacterSetName]
          ,   CAST(0 as bit) as [IsMultiSet]
          ,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsIdentity' ) as bit) as [IsIdentity]
          ,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsComputed' ) | CASE WHEN c.DATA_TYPE = 'timestamp' THEN 1 ELSE 0 END as bit) as [IsStoreGenerated]
          , c.COLUMN_DEFAULT as [Default]
          FROM
          INFORMATION_SCHEMA.COLUMNS c
          INNER JOIN
          INFORMATION_SCHEMA.TABLES t ON
          c.TABLE_CATALOG = t.TABLE_CATALOG AND
          c.TABLE_SCHEMA = t.TABLE_SCHEMA   AND
          c.TABLE_NAME = t.TABLE_NAME       AND
          t.TABLE_TYPE = 'BASE TABLE'
      ) AS [Extent6] ON [Extent6].[Id] = [Extent5].[ColumnId]
        UNION ALL
            SELECT 
            11 AS [C1], 
            [Extent7].[ConstraintId] AS [ConstraintId], 
            [Extent8].[Id] AS [Id]
            FROM  (
        SELECT
        CAST(NULL as nvarchar(1))     [ConstraintId]
        , CAST(NULL as nvarchar(max)) [ColumnId]  
        WHERE 1=2
      ) AS [Extent7]
            INNER JOIN (
          SELECT
          quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME) + quotename(c.COLUMN_NAME) [Id]
          ,   quotename(c.TABLE_SCHEMA) + quotename(c.TABLE_NAME)                             [ParentId]
          ,   c.COLUMN_NAME   [Name]
          ,   c.ORDINAL_POSITION [Ordinal]
          ,   CAST( CASE c.IS_NULLABLE WHEN 'YES' THEN 1 WHEN 'NO' THEN 0 ELSE 0 END as bit) [IsNullable]
          ,   CASE
          WHEN c.DATA_TYPE in ('varchar', 'nvarchar', 'varbinary') and
          c.CHARACTER_MAXIMUM_LENGTH = -1 THEN
          c.DATA_TYPE + '(max)'
          ELSE
          c.DATA_TYPE
          END
          as [TypeName]
          ,   c.CHARACTER_MAXIMUM_LENGTH [MaxLength]
          ,   CAST(c.NUMERIC_PRECISION as integer) [Precision]
          ,   CAST(c.DATETIME_PRECISION as integer) as [DateTimePrecision]
          ,   CAST(c.NUMERIC_SCALE as integer) [Scale]
          ,   c.COLLATION_CATALOG [CollationCatalog]
          ,   c.COLLATION_SCHEMA [CollationSchema]
          ,   c.COLLATION_NAME [CollationName]
          ,   c.CHARACTER_SET_CATALOG [CharacterSetCatalog]
          ,   c.CHARACTER_SET_SCHEMA [CharacterSetSchema]
          ,   c.CHARACTER_SET_NAME [CharacterSetName]
          ,   CAST(0 as bit) as [IsMultiSet]
          ,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsIdentity' ) as bit) as [IsIdentity]
          ,   CAST(columnproperty( object_id(quotename(c.TABLE_SCHEMA) + '.' + quotename(c.TABLE_NAME)), c.COLUMN_NAME, 'IsComputed' ) | CASE WHEN c.DATA_TYPE = 'timestamp' THEN 1 ELSE 0 END as bit) as [IsStoreGenerated]
          ,   c.COLUMN_DEFAULT [Default]
          FROM
          INFORMATION_SCHEMA.COLUMNS c
          INNER JOIN
          INFORMATION_SCHEMA.VIEWS v ON
          c.TABLE_CATALOG = v.TABLE_CATALOG AND
          c.TABLE_SCHEMA = v.TABLE_SCHEMA AND
          c.TABLE_NAME = v.TABLE_NAME
          WHERE
          NOT (v.TABLE_SCHEMA = 'dbo'
          AND v.TABLE_NAME in('syssegments', 'sysconstraints')
          AND SUBSTRING(CAST(SERVERPROPERTY('productversion') as varchar(20)),1,1) = 8)
      ) AS [Extent8] ON [Extent8].[Id] = [Extent7].[ColumnId]) AS [UnionAll2] ON (7 = [UnionAll2].[C1]) AND ([Extent4].[Id] = [UnionAll2].[ConstraintId])
        WHERE [Extent4].[ConstraintType] = N'PRIMARY KEY' ) AS [Project5] ON [UnionAll1].[Id] = [Project5].[C1]
    WHERE [Extent1].[Name] LIKE N'%'
  4. This originates from EntityStoreSchemaGeneratorDatabaseSchemaLoader.LoadTableDetails() and .LoadViewDetails(), specifically see the definitions of TableDetailsSql and ViewDetailsSql (which are ESQL).
lajones commented 6 years ago

Added DbInterceptor which adds OPTION(MERGE JOIN) to metadata queries sent to a SQL Server database.

Note: this was added as a quirk i.e. you can this turn this addition off by adding the following to the devenv.exe.config file for the instance of VS you are using:

<appSettings>
  <add key="Switch.Microsoft.Data.Entity.Design.DoNotUseSqlServerMetadataMergeJoins" value="true"/>
</appSettings>

See #503 for details.

mkamoski commented 4 years ago
  1. Execute ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION=ON on the database
  2. Perform reverse engineering using the EF Wizard
  3. Execute ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION=OFF on the database

Setting LEGACY_CARDINALITY_ESTIMATION like suggested is NOT working for me.

mkamoski commented 4 years ago

FYI, "Change DB compatibility level to 110" did NOT work for me. Sql Server 2016 and Visual Studio 2019.