microsoft / sqlmanagementobjects

Sql Management Objects, an API for scripting and managing SQL Server and Azure SQL Database
Other
130 stars 21 forks source link

Scripting will execute slow SQL query despite using PrefetchObjects #165

Open insikt-jonas opened 4 months ago

insikt-jonas commented 4 months ago

We are using this package to script our database.

Upgrading to the most recent release (171.30.0), from a really old version, we have experienced slower performance when scripting. This is due to PrefetchObjects() not seeming to fetch all that is needed. Or alternatively, that the scripter doesn't use the prefetched data.

For every Table that is scripted, the following slow query is executed to fetch indices, primary keys, etc:

exec sp_executesql N'SELECT
i.name AS [Name],
CAST(ISNULL(si.bounding_box_xmax,0) AS float(53)) AS [BoundingBoxXMax],
CAST(ISNULL(si.bounding_box_xmin,0) AS float(53)) AS [BoundingBoxXMin],
CAST(ISNULL(si.bounding_box_ymax,0) AS float(53)) AS [BoundingBoxYMax],
CAST(ISNULL(si.bounding_box_ymin,0) AS float(53)) AS [BoundingBoxYMin],
CAST(case when (i.type=7) then hi.bucket_count else 0 end AS int) AS [BucketCount],
CAST(ISNULL(si.cells_per_object,0) AS int) AS [CellsPerObject],
CAST(i.compression_delay AS int) AS [CompressionDelay],
~i.allow_page_locks AS [DisallowPageLocks],
~i.allow_row_locks AS [DisallowRowLocks],

        CASE WHEN ((SELECT tbli.is_memory_optimized FROM sys.tables tbli WHERE tbli.object_id = i.object_id)=1 or
        (SELECT tti.is_memory_optimized FROM sys.table_types tti WHERE tti.type_table_object_id = i.object_id)=1)
        THEN ISNULL((SELECT ds.name FROM sys.data_spaces AS ds WHERE ds.type=''FX''), N'''')
        ELSE CASE WHEN ''FG''=dsi.type THEN dsi.name ELSE N'''' END
        END
       AS [FileGroup],
CASE WHEN ''FD''=dstbl.type THEN dstbl.name ELSE N'''' END AS [FileStreamFileGroup],
CASE WHEN ''PS''=dstbl.type THEN dstbl.name ELSE N'''' END AS [FileStreamPartitionScheme],
i.fill_factor AS [FillFactor],
ISNULL(i.filter_definition, N'''') AS [FilterDefinition],
i.ignore_dup_key AS [IgnoreDuplicateKeys],

        ISNULL(indexedpaths.name, N'''')
       AS [IndexedXmlPathName],
i.is_primary_key + 2*i.is_unique_constraint AS [IndexKeyType],
CAST(
          CASE i.type WHEN 1 THEN 0 WHEN 4 THEN 4
                      WHEN 3 THEN CASE xi.xml_index_type WHEN 0 THEN 2 WHEN 1 THEN 3 WHEN 2 THEN 7 WHEN 3 THEN 8 END
                      WHEN 4 THEN 4 WHEN 6 THEN 5 WHEN 7 THEN 6 WHEN 5 THEN 9 ELSE 1 END
        AS tinyint) AS [IndexType],
CAST(CASE i.index_id WHEN 1 THEN 1 ELSE 0 END AS bit) AS [IsClustered],
i.is_disabled AS [IsDisabled],
CAST(CASE WHEN filetableobj.object_id IS NULL THEN 0 ELSE 1 END AS bit) AS [IsFileTableDefined],
CAST(ISNULL(k.is_system_named, 0) AS bit) AS [IsSystemNamed],
CAST(OBJECTPROPERTY(i.object_id,N''IsMSShipped'') AS bit) AS [IsSystemObject],
i.is_unique AS [IsUnique],
CAST(ISNULL(si.level_1_grid,0) AS smallint) AS [Level1Grid],
CAST(ISNULL(si.level_2_grid,0) AS smallint) AS [Level2Grid],
CAST(ISNULL(si.level_3_grid,0) AS smallint) AS [Level3Grid],
CAST(ISNULL(si.level_4_grid,0) AS smallint) AS [Level4Grid],
ISNULL(s.no_recompute,0) AS [NoAutomaticRecomputation],
CAST(ISNULL(INDEXPROPERTY(i.object_id, i.name, N''IsPadIndex''), 0) AS bit) AS [PadIndex],
ISNULL(xi2.name, N'''') AS [ParentXmlIndex],
CASE WHEN ''PS''=dsi.type THEN dsi.name ELSE N'''' END AS [PartitionScheme],
case UPPER(ISNULL(xi.secondary_type,'''')) when ''P'' then 1 when ''V'' then 2 when ''R'' then 3 else 0 end AS [SecondaryXmlIndexType],
CAST(ISNULL(spi.spatial_index_type,0) AS tinyint) AS [SpatialIndexType],
CAST(ISNULL(INDEXPROPERTY(i.object_id, i.name, N''IsOptimizedForSequentialKey''), 0) AS bit) AS [IsOptimizedForSequentialKey],
CAST(
        case when ((SELECT  MAX(case when xml_compression = 1 then 1 else 0 end) FROM sys.partitions  WHERE object_id = (CASE WHEN i.type = 4 THEN allobj.object_id ELSE i.object_id END) AND index_id = (CASE WHEN i.type = 4 THEN 1 ELSE i.index_id END)) > 0) then 1 else 0 end
       AS bit) AS [HasXmlCompressedPartitions]
FROM
sys.tables AS tbl
INNER JOIN sys.indexes AS i ON (i.index_id > @_msparam_0 and i.is_hypothetical = @_msparam_1) AND (i.object_id=tbl.object_id)
LEFT OUTER JOIN sys.spatial_index_tessellations as si ON i.object_id = si.object_id and i.index_id = si.index_id
LEFT OUTER JOIN sys.hash_indexes AS hi ON i.object_id = hi.object_id AND i.index_id = hi.index_id
LEFT OUTER JOIN sys.data_spaces AS dsi ON dsi.data_space_id = i.data_space_id
LEFT OUTER JOIN sys.tables AS t ON t.object_id = i.object_id
LEFT OUTER JOIN sys.data_spaces AS dstbl ON dstbl.data_space_id = t.Filestream_data_space_id and (i.index_id < 2 or (i.type = 7 and i.index_id < 3))
LEFT OUTER JOIN sys.xml_indexes AS xi ON xi.object_id = i.object_id AND xi.index_id = i.index_id
LEFT OUTER JOIN sys.selective_xml_index_paths AS indexedpaths ON xi.object_id = indexedpaths.object_id AND xi.using_xml_index_id = indexedpaths.index_id AND xi.path_id = indexedpaths.path_id
LEFT OUTER JOIN sys.filetable_system_defined_objects AS filetableobj ON i.object_id = filetableobj.object_id
LEFT OUTER JOIN sys.key_constraints AS k ON k.parent_object_id = i.object_id AND k.unique_index_id = i.index_id
LEFT OUTER JOIN sys.stats AS s ON s.stats_id = i.index_id AND s.object_id = i.object_id
LEFT OUTER JOIN sys.xml_indexes AS xi2 ON xi2.object_id = xi.object_id AND xi2.index_id = xi.using_xml_index_id
LEFT OUTER JOIN sys.spatial_indexes AS spi ON i.object_id = spi.object_id and i.index_id = spi.index_id
LEFT OUTER JOIN sys.all_objects AS allobj ON 
        allobj.name = ''extended_index_'' + cast(i.object_id AS varchar) + ''_'' + cast(i.index_id AS varchar) AND allobj.type=''IT''

WHERE
(tbl.name=@_msparam_2 and SCHEMA_NAME(tbl.schema_id)=@_msparam_3)
ORDER BY
[Name] ASC',N'@_msparam_0 nvarchar(4000),@_msparam_1 nvarchar(4000),@_msparam_2 nvarchar(4000),@_msparam_3 nvarchar(4000)',@_msparam_0=N'0',@_msparam_1=N'0',@_msparam_2=N'MyTable',@_msparam_3=N'dbo'

Below is a minimal code to reproduce this:

using Microsoft.SqlServer.Management.Common;
using Microsoft.SqlServer.Management.Smo;

var connection = new ServerConnection("MyServer", "MyUser", "MyPassword");

var smo_server = new Server(connection);
smo_server.SetDefaultInitFields(true); // Set all init fields!

var smo_database = smo_server.Databases["MyDatabase"];

var options = new ScriptingOptions();

// PrefreshObjects! I have tried all these below (individually)...
smo_database.PrefetchObjects();
smo_database.PrefetchObjects(typeof(Table));
smo_database.PrefetchObjects(typeof(Table), options);

// Ignore this row! The first script that is run seems to have some overhead. That is okay. 
var pre_script = smo_database.Tables["MyFirstTable", "dbo"].Script();

// This scriping will be fast since everything is preloaded.
var fast_script = smo_database.Tables["MySecondTable", "dbo"].Script();

// Here is the problem! It will cause the slow SQL query that reads indices, primary key, etc.
var slow_script = smo_database.Tables["MySecondTable", "dbo"].Script(options);

I would believe that, at least, the smo_database.PrefetchObjects(typeof(Table), options); row would cause SMO to load everything it needs since I use the same options when scripting.

Or am I missing something? =)

Thank you!

shueybubbles commented 4 months ago

What version of SQL Server are you scripting, and what was the old version of SMO you used?

Are the script contents of slow_script and fast_script identical? It might be that over time the default ScriptingOptions used by the unparameterized Script() call has drifted from what happens with a default ScriptingOptions object as parameter, and now it's not including everything.

Usually, the optimal performance comes from using Scripter or ScriptMaker at the top level instead of calling Script on individual objects and by NOT calling SetDefaultInitFields or PrefetchObjects yourself. Scripter manages initialization fairly efficiently on its own.

There should be a clue as to what's triggering the extra queries if you run the app in a debugger - you'll some traces that log the name of the missing property. It could inform us as to what PrefetchObjects hasn't been updated to handle as new features have been added to the library. You can add your own event handler to SqlSmoObject.PropertyMissing too.

insikt-jonas commented 4 months ago

What version of SQL Server are you scripting, and what was the old version of SMO you used?

Are the script contents of slow_script and fast_script identical? It might be that over time the default ScriptingOptions used by the unparameterized Script() call has drifted from what happens with a default ScriptingOptions object as parameter, and now it's not including everything.

Usually, the optimal performance comes from using Scripter or ScriptMaker at the top level instead of calling Script on individual objects and by NOT calling SetDefaultInitFields or PrefetchObjects yourself. Scripter manages initialization fairly efficiently on its own.

There should be a clue as to what's triggering the extra queries if you run the app in a debugger - you'll some traces that log the name of the missing property. It could inform us as to what PrefetchObjects hasn't been updated to handle as new features have been added to the library. You can add your own event handler to SqlSmoObject.PropertyMissing too.

First of all, thank you for the answer. =)

I am using SQL Server 2022.

The resulting scripts (slow_script and fast_script) are identical.

I get the same behaviour when providing an UrlCollection containing all my tables to the Scripter and removing PrefetchObjects and SetDefaultInitFields. This is an excerpt of the output rows from the debugger that (i think) are causing the slow SQL being executed.

...
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.0893202+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='DataSources' and @Schema='dbo']/Index
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.1091571+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='DataSources' and @Schema='dbo']/Index
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.1160375+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='DataSourcePerspective' and @Schema='dbo']/Index
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.1375757+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='DataSourcePerspective' and @Schema='dbo']/Index
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.1430884+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='DomainProtocolType' and @Schema='dbo']/Index
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.1657333+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='DomainProtocolType' and @Schema='dbo']/Index
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.1708848+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='Domain' and @Schema='dbo']/Index
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.1905645+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='Domain' and @Schema='dbo']/Index
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.1956483+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='Language' and @Schema='dbo']/Index
ScriptDB.Core.CLI Information: 0 : 2024-05-29T10:38:10.2168706+02:00 - get data for urn: Server[@Name='PF2RW869']/Database[@Name='MyDatabase']/Table[@Name='Language' and @Schema='dbo']/Index
...

(Another weird thing is that the ".../Index" row is logged twice for every table)

I also tried using SqlSmoObject.PropertyMissing, but it is only called a few times when I run my app. Not when these rows are logged.

shueybubbles commented 4 months ago

You haven't enabled any options that would cause it to script indexes. Do the generated scripts include any indexes? I know of a bug that causes scripting to be slow when you explicitly set ScriptIndexes to true because there's a missing property in the the scripting init fields on the Index class. But you aren't hitting that bug. If you don't have any indexes in the output we'd need to set a breakpoint on the InitChildLevel call that's fetching Index to see why. I am busy on other things at the moment so I don't know when I would get to this.

insikt-jonas commented 4 months ago

In the excerpt above I get ScriptingOptions.Indexes = false since I have not set it. But in my real-world code, we are scripting Indexes.

I tried running ScriptingOptions.Indexes = false and found that...

But as said, in my case it does not really matter since we want to script indexes. The main problem is that it does not prefetch them so the scripting process is slow.

insikt-jonas commented 4 months ago

I have stepped through the code and what i found is that it is in the SfcChildrenDiscovery() function that the data is loaded:

I have verified that the individual table ID is in the list of the logged row when prefetched objects. So I don't understand why that data isn't reused.

Do you have any insights into this behaviour? =)

chadbaldwin commented 1 week ago

so it does not respect the ScriptingOptions.Indexes option

Just a side note...even if you have Indexes disabled, it will still script unique and clustered indexes if you have things like DriIndexes or DriConstraints enabled because those are necessary for data integrity.

Also, I'm just here to +1 this exact issue. I'm trying to script all tables out to files, but for whatever reason the query referenced in the original issue comment occasionally freezes. It is not blocked, it's not moving any data, the query just sits there not moving, not blocked, not blocking and for me can take anywhere from 300ms all the way up to 20+ minutes without any indicator as to why.

As for why it's still running that query despite using PrefetchObject, I don't have any useful input, but I am definitely running into this issue. In order to script out 3700 tables (with indexes, constraints, triggers, extended properties, etc) it takes me 7+ hours.

chadbaldwin commented 1 hour ago

One more comment to +1 @insikt-jonas latest comment. I stepped through the SMO code in the debugger and came to the same conclusion...something about SfcChildrenDiscovery() is ignoring the cached data from PrefetchObjects.

So even though it already prefetched this: Server[@Name='MyInstance']/Database[@Name='MyDatabase']/Table/Index

It still reaches out to the database to fetch this: Server[@Name='MyInstance']/Database[@Name='MyDatabase']/Table[@Name='MyTable' and @Schema='dbo']/Index

But I'm not seeing any debugger messages about MissingProperty.