BrentOzarULTD / SQL-Server-First-Responder-Kit

sp_Blitz, sp_BlitzCache, sp_BlitzFirst, sp_BlitzIndex, and other SQL Server scripts for health checks and performance tuning.
http://FirstResponderKit.org
Other
3.37k stars 996 forks source link

Low priority long term plan: add new script to analyze index usage for SaaS clients #3183

Closed erikdarlingdata closed 4 months ago

erikdarlingdata commented 1 year ago

Version of the script 8.11

What is the current behavior? When you specify @GetAllDatabases = 1 and @TableName = 'something', a few weird things happen:

If the current behavior is a bug, please provide the steps to reproduce. On my server, I have three copies of the StackOverflow database, each with the Posts table in them:

image

When I run this:

EXEC dbo.sp_BlitzIndex 
    @TableName = N'Posts',
    @GetAllDatabases = 1,
    @Debug = 1;

The script bails at AdventureWorks because Posts doesn't exist there:

image

When I run this:

EXEC dbo.sp_BlitzIndex 
    @TableName = N'Posts',
    @GetAllDatabases = 1,
    @IgnoreDatabases = N'AdventureWorks, Crap',
    @Debug = 1;

The output ends up looking like this (edited to show issues):

image

Under the heading for StackOverflow2013_Clean are indexes for StackOverflow2013. This also highlights the fact that the StackOverflow database doesn't produce any results the final output even though data is logged in intermediate temp tables (shown via debug):

image

As well as:

image

And they do show up when I specify the StackOverflow database:

EXEC dbo.sp_BlitzIndex 
    @TableName = N'Posts',
    @DatabaseName = N'StackOverflow',
    @Debug = 1;

Like so:

image

What is the expected behavior?

Which versions of SQL Server and which OS are affected by this issue? Did this work in previous versions of our procedures? I've only tested this on supported versions of SQL Server: (2016+, Managed Instance). Windows is whatever.

erikdarlingdata commented 1 year ago

Just to set expectations on this, I might not get to it quickly. If anyone feels like picking it up, leave a comment so I can assign it to you. Otherwise I'll work on this as time allows.

BrentOzar commented 1 year ago

Yeah, I'm really intrigued by this. I think I'm going to do a couple of streaming sessions working on the FRK this week.

erikdarlingdata commented 1 year ago

@BrentOzar I wish nothing but the best for you both

BrentOzar commented 1 year ago

This is interesting. If you specify @TableName, then the @Mode parameter doesn't matter. My first thought on the use case was, "He wants to see Mode 2 and list all of the indexes for that object across all the databases." However, that doesn't work:

sp_BlitzIndex @GetAllDatabases = 1, @TableName = 'Posts', @Mode = 2

Produces:

ChicagoParkingTickets database failed to process. Setting the @Mode doesn't change behavior if you supply @TableName. Use default @Mode=0 to see table detail.

But in Mode 0, @GetAllDatabases doesn't really make sense because most of the result sets don't have a database name: Screenshot 2022-12-13 at 4 24 03 AM

So lemme jump back and ask - what was the use case for this? I can totally see it being used in Mode 2 or 3, and in that case, the fix is different. If you want it to work in Mode 0, that means adding database name columns to a bunch of the result sets, but also adding a sort order to each result set when @GetAllDatabases = 1. For example, if you're looking at indexes that exist, I don't think you want them sorted by database name first - probably last, because you wanna look at each database to see if the indexes are different. (Or heaven forbid, summing the numbers, which is way beyond my pay grade.)

erikdarlingdata commented 1 year ago

@BrentOzar good question! So the use case for me was a someone with a database-per-client, and one particular main table that drives a bunch of critical processes. I wanted to focus changes on that one table. Not to make all the indexes "the same" just to clean up unused and overlapping ones in each database.

The table mode output certainly isn't ideal, both because the database name is missing, and you're stuck figuring out unused/overlapping/other stuff manually. I think that would also be the case in Mode 2, though perhaps easier to manage with the output.

So yeah, whatever mode between 0/4 or 2 would be fine. Whatever is easiest. I do not need missing index requests for my purposes, here. This is pure clean up work.

BrentOzar commented 1 year ago

We've needed to do this so many times! Time to roll up my sleeves and build something serious, something that'll also work across servers because that's a common issue too. This will be a new stored proc, pared down for this particular use case, and won't be part of the "core" set since it's pretty specialized.

Thinking out loud about the use cases:

Screenshot 2022-12-14 at 5 05 29 AM

(And in case anybody else is reading, SaaS could also refer to replication, and AGs could also refer to log shipping.)

Out of scope:

Initial version:

Overall version (not called for one table) output would look like:

With that in mind, when you call it for a single table, then you already know you have variances, and you probably wanna know: Result set 1: overall index inventory for this table

Result set 2: details

For "Index present, yes/no" - there are two use cases:

Stopping here in the thought process, but will come back to it. If you see any holes or wanna add anything, feel free. I'll probably think through this for a few days. I thought about posting a blog post asking folks for a design review, but everybody's just gonna ask for free candy.

erikdarlingdata commented 1 year ago

Oh boy! It's a baby sp_BlitzaaS! I have a call starting shortly, but I'll give this a full read and big think.