dataplat / dbatools

🚀 SQL Server automation and instance migrations have never been safer, faster or freer
https://dbatools.io
MIT License
2.42k stars 792 forks source link

Get-DbaDbStoredProcedure seems to hang after first database #9418

Closed ReeceGoding closed 1 month ago

ReeceGoding commented 1 month ago

Verified issue does not already exist?

I have searched and found no existing issue

What error did you receive?

When I run

Get-DbaDbStoredProcedure -SqlInstance $myIns -ExcludeSystemSp -EnableException

It gets all of the procedures in master without issue. After that, nothing happens. sp_who2 shows that I'm still connected to master but sp_whoisactive shows nothing.

How can I debug this?

Steps to Reproduce

As above.

Please confirm that you are running the most recent version of dbatools

Yeah, I updated it only a couple of weeks ago.

Other details or mentions

No response

What PowerShell host was used when producing this error

Windows PowerShell (powershell.exe)

PowerShell Host Version

5.1.19041.4522

SQL Server Edition and Build number

15.0.4375.4

.NET Framework Version

Very recent.

ReeceGoding commented 1 month ago

Seems like the pattern is that it takes very long to go from one database to another. I have a database with only one non-system stored procedure. It grabbed the stored procedure from there 7 minutes ago, but sp_whoisactive shows that it's still hitting that database.

ReeceGoding commented 1 month ago

My best guess is that it's grabbing all of the stored procedures and filtering the system ones out only after fully grabbing them. It's no wonder that it therefore seems slow. Databases come with a lot of system stored procedures.

Surely there is a lazier way to filter out system stored procedures?

niphlod commented 1 month ago

it's correct, but we just use the only way SMO has to list procedures https://github.com/dataplat/dbatools/blob/5c29bc705dd2ef1832af2127590e01bcb9ff70b2/public/Get-DbaDbStoredProcedure.ps1#L168 . I don't think we can fix it or make it "faster"

ReeceGoding commented 1 month ago

So, for example, we can't query sys.procedures and just grab those by name?

niphlod commented 1 month ago

we could, but WHY ? to rewrite ourselves a faster SMO ?

ReeceGoding commented 1 month ago

What's the disadvantage of being faster?

ReeceGoding commented 1 month ago

On second thought, I can just pipe from Find-DbaStoredProcedure and get what I want. It seems like a silly hack, but I can live with it.

niphlod commented 1 month ago

What's the disadvantage of being faster?

SMO translates what's needed when changes are introduced between versions. Get-DbadbStoredProcedure returns not just the name, but a wealth of additional info (and rich objects, with methods like script, and so on). Reimplementing it and asking the same properties would probably result in .... the same exact query SMO does (so nothing faster than the actual). So, each own mileage may vary, but spending time reimplementing SMO is not really a safe bet, and the end result may not be speedier than SMO's defaults. I may take a point if we wanna introduce some kind of pre-filtering, but even then SMOs methods may not expose a "give me this and that sproc" without looping internally. We do something like that for Get-DbaDatabase , but it's more of a "fortunate coincidence" rather than a thing that can implemented globally.

As you already found out, to return a smaller object (and faster) there's Find-Dba which .... does a custom query https://github.com/dataplat/dbatools/blob/master/public/Find-DbaStoredProcedure.ps1#L89 . But it enumerates in the same way, so if you're getting slow results for Get-Dba, I don't expect find to be faster https://github.com/dataplat/dbatools/blob/master/public/Find-DbaStoredProcedure.ps1#L142

ReeceGoding commented 1 month ago

I promise. Piping from Find-Dba to Get-Dba has been orders of magnitude faster than relying on the -ExcludeSystemSp parameter of Get-Dba.

niphlod commented 1 month ago

even on a fresh powershell process ? SMO caches results, so the first time enumerating procedures may take a lot more than the second time

ReeceGoding commented 1 month ago

I'll try when I get a chance, but the difference is so big that I find it impossible to suspect it's just caching. It went from being unable to finish five of my databases over a period of hours to doing all of them in far less time.

niphlod commented 1 month ago

the only other different thing is .... did you try withOUT -ExcludeSystemSp ? Find- doesn't consider the underlying property, Get- does when you use that parameter.

ReeceGoding commented 1 month ago

In my fast version that used Find, I didn't use the -ExcludeSystemSp parameter.

ReeceGoding commented 1 month ago

@niphlod I've taken on your suggestions and tested them out. I am completely convinced that Find-DbaStoredProcedure is much faster than Get-DbaDbStoredProcedure. With Find piped in to Get, I can export every non-system stored procedure on a big set of databases in just under an hour. With Get alone, it takes five minutes before it even does the first procedure on a much smaller set of databases and I've seen it take 20 minutes before it goes from exporting the last non-system procedure in one database to exporting the first in the next. It's night and day. Get alone takes well over three hours to do a much smaller set of databases than what I can do with Find piped in to Get in under an hour.

ReeceGoding commented 1 month ago

Having looked at the code for both Find and Get, I'm convinced that it's something to do with whether or not you pass in -ExcludeSystemSp. I don't see why finding the non-system procedures with Find and then not passing -ExcludeSystemSp in to Get would be faster, but I am entirely certain that it is.

Is foreach ($proc in $procs) just really slow on big sets? I have no other guesses.

ReeceGoding commented 1 month ago

The pattern, if there's any, is that going through a database without Find takes five minutes if it has no non-system stored procedures. Find piped to Get seems to make those cases instant.

ReeceGoding commented 1 month ago

I think I finally get it! Find-DbaStoredProcedure iterates through the rows generated by a query that joins to sys.procedures. That view doesn't include system views. I can only assume that's somehow faster than what Get-DbaDbStoredProcedure does with the foreach ($proc in $procs) loop. Running Test-Bound over one-thousand times per database can't be fast. Why is that even in the loop anyway? Its value is know as soon as you call Get-DbaDbStoredProcedure.