amachanic / sp_whoisactive

sp_whoisactive
GNU General Public License v3.0
1.13k stars 281 forks source link

Invalid SPID nnn specified when logging to a table #107

Closed genetsmartx closed 11 months ago

genetsmartx commented 11 months ago

This seems to happen when the server is under some pressure. It is an intermittent issue.

Doing the call with retention as specified here https://www.brentozar.com/archive/2016/07/logging-activity-using-sp_whoisactive-take-2/#comment-3680805

I am on version 12 of sp_WhoIsActive.

--collect activity into logging table
DECLARE @destination_table VARCHAR(500) = 'WhoIsActive';
EXEC hcvadmin.dbo.sp_WhoIsActive @get_transaction_info = 1,
                        @get_outer_command = 1,
                        @get_avg_time = 1,
                        @get_plans = 1,
                        @destination_table = @destination_table;

--purge older data
SET @SQL
    = 'DELETE FROM ' + @destination_table + ' WHERE collection_time < DATEADD(day, -' + CAST(@retention AS VARCHAR(10))
      + ', GETDATE());';
EXEC ( @SQL );

I looked at the proc sp_WhoIsActive and I see it is referencing the INPUTBUFFER on line 4195

It has a try catch but nothing in the CATCH!

From this post something similar is happening and they suggest this for the catch:

_I have tried your procs to replace my more verbose version to see how yours works, but I have seemed to stumbled upon a race condition. It would throw an error ever so occasionally. When I finally saw the message (some happened in the middle of the night, and my Agent logs are only good for about a half hour), I think I tracked it down. The message was “Invalid SPID 137 specified. [SQLSTATE 42000] (Error 7955). The step failed.”

The execution of dbcc inputbuffer can throw an error if the SPID has terminated by the time it runs. So, I put it in a Begin Try/End Try/Begin Catch/End Catch block. In the Catch portion, I simply chose to insert a dummy record in it’s place. Insert Into #InputBuffer(EventType,Params,EventInfo) Values(N’Invalid SPID’, 0, ErrorMessage())

https://am2.co/2017/12/alerting-sql-server-blocking/#:~:text=I%20have%20tried,SPID%E2%80%99%2C%200%2C%20Error_Message())

erikdarlingdata commented 11 months ago

Hi Gene,

The whole point of the empty catch block is to discard any errors like the one you're seeing:

This throw an error: DBCC INPUTBUFFER(32767);

But this doesn't:

BEGIN TRY
    DBCC INPUTBUFFER(32767);
END TRY;
BEGIN CATCH
END CATCH;

So it seems odd that it would still bubble up at all. Are you sure you've got this tracked down correctly?

Thanks!

genetsmartx commented 11 months ago

Hey Erik, Thanks for the explanation on the CATCH. I get it. Most of the time the call works to log to the table. During the early morning hours we go through some heavy data loading and monitor for thread pool, disk latency etc.

But at the time of some of the failures/errors, we were sitting pretty. Batch requests down. Memory and CPU at baseline.

The only thing that would cause this error is the INPUTBUFFER call. Not sure why it surfaces but that seems like the cause inside sp_WhoIsActive.

We run it every 20 seconds so perhaps the DELETE is running when the next call starts. But the job should finish the INSERT and the DELETE and end solidly. Then the new iteration should start.

I am willing to take any suggestions to "track it down correctly"

I think maybe wrapping the call to log it in a transaction might help.

erikdarlingdata commented 11 months ago

Hi Gene,

Sorry for over-explaining. You seemed surprised by the empty catch block, so I figured I'd throw it out there.

This is a tough one to take on, because a lot of issues that get reported as bugs here come down to a variety of local factors that are impossible to reproduce outside of a particular user environment/setup.

If you come up with a code fix on your end that solves the problem, you can submit a PR for review. I don't have the bandwidth to either try to reproduce the issue, or make shot-in-the-dark code changes and await feedback.

Thanks, Erik

genetsmartx commented 11 months ago

Yes I was indeed surprised. Thank you for that Erik. I understand completely. Thank you. I will see if I can solve it. Will submit a PR if I find an answer. Thank you.