spaghettidba / WorkloadTools

A collection of tools to collect, analyze and replay SQL Server workloads, on premises and in the cloud
MIT License
234 stars 53 forks source link

Replay - DB dropping into Suspect state #153

Open JakubKad opened 3 months ago

JakubKad commented 3 months ago

Hello,

I have a problem with replaying the workload. When I start my workload cmd it will initialize correctly and start the process. After 3 to 5 minutes I will get error and DB will drop into Suspect state (haven't altered files or configuration of any sorts, only JSON file):

Error - WorkloadTools.Consumer.Replay.ReplayWorker(Worker) : Error starting Worker System.Data.SqlClient.SqlException (0x80131904): Connection Timeout Expired. The timeout period elapsed during the post-login phase. The connection could have timed out while waiting for server to complete the login process and respond; Or it could have timed out while attempting to create multiple active connections. The duration spent while attempting to connect to this server was - [Pre-Login] initialization=1; handshake=3; [Login] initialization=0; authentication=1; [Post-Login] complete=14746; ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnectionOptions userOptions) at System.Data.SqlClient.SqlConnection.TryOpenInner(TaskCompletionSource1 retry) at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry) at System.Data.SqlClient.SqlConnection.Open() at WorkloadTools.Consumer.Replay.ReplayWorker.InitializeConnection(String applicationName) in C:\github\WorkloadTools\WorkloadTools\Consumer\Replay\ReplayWorker.cs:line 103 at WorkloadTools.Consumer.Replay.ReplayWorker.ExecuteCommand(ReplayCommand command, Int32 failRetryCount, Int32 timeoutRetryCount) in C:\github\WorkloadTools\WorkloadTools\Consumer\Replay\ReplayWorker.cs:line 237 at WorkloadTools.Consumer.Replay.ReplayWorker.ExecuteNextCommand() in C:\github\WorkloadTools\WorkloadTools\Consumer\Replay\ReplayWorker.cs:line 166 at WorkloadTools.Consumer.Replay.ReplayWorker.Run() in C:\github\WorkloadTools\WorkloadTools\Consumer\Replay\ReplayWorker.cs:line 140 ClientConnectionId:d4b0e4f3-f87a-4fd1-b571-2593de478dd3 Error Number:-2,State:0,Class:11

--------------------------------AND------------------------ Cannot open database "DB" requested by the login. The login failed. Login failed for user 'LOGIN'. at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnectionOptions userOptions) at System.Data.SqlClient.SqlConnection.TryOpenInner(TaskCompletionSource1 retry) at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry) at System.Data.SqlClient.SqlConnection.Open() at WorkloadTools.Consumer.Replay.ReplayWorker.InitializeConnection(String applicationName) in C:\github\WorkloadTools\WorkloadTools\Consumer\Replay\ReplayWorker.cs:line 103 at WorkloadTools.Consumer.Replay.ReplayWorker.ExecuteCommand(ReplayCommand command, Int32 failRetryCount, Int32 timeoutRetryCount) in C:\github\WorkloadTools\WorkloadTools\Consumer\Replay\ReplayWorker.cs:line 237 at WorkloadTools.Consumer.Replay.ReplayWorker.ExecuteNextCommand() in C:\github\WorkloadTools\WorkloadTools\Consumer\Replay\ReplayWorker.cs:line 166 at WorkloadTools.Consumer.Replay.ReplayWorker.Run() in C:\github\WorkloadTools\WorkloadTools\Consumer\Replay\ReplayWorker.cs:line 140 ClientConnectionId:37bde663-5b4c-47af-b719-1cd89d5cd6ec Error Number:4060,State:1,Class:11


I am running newest version of WorkloadTools (1.7.1). Got SQL Server version 16.0.1115.1 and SMSS version 20.1.10.0. I am doing everything as always (older versions of all components) and can't get past these. I am using Windows Authentication, tried switching between Optional and Mandatory Encryption and unchecking Trusted Certificate box (and checking it). Always am db_owner.

JSON: { "Controller": {

    "Listener":
    {

        "__type": "FileWorkloadListener",
        "Source": "C:\\path.sqlite",

        "SynchronizationMode": "false"

    },

    "Consumers":
    [

        {
            "__type": "ReplayConsumer",
            "ConnectionInfo": 
            {
                "ServerName": "(local)",
                "DatabaseName": "DB",
                "UseWindowsAuthentication": true (I was just trying few things to go with)
            }

        }
    ]
}

}

spaghettidba commented 3 months ago

Hi there, I would look at the SQL Server log to see what caused the database to go to the suspect state. If WorkloadTools is the only application hitting the database, it's probably something that you have in the workload captured with the .sqlite file. I would try to open the file with something like DB Browser for Sqlite and see what command caused the database to go suspect, based on the error messaged from ERRORLOG. Does that make sense?

JakubKad commented 3 months ago

Yeah, found out. When I start Analysis and Replay consumer it will display these errors. When starting only Replay it went well. Although only after setting connection for remote query timeout to 0. Tbf, this machine I am using has a better hardware than the one before, so it seems a bit off.