spaghettidba / WorkloadTools

A collection of tools to collect, analyze and replay SQL Server workloads, on premises and in the cloud
MIT License

Added some improvements and fixes #119

Closed: mcflyamorim closed this 1 year ago

mcflyamorim commented 1 year ago

• When an sp_reset_connection RPC event is found, SqlWorkload now closes and reopens the connection, forcing it back to the connection pool and resetting it so that it is clean for the next event (a minimal sketch follows below this list). This is required to guarantee that any user objects (such as temporary tables) and session settings are reset before a new event is executed.
• The Login event is also captured by default, to identify physical (non-pooled) connections and properly close/refresh the connection.
• This improvement should reduce the number of failed events in a replay.
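A minimal sketch of the close/reopen cycle, assuming a plain SqlConnection; the class and method names are illustrative, not the actual WorkloadTools code:

```csharp
using System.Data.SqlClient;

static class ConnectionResetSketch
{
    // Called when an sp_reset_connection RPC (or a Login event for a
    // non-pooled connection) is replayed.
    public static void OnResetConnectionEvent(SqlConnection conn)
    {
        // Close() returns the physical connection to the pool; Open() takes a
        // pooled connection again, and on reuse the client driver issues
        // sp_reset_connection, dropping temp tables and reverting session
        // settings left over from the previous events.
        conn.Close();
        conn.Open();
    }
}
```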

Fixed an issue with an incorrect connection reset after a failed event replay:
• If an event failed, SqlWorkload was calling ClearPool to clean up the connection. This could cause subsequent events to fail, as they may rely on objects created on that connection.
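A sketch of the behavioural change, with illustrative names only; the exact replacement logic is not shown here, the point is simply that the pool is no longer cleared after a failure:

```csharp
using System.Data.SqlClient;

static class FailedEventSketch
{
    public static void OnEventFailed(SqlConnection conn)
    {
        // Old behaviour: SqlConnection.ClearPool(conn);
        // Clearing the pool discards the session state, so later events that
        // depend on objects created earlier on the same connection (e.g. temp
        // tables) would fail. The connection is now left as-is; it is only
        // reset when an sp_reset_connection event is replayed, as described above.
    }
}
```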

Changed SqlWorkload to work on the sql_batch_starting and rpc_starting events, so that commands run in the correct order and executions that take longer than the workload recording are preserved (see the sketch after this list).
• The current execution order of events is based on the sql_batch_completed and rpc_completed time, which may delay some events, since they only start at their completed time. For instance:
  o Consider a recording that runs from 9 AM to 11 AM. If a command started at 9:10 AM and completed at 10:30 AM, it would be recorded as if it had started at 10:30 AM. In a worse scenario, if the command completes after the recording window, say at 11:30 AM, the event would not be captured at all, because only completed events are captured. This approach fails to replay long-running commands, which are critical for performance testing.
  o With this change, events are recorded using the sql_batch_starting and rpc_starting events, and the performance metrics (CPU, reads, writes and duration) are updated when the corresponding completed event becomes available. This means that in some scenarios you may have events without performance metrics: if you stop the recording before the event completes, the completed event will not be available. Hopefully this is not a big deal, as you can always replay the workload to capture the performance statistics.
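A sketch of the starting/completed correlation described above. The ReplayEvent and EventCorrelator types are simplified placeholders, not the actual WorkloadTools classes:

```csharp
using System.Collections.Generic;

// Events are emitted for replay as soon as the *starting* event arrives;
// the performance counters are patched in later if/when the matching
// *completed* event shows up.
class ReplayEvent
{
    public long EventSequence;
    public string Text;
    public long? Cpu, Reads, Writes, Duration;   // null until the completed event arrives
}

class EventCorrelator
{
    private readonly Dictionary<long, ReplayEvent> pending = new Dictionary<long, ReplayEvent>();

    public ReplayEvent OnStarting(long sequence, string text)
    {
        var evt = new ReplayEvent { EventSequence = sequence, Text = text };
        pending[sequence] = evt;
        return evt;                              // replay can begin immediately
    }

    public void OnCompleted(long sequence, long cpu, long reads, long writes, long duration)
    {
        if (pending.TryGetValue(sequence, out var evt))
        {
            evt.Cpu = cpu; evt.Reads = reads; evt.Writes = writes; evt.Duration = duration;
            pending.Remove(sequence);
        }
        // If the recording stops before the completed event fires, the metrics
        // simply stay null, as noted above.
    }
}
```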

Fixed an issue with event execution order:
• Added an ORDER BY event_sequence ASC when reading the list of events, to guarantee the correct order.
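Illustrative query only; apart from event_sequence, the table and column names below are assumptions, not the actual WorkloadTools schema:

```csharp
static class EventReaderSketch
{
    public const string ReadEventsQuery = @"
        SELECT event_sequence, event_type, command_text
        FROM   Events
        ORDER BY event_sequence ASC;  -- guarantees events are replayed in capture order";
}
```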

Added a list of commands to be skipped.
• Some commands cannot be replayed. An example is the insert bulk command: insert bulk indicates to SQL Server that the TDS stream contains bulk data for import; since that data stream is not output in the events, the command cannot be replayed.
• The following commands were added to the list of commands to skip: "KILL", "ALTER EVENT SESSION", "fn_xe_file_target_read_file", "fn_trace_getinfo" and "insert bulk".
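A sketch of such a skip list; the prefix-based matching rule (StartsWith) is an assumption, the real implementation may compare commands differently:

```csharp
using System;
using System.Linq;

static class ReplayFilter
{
    // Commands listed in the PR description that cannot be replayed.
    private static readonly string[] CommandsToSkip =
    {
        "KILL",
        "ALTER EVENT SESSION",
        "fn_xe_file_target_read_file",
        "fn_trace_getinfo",
        "insert bulk"
    };

    // Returns true when the command text begins with one of the skipped commands.
    public static bool ShouldSkip(string commandText) =>
        CommandsToSkip.Any(c => commandText.TrimStart()
            .StartsWith(c, StringComparison.OrdinalIgnoreCase));
}
```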