olahallengren / sql-server-maintenance-solution

SQL Server Maintenance Solution
https://ola.hallengren.com
MIT License
2.91k stars 756 forks source link

Index/Stats maintenance jobs failure on AG with "Read-Intent" secondary nodes #818

Closed misirajudeen closed 1 week ago

misirajudeen commented 2 months ago

Its on SQL Server 2019 - CU24 - Maintenance solution using Ola's free solution. Index and Statistics Jobs were going well in both Primary/Secondary AG nodes. When we configured AG with "Read-Only Intent" for managing the Secondary Read connections, Jobs started to fail in the Secondary nodes. Is this expected behavior? or any workarounds or parameters to be passed in the script executions?

Stiffboard commented 2 weeks ago

Index job is a read/write operation so it shouldn't run. But it shouldn't fail either. What is the error message?

misirajudeen commented 2 weeks ago

Error message: Unable to connect to SQL Server '(local)'. The step failed. After changing from "Read-Only Intent" to "Read Only", the job is working like before.

Stiffboard commented 2 weeks ago

Strange, from error message seems like sql agent account don't have access? Or can't connect...?

misirajudeen commented 2 weeks ago

It's the same job, same account ( full permission- sysadmin ) succeeding in "Read Only" Secondary AG nodes but failing in Read-Only Intent" on the Secondary AG nodes.

Does the code handle this scenario?

olahallengren commented 1 week ago

Can you reproduce this issue in a SSMS query window?

misirajudeen commented 1 week ago

from the Secondary node configured for Read-intent only:

1) Executed the DB Job via SSMS TSQL. The Job got triggered but it failed with the same error as shared earlier. 2) Executed the DB Job Commands in TSQL and it ended with the below error:

Error: Msg 978, Level 14, State 1, Line 1 The target database ('DBA') is in an availability group and is currently accessible for connections when the application intent is set to read only. For more information about application intent, see SQL Server Books Online. Msg 2812, Level 16, State 62, Line 5 Could not find stored procedure 'dbo.IndexOptimize'.

TSQL/Command as fetched from the DB Job executed in SSMS Query window: USE DBA GO -- Included Partition Parameter -- EXECUTE dbo.IndexOptimize @Databases = 'USER_DATABASES', @FragmentationLow = NULL, @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE', @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE', @FragmentationLevel1 = 5, @FragmentationLevel2 = 30, @PartitionLevel = 'Y'

misirajudeen commented 1 week ago

Executed the TSQL via "ApplicationIntent=ReadOnly" from SSMS connection and below is the output and it succeeded this time. But the DB job execution is not controlled with the above said connection parameter. Should it be handled explicitly?

SSMS window message as follows: (Have replaced the Server, AG and Database name but other details remains the same.)

Date and time: 2024-11-10 16:11:18 Server: ServerNameChanged Version: 15.0.4390.2 Edition: Enterprise Edition: Core-based Licensing (64-bit) Platform: Windows Procedure: [DBA].[dbo].[IndexOptimize] Parameters: @Databases = 'USER_DATABASES', @FragmentationLow = NULL, @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE', @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE', @FragmentationLevel1 = 5, @FragmentationLevel2 = 30, @MinNumberOfPages = 1000, @MaxNumberOfPages = NULL, @SortInTempdb = 'N', @MaxDOP = NULL, @FillFactor = NULL, @PadIndex = NULL, @LOBCompaction = 'Y', @UpdateStatistics = NULL, @OnlyModifiedStatistics = 'N', @StatisticsModificationLevel = NULL, @StatisticsSample = NULL, @StatisticsResample = 'N', @PartitionLevel = 'Y', @MSShippedObjects = 'N', @Indexes = NULL, @TimeLimit = NULL, @Delay = NULL, @WaitAtLowPriorityMaxDuration = NULL, @WaitAtLowPriorityAbortAfterWait = NULL, @Resumable = 'N', @AvailabilityGroups = NULL, @LockTimeout = NULL, @LockMessageSeverity = 16, @StringDelimiter = ',', @DatabaseOrder = NULL, @DatabasesInParallel = 'N', @ExecuteAsUser = NULL, @LogToTable = 'N', @Execute = 'Y' Version: 2022-01-02 13:58:13 Source: https://ola.hallengren.com

Date and time: 2024-11-10 16:11:19 Database: [DBNameChanged] State: ONLINE Standby: No Updateability: READ_WRITE User access: MULTI_USER Recovery model: FULL Is accessible: Yes Availability group: AGNameChanged Availability group role: SECONDARY

Date and time: 2024-11-10 16:11:36 Completion time: 2024-11-10T16:11:37.0759283+03:00

Stiffboard commented 1 week ago

Looks like you are running stored procedures in database that are in AG. You should have OH scripts in a single database on each node, not replicated...or in master.

misirajudeen commented 1 week ago

Yes, in our setup, OH scripts are in AG Database only (which only has DB Administration related tables and scripts).

We dint face any issues until "read-only Intent" been set in our AG, so, we continued in both Active and Passive nodes with AG configured Database.

Is it the pre-requisites to have OH scripts in non AG Database? Couldn't find it in documentation and can you share it please?

Thank you.

olahallengren commented 1 week ago

@misirajudeen, the stored procedures have been designed to run in a standalone database on each replica.

For example the logging to table will not work on a readable secondary replica.

However, it should work with an availability group with readable secondary replicas, under these conditions:

Stiffboard commented 1 week ago

That is not a preferred setup. Jobs will not be able to log to table on secondary. Weird complexity when there are several AGs. You just shouldn't have the utility database in AG.

Read-only intent...you should read up on that, used in connection strings. Have you setup routing configuration in AG?

Just remove db from AG and bring secondary database online.

olahallengren commented 1 week ago

I agree with @Stiffboard. I recommend that you create the objects in a standalone database on each replica.

misirajudeen commented 1 week ago

Thank you @olahallengren and @Stiffboard. Option i was thinking - workaround while using "Read-intent only" is that to setup a prior DB Job step to validate if PRIMARY node and then go to next step of execution of OH script.

So, the takeaway of this reported issue is as follows and has to be adhered.

1) OH scripts - highly recommended to be created in a standalone database on each replica. 2) OH scripts - should work with an availability group with readable secondary replicas, under these conditions: a) You are not logging to table b) You are not using "Read-intent only" (SQL Server Agent Transact-SQL jobs do not support setting "ApplicationIntent=ReadOnly")

Thank you.

olahallengren commented 1 week ago

Option i was thinking - workaround while using "Read-intent only" is that to setup a prior DB Job step to validate if PRIMARY node and then go to next step of execution of OH script.

This is not a good workaround as there might be other availability groups that are in other states. There are also the system databases.

I would make it a standalone database.

misirajudeen commented 1 week ago

Thank you @olahallengren . Going forward, will have the OH scripts in standalone user Database or in master.

Is this already mentioned in Pre-requisites of OH scripts implementation? Sorry, I couldn't find them. Will have this saved.

olahallengren commented 1 week ago

This sounds very good.

The recommendation is currently not documented, but I will add that.

misirajudeen commented 1 week ago

I will have the Index Maintenance OH script deployed in standalone database in "Read-only Intent" configured AG secondary node, execute it as DB Job and will update this thread by tomorrow for closure.

Hope it is OK.

olahallengren commented 1 week ago

It should be a standalone database, that is not in an availability group.

olahallengren commented 1 week ago

I have updated the documentation: "It should be a standalone database. That is a database that is not in an availability group or mirrored." https://ola.hallengren.com/frequently-asked-questions.html

misirajudeen commented 1 week ago

I have validated the "Read-Intent Only" Secondary AG node as placing the OH Index Maintenance Script in a standalone Database (master). Setup execution from a DB Job. It went good and below is the log. Thank you @olahallengren and team for your extended support.

I am closing this case now.

DB execution Log (changed Server and DB names): Date 11/12/2024 9:23:02 AM Log Job History (DBA_Index_Maintenance) Step ID 1 Server ServerName Job Name DBA_Index_Maintenance Step Name Index Duration 00:00:19 Sql Severity 0 Sql Message ID 50000 Operator Emailed
Operator Net sent
Operator Paged
Retries Attempted 0

Message Executed as user: DomainUser. ...e 50000) Server: SQLServerName[SQLSTATE 01000] (Message 50000) Version: 15.0.4390.2 [SQLSTATE 01000] (Message 50000) Edition: Enterprise Edition: Core-based Licensing (64-bit) [SQLSTATE 01000] (Message 50000) Platform: Windows [SQLSTATE 01000] (Message 50000) Procedure: [master].[dbo].[IndexOptimize] [SQLSTATE 01000] (Message 50000) Parameters: @Databases = 'USER_DATABASES', @FragmentationLow = NULL, @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE', @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE', @FragmentationLevel1 = 5, @FragmentationLevel2 = 30, @MinNumberOfPages = 1000, @MaxNumberOfPages = NULL, @SortInTempdb = 'N', @MaxDOP = NULL, @FillFactor = NULL, @PadIndex = NULL, @LOBCompaction = 'Y', @UpdateStatistics = NULL, @OnlyModifiedStatistics = 'N', @StatisticsModificationLevel = NULL, @StatisticsSample = NULL, @StatisticsResample = 'N', @PartitionLevel = 'Y', @MSShippedObjects = 'N', @Indexes = NULL, @TimeLimit = NULL, @Delay = NULL, @WaitAtLowPriorityMaxDuration = NULL, @WaitAtLowPriorityAbortAfterWait = NULL, @Resumable = 'N', @AvailabilityGroups = NULL, @LockTimeout = NULL, @LockMessageSeverity = 16, @StringDelimiter = ',', @DatabaseOrder = NULL, @DatabasesInParallel = 'N', @ExecuteAsUser = NULL, @LogToTable = 'N', @Execute = 'Y' [SQLSTATE 01000] (Message 50000) Version: 2022-01-02 13:58:13 [SQLSTATE 01000] (Message 50000) Source: https://ola.hallengren.com [SQLSTATE 01000] (Message 50000) [SQLSTATE 01000] (Message 50000) Date and time: 2024-11-12 09:23:03 [SQLSTATE 01000] (Message 50000) Database: [UserDB1] [SQLSTATE 01000] (Message 50000) State: ONLINE [SQLSTATE 01000] (Message 50000) Standby: No [SQLSTATE 01000] (Message 50000) Updateability: READ_WRITE [SQLSTATE 01000] (Message 50000) User access: MULTI_USER [SQLSTATE 01000] (Message 50000) Recovery model: FULL [SQLSTATE 01000] (Message 50000) Is accessible: Yes [SQLSTATE 01000] (Message 50000) Availability group: AG1 [SQLSTATE 01000] (Message 50000) Availability group role: SECONDARY [SQLSTATE 01000] (Message 50000) [SQLSTATE 01000] (Message 50000) Date and time: 2024-11-12 09:23:03 [SQLSTATE 01000] (Message 50000) Database: [UserDB2] [SQLSTATE 01000] (Message 50000) State: ONLINE [SQLSTATE 01000] (Message 50000) Standby: No [SQLSTATE 01000] (Message 50000) Updateability: READ_WRITE [SQLSTATE 01000] (Message 50000) User access: MULTI_USER [SQLSTATE 01000] (Message 50000) Recovery model: FULL [SQLSTATE 01000] (Message 50000) Is accessible: Yes [SQLSTATE 01000] (Message 50000) Availability group: AG1 [SQLSTATE 01000] (Message 50000) Availability group role: SECONDARY [SQLSTATE 01000] (Message 50000) [SQLSTATE 01000] (Message 50000) Date and time: 2024-11-12 09:23:04 [SQLSTATE 01000] (Message 50000) Database: [DBA] [SQLSTATE 01000] (Message 50000) State: ONLINE [SQLSTATE 01000] (Message 50000) Standby: No [SQLSTATE 01000] (Message 50000) Updateability: READ_WRITE [SQLSTATE 01000] (Message 50000) User access: MULTI_USER [SQLSTATE 01000] (Message 50000) Recovery model: FULL [SQLSTATE 01000] (Message 50000) Is accessible: Yes [SQLSTATE 01000] (Message 50000) Availability group: AG1 [SQLSTATE 01000] (Message 50000) Availability group role: SECONDARY [SQLSTATE 01000] (Message 50000) [SQLSTATE 01000] (Message 50000) Date and time: 2024-11-12 09:23:05 [SQLSTATE 01000] (Message 50000) Database: [UserDB4] [SQLSTATE 01000] (Message 50000) State: ONLINE [SQLSTATE 01000] (Message 50000) Standby: No [SQLSTATE 01000] (Message 50000) Updateability: READ_WRITE [SQLSTATE 01000] (Message 50000) User access: MULTI_USER [SQLSTATE 01000] (Message 50000) Recovery model: FULL [SQLSTATE 01000] (Message 50000) Is accessible: Yes [SQLSTATE 01000] (Message 50000) Availability group: AG2 [SQLSTATE 01000] (Message 50000) Availability group role: SECONDARY [SQLSTATE 01000] (Me... The step succeeded.