dsccommunity / SqlServerDsc

This module contains DSC resources for deployment and configuration of Microsoft SQL Server.
MIT License
359 stars 224 forks source link

xSQLAOGroupEnsure: NT AUTHORITY\System Permissions #235

Closed randomnote1 closed 7 years ago

randomnote1 commented 7 years ago

Please provide the following information regarding your issue (place N/A if certain fields don't apply in your case:

---- DELETE THIS LINE AND ABOVE ----

Details of the scenario you try and problem that is occurring:

The DSC configuration that is using the resource (as detailed as possible):

Version of the Operating System, SQL Server and PowerShell the DSC Target Node is running:

Version of the DSC module you're using, or 'dev' if you're using current dev branch:

---- DELETE THIS LINE AND BELOW ----

Your feedback and support is greatly appreciated.

If you are able to resolve this issue or add new features, you may submit a Pull Request against this project. Please see the Contribution Guideliness for information on how to contribute.

image
randomnote1 commented 7 years ago

Please feel free to assign this to me as I am beginning work on it.

Thanks!

johlju commented 7 years ago

Isn't the account NT AUTHORITY\SYSTEM used as the health detection account? See this support article https://support.microsoft.com/en-us/kb/2833707

But the permission should not be set by this resource. Wouldn't be better to use xSQLServerPermission instead?

randomnote1 commented 7 years ago

The cluster service runs as NT AUTHORITY\SYSTEM and the cluster does need access to health related information inside of SQL. However, if you grant the cluster service itself (NT SERVICE\CLUSSvc), it allows NT AUTHORITY\SYSTEM to log into SQL only when it is running in the context of the cluster service.

Currently the resource over-permissions NT AUTHORITY\SYSTEM in order to get the AGs configured. My thought process is that because these permissions are required only for the AGs, it makes sense to have the xSQLAOGroupEnsure resource set the permissions. Otherwise we are expecting people to know that additional permissions are required.

Either way I intend to have xSQLAOGroupEnsure check and verify the permissions are present prior to proceeding.

johlju commented 7 years ago

I'm all for low privilege, so using CLUSSvc seems to be a better approach. And I agree that setting that account could be done using this resource since that is not obvious for users. But should this resource also be able to handle different SQL Service accounts on each replica?

@randomnote1 Please take a look att issue #236 and #237 if they can help anything.

randomnote1 commented 7 years ago

I think it should. However, I've been told AGs work best if all the instances involved in the AGs use the same service accounts.

@nabrond - do you have any thoughts?

johlju commented 7 years ago

When I read up on this I did not see any issues like that. But did not look for it etiher so...

What I did find was this

If two server instances run as different accounts, the system administrator must use the CREATE LOGIN Transact-SQL statement to create a login for the startup service account of the remote instance for each server instance. https://msdn.microsoft.com/en-us/library/ms366346.aspx#Anchor_0

randomnote1 commented 7 years ago

Yeah, that's how it works with Database Mirroring, so that makes sense to me.

nabrond commented 7 years ago

@randomnote1 If you are using an Availability Group Listener and want to use Kerberos for authentication, then the same service account must be used across all instances supporting a given Availability group. Without this consistency, you will not be able to correctly register a SPN. See this article for more information.

randomnote1 commented 7 years ago

Two things I'd like to get your opinion on:

Different Service Accounts per Instance Here's what I'm thinking for handling/supporting a different service account per instance:

In "AlwaysOnAvailabilityGroupReplica"

This will still allow different service accounts to be used but also notify the the user that authentication may not work as desired.

Permissioning "NT SERVICE\ClusSvc" Going back to our discussion from earlier, do you think the AlwaysOnAvailabilityGroup resource should

johlju commented 7 years ago

Been a crazy workday today, so haven't been able to post until now.

Regarding Different Service Accounts per Instance. I would say to issue a warning. We should not change accounts. I don't think we can either, because we don't have the password.

Regarding Permission for "NT SERVICE\ClusSvc". If we would create the login and set the permission, then both 'AlwaysOnAvailabilityGroup' and 'AlwaysOnAvailabilityGroup' would need to have the logic to create the correct login and permission. Logic that is already (or should) be provided by the other resources. Also 'AlwaysOnAvailabilityGroup' and 'AlwaysOnAvailabilityGroup' must be able to handle any deviation from that, if the login disappears or any permission is changed (so that it is in desired state). That is also logic that is (or should) be provided by other resources. I would lean to that either 'NT SERVICE\ClusSvc' or 'NT AUTHORITY\SYSTEM' must exist (by means of other resources) with the correct permission for the Availability Group to be created, or a replica is able to join. If a primary or secondary replica does not have either login present, and with correct permission, the resource throws an error. This could also be noted in the resource documentation (README.md so far), that it is a prerequisite.

What do you think?

johlju commented 7 years ago

Btw. I mention 'NT AUTHORITY\SYSTEM' still because MSDN documentation still says that is the way to go, so thinking some users might want to follow those guidelines.

randomnote1 commented 7 years ago

Thanks for the feedback. I emailed one of the content developers I know to ask about getting that updated. In the meantime, I'll specify both SYSTEM and ClusSvc, but I hope to be able to remove SYSTEM by the time I do a pull request.

johlju commented 7 years ago

That sounds good to me. Looking forward to the PR.

johlju commented 7 years ago

As a reference, I found two more links mentioning 'NT AUTHORITY\SYSTEM' is needed.

SQL Server – Failed to bring availability group ‘[availability group name]’ online https://dataginger.com/2014/10/28/sql-server-failed-to-bring-availability-group-availability-group-name-online/

Pro SQL Server Always On Availability Groups https://books.google.se/books?id=uflRDQAAQBAJ&pg=PA262&lpg=PA262&dq=availability+group+health+detection+account+NT+AUTHORITY%5CSYSTEM&source=bl&ots=q1ekXhHHIc&sig=TekuOSle2r1ldqWmqXNPz5ZbpOM&hl=en&sa=X&ved=0ahUKEwiovN7dvefQAhVBVSwKHRAxAOYQ6AEINDAE#v=onepage&q=availability%20group%20health%20detection%20account%20NT%20AUTHORITY%5CSYSTEM&f=false

randomnote1 commented 7 years ago

That's unfortunate.

Here's a nice article by Pinal Dave showing what happens when you actually log in as System. SYSTEM is granted public by default. And because of that, anything logging into SQL as SYSTEM can't do much. As he points out, the login object gets translated (in his case to his computer name). The same applies to the cluster service. http://blog.sqlauthority.com/2016/02/08/sql-server-how-to-connect-using-nt-authoritysystem-account/

We already use this method to grant SCOM the ability to log into SQL.: https://support.microsoft.com/en-us/kb/2667175 The Health Service runs as SYSTEM, but we don't grant SYSTEM the required permissions, we grant the permissions only to the HealthService service.

Anyhow....gotta get documentation updated. Maybe I should find a blog to publish this info to?

nabrond commented 7 years ago

@randomnote1 When validating the service account, you will likely run into a permission issue connecting to the primary replica. You should be able to check whether a PsDscRunAsCredential has been supplied to the resource. Otherwise, you will need to grant access to the computer account.

johlju commented 7 years ago

@randomnote1 Regarding finding a blog to publish info on. This would be a good blog article for the SQL Tiger Team. What actually happens during "Health Detection" for an availability group 😄

@nabrond You make a valid point.

johlju commented 7 years ago

@randomnote1 FYI. Found another article regarding this subject https://support.microsoft.com/en-us/kb/2847723