Esri / arcgis-powershell-dsc

This repository contains scripts, code and samples for automating the install and configuration of ArcGIS (Enterprise and Desktop) using Microsoft Windows PowerShell DSC (Desired State Configuration).
Apache License 2.0

GeoEvent-Setup fails with 3.0.1 and 3.0.2 #257

Closed scma-esrich closed 4 years ago

scma-esrich commented 4 years ago

Module Version

Affected Resource(s)

Configuration Files

The DSC-config should not be causing this problem.

Expected Behavior

Both the ArcGISGeoEvent service and the ArcGISGeoEventGateway service should be switched to the configured service user, and the ArcGISGeoEvent service should start the ArcGISGeoEventGateway service, since it depends on it.

Actual Behavior

The ArcGISGeoEventGateway service still runs as the local system account, and hence it is not started when the ArcGISGeoEvent service starts up:

[Screenshot: ArcGIS GeoEvent Gateway - wrong user]
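
The run-as account of both services can also be checked from the command line rather than from the Services console. The following is a minimal sketch in Python, assuming it runs on the GeoEvent node and that `sc qc <service>` reports the account on a `SERVICE_START_NAME` line. The helper names `parse_start_name` and `query_start_name` are hypothetical and not part of the ArcGIS module.

```python
import re
import subprocess
from typing import Optional


def parse_start_name(sc_qc_output: str) -> Optional[str]:
    """Extract the SERVICE_START_NAME value from `sc qc <service>` output."""
    match = re.search(
        r"^[ \t]*SERVICE_START_NAME[ \t]*:[ \t]*(\S.*?)[ \t]*$",
        sc_qc_output,
        re.MULTILINE,
    )
    return match.group(1).strip() if match else None


def query_start_name(service: str) -> Optional[str]:
    """Run `sc qc` (Windows only) and return the account the service runs as."""
    result = subprocess.run(
        ["sc", "qc", service], capture_output=True, text=True, check=False
    )
    return parse_start_name(result.stdout)


# Usage on the GeoEvent node (not executed here, requires Windows):
#   for name in ("ArcGISGeoEvent", "ArcGISGeoEventGateway"):
#       print(name, "->", query_start_name(name))
```

If the gateway service still reports `LocalSystem` after a configure run while ArcGISGeoEvent reports the service account, that matches the behavior described above.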

Steps to Reproduce

Install GeoEvent-server (10.8) with DSC version 3.0.1

Important Factoids

As far as I was able to debug the behavior, the problem seems to start at line 79 of "ArcGIS_GeoEvent.psm1". The variable $GatewayServiceName is never used in the subsequent code, hence the gateway service's user is never switched to the domain user.

References

scma-esrich commented 4 years ago

@spitzerr @tpaschke-esride @esride-dku, FYI

scma-esrich commented 4 years ago

@shailesh91 and @cameronkroeker, any news on this issue?

cameronkroeker commented 4 years ago

Hello @scma-esrich ,

The Install resource sets the service account to LocalSystem for all the ArcGIS Enterprise components, including GeoEvent Server. The run-as account for each service gets changed from LocalSystem to the service account specified in the json file during the configuration phase.

So if we only pass -Mode Install then the run-as will be set to LocalSystem, but if we use -Mode InstallLicenseConfigure it will then change the service run as account from LocalSystem to the one specified in the json file.

cameronkroeker commented 4 years ago

@scma-esrich

I have not been able to reproduce the issue. Are you able to attach the DSC logs and json file?

Try running with the -DebugSwitch parameter; it should display additional logs. For example, I found the following messages logged in my test environment, which updated both services successfully:

[NodeName]:[[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service] User name for service 'ArcGISGeoEvent' is 'LocalSystem'. It does not match '.\arcgis.
...
[NodeName]:[[ArcGIS_WindowsService]ArcGIS_GeoEventGateway_Service] User name for service 'ArcGISGeoEventGateway' is 'LocalSystem'. It does not match '.\arcgis.

Once installation completed:

[Screenshot: GES-108-LocalSystem]

Once Configuration completed:

[Screenshot: GES-108-arcgis]

scma-esrich commented 4 years ago

@cameronkroeker

The deployment was run with -Mode InstallLicenseConfigure and with -DebugSwitch on. In the logs, there are only two blocks containing ArcGIS_WindowsService:

One at the very top concerning ArcGIS Server itself:

[NodeName]: LCM:  [ Start  Resource ]  [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]
[NodeName]: LCM:  [ Start  Test     ]  [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]
[NodeName]:                            [[ArcGIS_WindowsService]ArcGIS_for_Server_Service] User name for service 'ArcGIS Server' is 'LocalSystem'. It does not match 'net.work\T-WebgisDevE.
[NodeName]: LCM:  [ End    Test     ]  [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]  in 0.1820 seconds.
[NodeName]: LCM:  [ Start  Set      ]  [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]
[NodeName]:                            [[ArcGIS_WindowsService]ArcGIS_for_Server_Service] Service 'ArcGIS Server' already started, no action required.
[NodeName]: LCM:  [ End    Set      ]  [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]  in 1.3170 seconds.
[NodeName]: LCM:  [ End    Resource ]  [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]

And only one concerning GeoEvent-server itself:

[GeoEventNode]: LCM:  [ Start  Resource ]  [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]
[GeoEventNode]: LCM:  [ Start  Test     ]  [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]
[GeoEventNode]:                            [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service] User name for service 'ArcGISGeoEvent' is 'LocalSystem'. It does not match 'net.work\ServiceUser.
[GeoEventNode]: LCM:  [ End    Test     ]  [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]  in 0.0120 seconds.
[GeoEventNode]: LCM:  [ Start  Set      ]  [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]
[GeoEventNode]:                            [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service] Service 'ArcGISGeoEvent' already started, no action required.
[GeoEventNode]: LCM:  [ End    Set      ]  [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]  in 0.0430 seconds.
[GeoEventNode]: LCM:  [ End    Resource ]  [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]

I attached the log file, with the customer-environment parameters replaced. I hope this helps.
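
For longer logs, scanning for the resource tags by hand gets tedious. The sketch below, in Python, lists which `ArcGIS_WindowsService` resources appear in a DSC log and which expected ones are missing. It assumes the `[[ArcGIS_WindowsService]<resource name>]` tag format seen in the excerpts above; the function names and the expected-resource list are illustrative only.

```python
import re
from typing import Iterable, List, Set

# Matches the resource tag in DSC verbose output, e.g.
# "[[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]".
RESOURCE_TAG = re.compile(r"\[\[ArcGIS_WindowsService\](?P<name>[^\]]+)\]")


def windows_service_resources(log_text: str) -> Set[str]:
    """Return the names of all ArcGIS_WindowsService resources found in the log."""
    return {match.group("name") for match in RESOURCE_TAG.finditer(log_text)}


def missing_resources(log_text: str, expected: Iterable[str]) -> List[str]:
    """Return the expected resource names that never appear in the log, sorted."""
    return sorted(set(expected) - windows_service_resources(log_text))


# Two lines taken from the GeoEvent block above.
SAMPLE_LOG = """\
[GeoEventNode]: LCM:  [ Start  Resource ]  [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]
[GeoEventNode]: LCM:  [ End    Resource ]  [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]
"""

# Resource names as they appear in this thread.
EXPECTED = ["ArcGIS_GeoEvent_Service", "ArcGIS_GeoEventGateway_Service"]

print(missing_resources(SAMPLE_LOG, EXPECTED))  # ['ArcGIS_GeoEventGateway_Service']
```

Run against the full log, this reports the gateway resource as missing, which is consistent with only two ArcGIS_WindowsService blocks being present.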

cameronkroeker commented 4 years ago

Thanks @scma-esrich for providing those additional details. The [[ArcGIS_WindowsService]ArcGIS_GeoEventGateway_Service] resource is the final step of the configuration, so it is very strange that it is being skipped entirely.

What is the version of Windows? Does this occur only in this one instance, or does it happen on different nodes as well? Does this also occur when using a local account instead of the domain account (net.work\ServiceUser)? Are there any useful hints in the Windows Event Viewer logs?

Nickolaitc commented 4 years ago

Hey @scma-esrich , is there any chance you could share with us your config JSON being utilized? (redacted if needed) Cameron asked some questions as well:

"What is the version of Windows? And does this only occur in this one instance or it happens on different nodes as well? Does this also occur when using a local account instead of the domain account (net.work\ServiceUser)? Are there any useful hints in the Windows Event Viewer logs?"

We have been unable to reproduce this issue.

scma-esrich commented 4 years ago

@Nickolaitc and @cameronkroeker: Sorry for the delay!

I have not been able to hold another remote session with the customer yet. It is now scheduled for next week, and I will add the additional information here as soon as I have it.

scma-esrich commented 4 years ago

We finally managed another (failed) run. Please find the (redacted) config JSON attached: GeoEvent-Config.zip

As to Cameron's questions:

Log Name: System
Source: Service Control Manager
Date: 25.05.2020 16:08:32
Event ID: 7034
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: GeoEventNode.net.work
Description: The ArcGIS GeoEvent Server service terminated unexpectedly. It has done this 1 time(s).
Event Xml: 7034 0 2 0 0 0x8080000000000000 56638 System GeoEventNode.net.work ArcGIS GeoEvent Server 1 410072006300470049005300470065006F004500760065006E0074000000

Log Name: System
Source: Microsoft-Windows-DistributedCOM
Date: 25.05.2020 15:57:47
Event ID: 10016
Task Category: None
Level: Error
Keywords: Classic
User: NET\T-WebgisDevE
Computer: GeoEventNode.net.work
Description: The machine-default permission settings do not grant Local Activation permission for the COM Server application with CLSID {0358B920-0AC7-461F-98F4-58E32CD89148} and APPID {3EB3C877-1F16-487C-9050-104DBCD66683} to the user NET\T-WebgisDevE SID (S-1-5-21-REDACTED) from address LocalHost (Using LRPC) running in the application container Unavailable SID (Unavailable). This security permission can be modified using the Component Services administrative tool.
Event Xml: 10016 0 2 0 0 0x8080000000000000 56609 System GeoEventNode.net.work machine-default Local Activation {0358B920-0AC7-461F-98F4-58E32CD89148} {3EB3C877-1F16-487C-9050-104DBCD66683} NET T-WebgisDevE S-1-5-21-REDACTED LocalHost (Using LRPC) Unavailable Unavailable

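
The long hexadecimal string at the end of the first event's data (Event ID 7034) can be decoded to see which service the event refers to. A minimal sketch in Python, assuming the payload is a null-terminated UTF-16LE string (an assumption about the event's binary data, not something stated in the log itself):

```python
def decode_event_data(hex_string: str) -> str:
    """Decode a hex dump as UTF-16LE text and drop trailing null characters."""
    raw = bytes.fromhex(hex_string)
    return raw.decode("utf-16-le").rstrip("\x00")


# Binary data copied from the Event ID 7034 entry above.
EVENT_DATA = "410072006300470049005300470065006F004500760065006E0074000000"

print(decode_event_data(EVENT_DATA))  # ArcGISGeoEvent
```

Under that assumption, the unexpected termination was logged for the service named ArcGISGeoEvent.
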
cameronkroeker commented 4 years ago

@scma-esrich thanks for the updated information. I've tested both local and domain accounts on the same OS and haven't been able to reproduce the issue.

By chance, could you provide the entire DSC logs (zip and attach)? In the event logs, it's referring to the user as NET\T-WebgisDevE. Perhaps using this syntax rather than net.work\T-WebgisDevE in the json file may yield a different result?

Thanks, Cameron K.

scma-esrich commented 4 years ago

@cameronkroeker, we tested now with both NET\T-WebgisDevE as well as net.work\T-WebgisDevE:

We again found DCOM errors in the Windows event logs around the time the configuration is applied, as well as unexpected GeoEvent Server service shutdowns.

To give you all the necessary information, I would prefer to send the relevant Windows event logs, as well as the complete and unredacted JSON config, to you by e-mail. Is that okay with you?

cameronkroeker commented 4 years ago

Hi @scma-esrich,

Thanks for the update. I think it might actually be best to open a case with Esri Technical Support. This way that information can be sent securely and examined by an analyst who will be able to help investigate the issue further.

Thanks, Cameron K.

cameronkroeker commented 4 years ago

Hi @scma-esrich,

I was able to finally reproduce the issue. Previously my deployment machine was the same as the orchestrating machine which doesn't produce the issue. But in your case the orchestrating machine is different than the deployment machine which causes the issue (this was the missing piece to the puzzle). We will address this issue in a future release, in the meantime here are two workarounds:

Workaround 1: Run Invoke-ArcGISConfiguration from the deployment machine (the GeoEvent Server node). This makes the orchestrating machine the same as the deployment machine and should work.

Workaround 2: In the following resource: https://github.com/Esri/arcgis-powershell-dsc/blob/master/Modules/ArcGIS/Configurations-OnPrem/ArcGISServer.ps1#L474-L485

  1. Comment out these three lines: 474, 475, and 485
  2. Move lines 414-423 to after line 485

Thanks, Cameron K.

scma-esrich commented 4 years ago

@cameronkroeker, thanks for the update! Glad you finally managed to reproduce the issue and track down the cause.

Together with the customer, I have already applied Workaround 1 successfully in their environment. It works fine and will do until the issue is fixed in a future release.

Thanks again for your efforts, much appreciated!

cameronkroeker commented 4 years ago

Hi @scma-esrich ,

We have fixed this issue in v3.1.1.

https://github.com/Esri/arcgis-powershell-dsc/releases/tag/v3.1.1

Closing the issue; if you encounter it again, please feel free to reopen it.