Closed scma-esrich closed 4 years ago
@spitzerr @tpaschke-esride @esride-dku, FYI
@shailesh91 and @cameronkroeker, any news on this issue?
Hello @scma-esrich ,
The Install resource sets the service account to LocalSystem for all the ArcGIS Enterprise components, including GeoEvent Server. The run-as account for each service gets changed from LocalSystem to the service account specified in the json file during the configuration phase.
So if we only pass -Mode Install, the run-as account stays LocalSystem, but if we use -Mode InstallLicenseConfigure, the service run-as account is changed from LocalSystem to the one specified in the json file.
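As a rough sketch of the two invocations described above (the config file path is a placeholder and not taken from this thread):

```powershell
# Placeholder path; substitute the JSON configuration file for your deployment.
$configFile = 'C:\Temp\GeoEvent-Config.json'

# Install only: per the explanation above, the services stay on LocalSystem.
Invoke-ArcGISConfiguration -ConfigurationParametersFile $configFile -Mode Install

# Install, license, and configure: the configuration phase switches the
# run-as account to the one given in the JSON file.
Invoke-ArcGISConfiguration -ConfigurationParametersFile $configFile -Mode InstallLicenseConfigure
```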
@scma-esrich
I have not been able to reproduce the issue. Are you able to attach the DSC logs and json file?
Try running with the -DebugSwitch parameter; it should display additional logs. For example, I found the following messages logged in my test environment, which updated both services successfully:
[NodeName]:[[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service] User name for service 'ArcGISGeoEvent' is 'LocalSystem'. It does not match '.\arcgis.
...
[NodeName]:[[ArcGIS_WindowsService]ArcGIS_GeoEventGateway_Service] User name for service 'ArcGISGeoEventGateway' is 'LocalSystem'. It does not match '.\arcgis.
Once installation completed:
Once Configuration completed:
@cameronkroeker
The deployment was run with -Mode InstallLicenseConfigure and the -DebugSwitch was on.
In the logs, there are only two blocks containing ArcGIS_WindowsService:
One at the very top concerning ArcGIS Server itself:
[NodeName]: LCM: [ Start Resource ] [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]
[NodeName]: LCM: [ Start Test ] [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]
[NodeName]: [[ArcGIS_WindowsService]ArcGIS_for_Server_Service] User name for service 'ArcGIS Server' is 'LocalSystem'. It does not match 'net.work\T-WebgisDevE.
[NodeName]: LCM: [ End Test ] [[ArcGIS_WindowsService]ArcGIS_for_Server_Service] in 0.1820 seconds.
[NodeName]: LCM: [ Start Set ] [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]
[NodeName]: [[ArcGIS_WindowsService]ArcGIS_for_Server_Service] Service 'ArcGIS Server' already started, no action required.
[NodeName]: LCM: [ End Set ] [[ArcGIS_WindowsService]ArcGIS_for_Server_Service] in 1.3170 seconds.
[NodeName]: LCM: [ End Resource ] [[ArcGIS_WindowsService]ArcGIS_for_Server_Service]
And only one concerning GeoEvent-server itself:
[GeoEventNode]: LCM: [ Start Resource ] [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]
[GeoEventNode]: LCM: [ Start Test ] [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]
[GeoEventNode]: [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service] User name for service 'ArcGISGeoEvent' is 'LocalSystem'. It does not match 'net.work\ServiceUser.
[GeoEventNode]: LCM: [ End Test ] [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service] in 0.0120 seconds.
[GeoEventNode]: LCM: [ Start Set ] [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]
[GeoEventNode]: [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service] Service 'ArcGISGeoEvent' already started, no action required.
[GeoEventNode]: LCM: [ End Set ] [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service] in 0.0430 seconds.
[GeoEventNode]: LCM: [ End Resource ] [[ArcGIS_WindowsService]ArcGIS_GeoEvent_Service]
I attached the log file, with the customer-environment parameters replaced. I hope this helps.
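For anyone checking their own run, here is a minimal sketch that lists which ArcGIS_WindowsService resource instances appear in a saved DSC log. The log path is hypothetical; the pattern is based only on the log line format shown above.

```powershell
# Hypothetical path to a saved DSC log; adjust to wherever your log was written.
$logPath = 'C:\Temp\dsc-run.log'

# Capture the instance name that follows [[ArcGIS_WindowsService] in each
# matching line, then reduce to the distinct names.
Select-String -Path $logPath -Pattern '\[\[ArcGIS_WindowsService\]([^\]]+)\]' |
    ForEach-Object { $_.Matches[0].Groups[1].Value } |
    Sort-Object -Unique
```

If ArcGIS_GeoEventGateway_Service is absent from the output, the gateway service resource was never processed in that run.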
Thanks @scma-esrich for providing those additional details. The [[ArcGIS_WindowsService]ArcGIS_GeoEventGateway_Service] resource is the final step of the configuration; it is very strange that it is being skipped entirely.
What is the version of Windows? And does this only occur in this one instance or it happens on different nodes as well? Does this also occur when using a local account instead of the domain account (net.work\ServiceUser)? Are there any useful hints in the Windows Event Viewer logs?
Hey @scma-esrich , is there any chance you could share with us your config JSON being utilized? (redacted if needed) Cameron asked some questions as well:
"What is the version of Windows? And does this only occur in this one instance or it happens on different nodes as well? Does this also occur when using a local account instead of the domain account (net.work\ServiceUser)? Are there any useful hints in the Windows Event Viewer logs?"
We have been unable to reproduce this issue.
@Nickolaitc and @cameronkroeker: Sorry for the delay!
I have not been able to hold another remote session with the customer yet. It is now scheduled for next week, and I will add the additional info here as soon as I get it.
We finally managed another (failed) run. Please find the (redacted) config JSON attached: github - GeoEvent-Config.zip
As to Cameron's questions:
Source: Tcpip
Date: 25.05.2020 16:08:39
Event ID: 4227
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: GeoEventNode.net.work
Description:
TCP/IP failed to establish an outgoing connection because the selected local endpoint was recently used to connect to the same remote endpoint. This error typically occurs when outgoing connections are opened and closed at a high rate, causing all available local ports to be used and forcing TCP/IP to reuse a local port for an outgoing connection. To minimize the risk of data corruption, the TCP/IP standard requires a minimum time period to elapse between successive connections from a given local endpoint to a given remote endpoint.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Tcpip" />
<EventID Qualifiers="32768">4227</EventID>
<Level>3</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2020-05-25T14:08:39.011750800Z" />
<EventRecordID>56641</EventRecordID>
<Channel>System</Channel>
<Computer>GeoEventNode.net.work</Computer>
<Security />
</System>
<EventData>
<Data>
</Data>
<Binary>00000000010000000000000083100080000000000000000000000000000000000000000000000000</Binary>
</EventData>
</Event>
Log Name: System
Source: Service Control Manager
Date: 25.05.2020 16:08:32
Event ID: 7034
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: GeoEventNode.net.work
Description:
The ArcGIS GeoEvent Server service terminated unexpectedly. It has done this 1 time(s).
Event Xml:
@scma-esrich thanks for the updated information. I've tested both local and domain accounts on the same OS and haven't been able to reproduce the issue.
By chance, could you provide the entire DSC logs (zip and attach)? In the event logs it's referring to the user as NET\T-WebgisDevE. Perhaps using this syntax rather than net.work\T-WebgisDevE in the json file may yield a different result?
Thanks, Cameron K.
@cameronkroeker, we have now tested with both NET\T-WebgisDevE and net.work\T-WebgisDevE:
We again found DCOM errors in the Windows event logs around the time the configuration is applied, as well as unexpected GeoEvent Server service shutdowns.
To give you all the necessary info, I would prefer to send the relevant Windows Event-logs as well as the complete and unredacted JSON-config to you by e-mail. Is that okay with you?
Hi @scma-esrich,
Thanks for the update. I think it might actually be best to open a case with Esri Technical Support. This way that information can be sent securely and examined by an analyst who will be able to help investigate the issue further.
Thanks, Cameron K.
Hi @scma-esrich,
I was finally able to reproduce the issue. Previously my deployment machine was the same as the orchestrating machine, which doesn't produce the issue. But in your case the orchestrating machine is different from the deployment machine, which causes the issue (this was the missing piece of the puzzle). We will address this issue in a future release; in the meantime, here are two workarounds:
Workaround 1:
Run Invoke-ArcGISConfiguration from the deployment machine (GeoEvent Server node). This makes the orchestrating machine the same as the deployment machine and should work.
Workaround 2: In the following resource: https://github.com/Esri/arcgis-powershell-dsc/blob/master/Modules/ArcGIS/Configurations-OnPrem/ArcGISServer.ps1#L474-L485
Thanks, Cameron K.
@cameronkroeker, thanks for the update! Glad you finally managed to reproduce the issue and track down the cause.
Together with the customer, I already applied Workaround 1 successfully in his environment, which works fine until you can fix this issue with a future release.
Thanks again for your efforts, much appreciated!
Hi @scma-esrich ,
We have fixed this issue in v3.1.1.
https://github.com/Esri/arcgis-powershell-dsc/releases/tag/v3.1.1
Closing the issue, however if you re-encounter it please feel free to reopen.
Community Note
Module Version
Affected Resource(s)
Configuration Files
The DSC-config should not be causing this problem.
Expected Behavior
The ArcGISGeoEvent-service AND the ArcGISGeoEventGateway-service should both be switched to the correct service-user and the ArcGISGeoEvent-service should start up the ArcGISGeoEventGateway-service due to its dependency on the ArcGISGeoEventGateway-service.
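One way to check which account each service actually runs under is to query Win32_Service on the GeoEvent Server node. This is a minimal sketch; the service names are the ones that appear in the DSC log messages in this thread.

```powershell
# Service names as they appear in the DSC log messages in this thread.
$names = 'ArcGISGeoEvent', 'ArcGISGeoEventGateway'

# Win32_Service exposes the run-as account in the StartName property.
Get-CimInstance -ClassName Win32_Service |
    Where-Object { $names -contains $_.Name } |
    Select-Object Name, StartName, State, StartMode
```

With the expected behavior, both rows would show the configured service user in StartName rather than LocalSystem.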
Actual Behavior
The ArcGISGeoEventGateway-service-user is still the local system account and hence is not started when the ArcGISGeoEvent-service is starting up:
Steps to Reproduce
Install GeoEvent-server (10.8) with DSC version 3.0.1
Important Factoids
As far as I was able to debug the behavior, the problem seems to start with line 79 in "ArcGIS_GeoEvent.psm1". The variable $GatewayServiceName is never used in the subsequent-code, hence its service-user won't get switched to the domain-user.
References