This repository contains scripts, code and samples for automating the install and configuration of ArcGIS (Enterprise and Desktop) using Microsoft Windows PowerShell DSC (Desired State Configuration).
We attempted to upgrade an HA Enterprise deployment from 10.9.1 to 11.1 and found that the upgrade fails. Based on our testing so far, the failure appears to be caused by the order of the nodes defined in the JSON config. Specifically, the Portal nodes need to be ordered so that the primary Portal instance in the HA site is listed first.
Based on the logic in the Invoke-PortalUpgradeScript function, the primary and secondary machines are determined by the order in which they are listed in the JSON config. That determination is then used to run a step on the assumed secondary Portal, which only updates the Portal DataStore host identifier properties file at C:\Program Files\ArcGIS\Portal\framework\runtime\ds\framework\etc\hostidentifier.properties and then restarts the Portal service. The actual post-upgrade step is then carried out on the assumed primary.
The documentation for upgrading Portal states "... then start the upgrade process on either machine", which seems to indicate that the issue via DSC may be related to restarting the assumed secondary Portal when it is in fact the primary Portal according to the actual site configuration.
We have reproduced this in two separate HA deployments and found that we can work around it by ensuring the primary Portal is listed first in the JSON config.
We are not sure whether Portal attempts a partial failover when the primary goes down for a restart during an upgrade, but if it does, that could explain what we are seeing in our testing.
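The workaround described above (listing the primary Portal first) could be scripted before kicking off the upgrade. The following is only a minimal sketch: it assumes the JSON config has a top-level `AllNodes` array whose entries carry `NodeName` and a `Role` list containing `"Portal"`, and the helper name `put_primary_portal_first` is hypothetical. The primary machine name still has to be looked up manually (for example via .../portaladmin/machines).

```python
import copy


def put_primary_portal_first(config, primary_name):
    """Return a copy of config with the primary Portal listed before the other Portal nodes.

    Assumes config["AllNodes"] is a list of dicts with "NodeName" and a "Role" list.
    """
    result = copy.deepcopy(config)
    nodes = result["AllNodes"]
    # Positions currently held by Portal nodes; all other nodes stay where they are.
    portal_slots = [i for i, n in enumerate(nodes) if "Portal" in n.get("Role", [])]
    portals = [nodes[i] for i in portal_slots]
    wanted = primary_name.lower()
    if wanted not in [n["NodeName"].lower() for n in portals]:
        raise ValueError(f"{primary_name!r} is not a Portal node in this config")
    # Stable sort: the primary sorts first, the rest keep their relative order.
    portals.sort(key=lambda n: n["NodeName"].lower() != wanted)
    for slot, node in zip(portal_slots, portals):
        nodes[slot] = node
    return result
```

The function returns a new dictionary and leaves the input untouched, so the reordered config can be compared against the original before it is written back to disk.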
Steps to Reproduce
Deploy an HA portal environment at 10.9.1 with DSC
Within the config provided above (used for upgrading to 11.1), set the second node in the array to the primary machine in the HA portal site. Verify which machine is listed as primary via .../portaladmin/machines
Start the upgrade, which should fail on the PortalPostUpgrade step.
The PortalPostUpgrade step gets to the final phase (Upgrade standby machine) and then errors out with:
{"lastUpdated":1691524771647,"name":"Upgrade database","startTime":1691524615102,"state":"completed"},{"lastUpdated":1691524824307,"name":"Migrate configuration settings","startTime":1691524822608,"state":"completed"},{"lastUpdated":1691524919123,"name":"Update configuration settings","startTime":1691524877634,"state":"completed"},{"lastUpdated":1691524877634,"name":"Configure index service","startTime":1691524844022,"state":"completed"},{"lastUpdated":1691525041227,"name":"Reindex","startTime":1691524920167,"state":"completed"},{"lastUpdated":1691525930553,"name":"Upgrade standby machine","startTime":1691525210616,"state":"failed"}],"messages":["Index Service configuration failed."],"recheckAfterSeconds":20}
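To make output like the above easier to read, the stage entries can be filtered for failures. This is a rough sketch, not part of the DSC module: the helper name `failed_stages` is hypothetical, the two sample entries are copied from the output above, and the timestamps are assumed to be epoch milliseconds.

```python
def failed_stages(stages):
    """Return (name, elapsed seconds) for every stage whose state is 'failed'."""
    return [
        (s["name"], (s["lastUpdated"] - s["startTime"]) / 1000.0)
        for s in stages
        if s.get("state") == "failed"
    ]


# Two entries copied from the output above (timestamps assumed to be epoch ms).
stages = [
    {"lastUpdated": 1691525041227, "name": "Reindex",
     "startTime": 1691524920167, "state": "completed"},
    {"lastUpdated": 1691525930553, "name": "Upgrade standby machine",
     "startTime": 1691525210616, "state": "failed"},
]

for name, seconds in failed_stages(stages):
    print(f"{name} failed after about {seconds:.0f} s")
```

Under that assumption, the "Upgrade standby machine" stage ran for roughly twelve minutes before it was marked as failed.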
Community Note
Module Version
Affected Resource(s)
Configuration Files
Expected Behavior
The HA portal deployment is successfully upgraded
Actual Behavior
The HA portal deployment fails
Important Factoids
References