Closed by rtorrero 3 years ago
@rtorrero In which of the 2 nodes is this error happening?
vmhana01
I can easily reproduce this on a successfully deployed HANA cluster in Azure by doing ...
vmhana01:~ # systemctl stop corosync;systemctl stop hawk;systemctl disable hawk; rm -rf /var/lib/pacemaker/*/*
vmhana02:~ # systemctl stop corosync;systemctl stop hawk;systemctl disable hawk; rm -rf /var/lib/pacemaker/*/*
vmhana01:~ # salt-call --local state.apply cluster
vmhana02:~ # salt-call --local state.apply cluster
Looking at the cluster state while the salt-call runs, I noticed that the cluster itself is initialized correctly. The issue seems to be that corosync is restarted in my use case.
----------
          ID: corosync_service
    Function: service.running
        Name: corosync
      Result: True
     Comment: Service restarted
     Started: 10:53:45.217711
    Duration: 2671.614 ms
     Changes:
              ----------
              corosync:
                  True
----------
It takes quite a while to come up, and the configure-sbd-resource, configure-cluster-properties and configure-the-cluster states are run before it is completely restarted.
The solution in my case was to build in a crm cluster wait_for_startup loop after the corosync restart.
You can find a PR fixing this here: https://github.com/SUSE/habootstrap-formula/pull/86
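The wait-after-restart idea described above can be sketched as a small retry loop. This is not the code from the linked PR, only a minimal illustration of the approach: `retry_until` is a hypothetical helper name, and the attempt count and delay are arbitrary assumptions. The only cluster command involved is the `crm cluster wait_for_startup` call mentioned in the comment, shown as a commented usage line because it needs a real cluster node.

```shell
#!/bin/sh
# retry_until (hypothetical helper): run a command repeatedly until it
# succeeds or the number of attempts is exhausted.
# Usage: retry_until <attempts> <delay_seconds> <command> [args...]
retry_until() {
    ru_attempts=$1
    ru_delay=$2
    shift 2
    ru_i=1
    while [ "$ru_i" -le "$ru_attempts" ]; do
        # Stop as soon as the command reports success.
        if "$@"; then
            return 0
        fi
        # Only sleep if another attempt is still coming.
        if [ "$ru_i" -lt "$ru_attempts" ]; then
            sleep "$ru_delay"
        fi
        ru_i=$((ru_i + 1))
    done
    return 1
}

# Assumed usage on a cluster node after corosync has been restarted
# (10 attempts and a 5 second delay are illustrative values):
#   retry_until 10 5 crm cluster wait_for_startup
```

Running the later states only after such a loop returns 0 would keep them from racing against a corosync that is still starting.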
@rtorrero SUSE/habootstrap-formula#86 was just merged.
Could you please try it out by setting this in terraform.tfvars:
...
ha_sap_deployment_repo = "https://download.opensuse.org/repositories/network:/ha-clustering:/sap-deployments:/devel/SLE_15_SP2/"
And making sure habootstrap-formula-0.4.4 is used?
This issue doesn't seem to be happening anymore, thanks!
Used cloud platform: Azure
Used SLES4SAP version: SLES15SP2
Used client machine OS: Linux
Expected behaviour vs observed behaviour: While attempting to deploy a 2-node cluster (+monitoring) in Azure, terraform will often fail almost at the end of the process with the following message:
How to reproduce: Specify the step-by-step process to reproduce the issue. This would usually look something like this:
terraform.tfvars file based on the shared file in this bug report
Used terraform.tfvars
Logs
Please, request any additional logs privately as they might have secrets that I missed removing.