Closed. busetde closed this issue 2 years ago.
@busetde I would again need a terraform.tfvars and some more logs from you. The logs above do not show any concrete error message.
@yeoldegrove - Please kindly find below the terraform.tfvars
Let me know if there's any information needed...
@busetde What is the error message you get? The messages above are a bit generic ;)
@yeoldegrove - any particular log files that I need to provide?
@busetde The complete salt output would be enough for the beginning.
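When collecting that output, the failed states are usually the part worth quoting. Salt marks each of them with "Result: False", so a small filter can list their IDs. This is only a sketch: failed_states is a hypothetical helper name, and the log path in the usage line is an assumption.

```shell
# Sketch: print the ID of every Salt state whose Result is False.
# failed_states is a hypothetical helper, not part of the project.
# It reads the files given as arguments, or stdin if none are given.
failed_states() {
    awk '
        # remember the most recent state ID
        /^[ \t]*ID:/ { id = $0; sub(/^[ \t]*ID:[ \t]*/, "", id) }
        # when the state failed, print the remembered ID
        /^[ \t]*Result:[ \t]*False/ { print id }
    ' "$@"
}
```

Usage, assuming the log was saved as salt-result.log: failed_states salt-result.log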
@yeoldegrove - I've run the deployment again; below is the error:
module.netweaver_node.module.netweaver_provision.null_resource.provision[0] (remote-exec): Summary for local
module.netweaver_node.module.netweaver_provision.null_resource.provision[0] (remote-exec): -------------
module.netweaver_node.module.netweaver_provision.null_resource.provision[0] (remote-exec): Succeeded: 31 (changed=26)
module.netweaver_node.module.netweaver_provision.null_resource.provision[0] (remote-exec): Failed: 14
module.netweaver_node.module.netweaver_provision.null_resource.provision[0] (remote-exec): -------------
module.netweaver_node.module.netweaver_provision.null_resource.provision[0] (remote-exec): Total states run: 45
module.netweaver_node.module.netweaver_provision.null_resource.provision[0] (remote-exec): Total run time: 1591.483 s
module.netweaver_node.module.netweaver_provision.null_resource.provision[0] (remote-exec): Wed Jun 1 05:13:30 UTC 2022::default-vmnetweaver01::[ERROR] predeployment failed
╷
│ Error: remote-exec provisioner error
│
│   with module.netweaver_node.module.netweaver_provision.null_resource.provision[2],
│   on ../generic_modules/salt_provisioner/main.tf line 78, in resource "null_resource" "provision":
│   78: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_153627534.sh": Process exited with status 1
╵
╷
│ Error: remote-exec provisioner error
│
│   with module.netweaver_node.module.netweaver_provision.null_resource.provision[3],
│   on ../generic_modules/salt_provisioner/main.tf line 78, in resource "null_resource" "provision":
│   78: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_1989768892.sh": Process exited with status 1
╵
╷
│ Error: remote-exec provisioner error
│
│   with module.netweaver_node.module.netweaver_provision.null_resource.provision[0],
│   on ../generic_modules/salt_provisioner/main.tf line 78, in resource "null_resource" "provision":
│   78: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_1671757330.sh": Process exited with status 1
╵
╷
│ Error: remote-exec provisioner error
│
│   with module.netweaver_node.module.netweaver_provision.null_resource.provision[1],
│   on ../generic_modules/salt_provisioner/main.tf line 78, in resource "null_resource" "provision":
│   78: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_1361496486.sh": Process exited with status 1
I've attached the salt* log from netweaver01
Let me know if there's any log that I can provide
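Terraform error output like the one quoted in this comment names each failing resource after the word "with". A rough way to get the unique list of affected addresses is sketched below. failed_resources is a hypothetical helper name, and the file name in the usage line is an assumption. The sketch relies on grep -o and -h, which GNU and BSD grep provide.

```shell
# Sketch: list the unique resource addresses named in Terraform
# "Error: ... with <address>," blocks.
# failed_resources is a hypothetical helper, not part of the project.
# It reads the files given as arguments, or stdin if none are given.
failed_resources() {
    # -o prints only the matching part, -h suppresses file name prefixes
    grep -oh 'with module\.[^,]*' "$@" |
        sed 's/^with //' |
        sort -u
}
```

Usage, assuming the apply output was saved to apply.log: failed_resources apply.log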
Budi, nice to see you here. I gave you access to a draft guide for Terraform for SUSE for SAP on Google: https://docs.google.com/document/d/1VW30Yg9K1IcYmcAVXC-0M2F2PHugzGYmEMO_mFuFlVI/edit
Hi Thorsten,
I really appreciate you sharing the documentation.
Regards, Budi
@busetde From your salt-result.log I can see that this was the issue:
----------
          ID: wait_until_nfs_is_ready_netweaver_node_sapmnt
    Function: cmd.run
        Name: until nc -zvw5 10.0.0.22 2049;do sleep 30;done
      Result: False
     Comment: Command "until nc -zvw5 10.0.0.22 2049;do sleep 30;done" run
     Started: 04:48:44.733423
    Duration: 1200008.712 ms
     Changes:
              ----------
              pid:
                  3710
              retcode:
                  1
              stderr:
              stdout:
                  until nc -zvw5 10.0.0.22 2049;do sleep 30;done : Timed out after 1200 seconds
----------
Which points to a non-working DRBD cluster...
I released https://github.com/SUSE/ha-sap-terraform-deployments/releases/tag/8.1.4 earlier today, which lets me successfully deploy on GCP with DRBD enabled. Please try again and let's debug further in case your issue persists.
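For anyone reproducing the failed state by hand: the state loops on nc until the NFS port answers and fails once 1200 seconds have passed. The same check can be sketched as a standalone function with an explicit deadline. wait_for_port is a hypothetical helper, not the project's code, and the defaults simply mirror the values in the log above (30 s between probes, 1200 s in total).

```shell
# Sketch: poll a TCP port until it answers or a deadline passes.
# wait_for_port is a hypothetical helper, not part of the project.
# Arguments: host, port, timeout in seconds, interval in seconds.
wait_for_port() {
    host=$1
    port=$2
    timeout=${3:-1200}    # default mirrors the 1200 s seen in the log
    interval=${4:-30}     # default mirrors the "sleep 30" in the state
    elapsed=0
    # -z: only probe the port, -w5: give up on one probe after 5 s
    until nc -zw5 "$host" "$port" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "port $port on $host not reachable after ${elapsed}s" >&2
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

Usage from a netweaver node: wait_for_port 10.0.0.22 2049 1200 30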
@yeoldegrove - I likely still get the same error, as below:
Please kindly advise if there's anything required for troubleshooting...
@busetde Still trying to reproduce your issue. No luck so far. Is the issue maybe related to the DRBD cluster? Could you check or send the logs?
Another thing... our develop branch already includes some features to use Google Filestore as backend for HANA scale-out deployments and also netweaver (some code missing). Would you be interested in this feature?
@yeoldegrove - For the DRBD logs, which logs are required?
@busetde Basically /var/log/salt* from both DRBD nodes.
@yeoldegrove - The deployment is still progressing, but the DRBD deployment was successful, as below:
Will send /var/log/salt* when the deployment has finished...
@yeoldegrove - The deployment failed...
Will send you the logs from both DRBD nodes...
@yeoldegrove - Please find attached the DRBD salt logs from both VMs.
Let me know if there's anything else...
@busetde Judging from the logfiles, the DRBD deployment seems to be successful.
Does the failed netcat from your screenshot above work?
default-vmnetweaver01:~ # nc -zv 10.0.0.22 2049
Connection to 10.0.0.22 2049 port [tcp/nfs] succeeded!
How does crm_mon -r1 look on the drbd nodes?
@yeoldegrove
Here's the crm_mon -r1 output:
From vmdrbd01
From vmdrbd02
Let me know if there's any information required...
These error messages are "fine" and related to https://github.com/SUSE/ha-sap-terraform-deployments/issues/839 / https://bugzilla.suse.com/show_bug.cgi?id=1198872.
@busetde What about the netcat command?
@yeoldegrove - It succeeded, as below:
Any other information required?
Regards - Budi
So to sum it up... everything DRBD related is running. So this could be a timing issue.
@busetde To verify you could try tainting the netweaver provisioners and restart the salt run via another apply:
terraform taint "module.netweaver_node.null_resource.netweaver_provisioner[0]"
terraform taint "module.netweaver_node.null_resource.netweaver_provisioner[1]"
terraform taint "module.netweaver_node.null_resource.netweaver_provisioner[2]"
terraform taint "module.netweaver_node.null_resource.netweaver_provisioner[3]"
terraform apply -auto-approve
and/or you could try raising the 1200s/20m timeout here and do a complete new deployment: https://github.com/SUSE/ha-sap-terraform-deployments/blob/main/salt/shared_storage/nfs.sls#L106
20m could indeed be too low for some regions or certain sizings.
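The taint commands above can also be generated in a loop, which helps when the node count differs. This is only a sketch: taint_cmds is a hypothetical helper, and the default address prefix is copied from the commands above. The error output earlier in this thread reports addresses of the form module.netweaver_node.module.netweaver_provision.null_resource.provision[N], so it may be worth checking terraform state list and passing the matching prefix as the second argument.

```shell
# Sketch: print one "terraform taint" command per node, then the apply.
# taint_cmds is a hypothetical helper, not part of the project.
# Arguments: node count, optional resource address prefix.
taint_cmds() {
    count=$1
    # default prefix copied from the commands quoted in this thread
    prefix=${2:-module.netweaver_node.null_resource.netweaver_provisioner}
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'terraform taint "%s[%s]"\n' "$prefix" "$i"
        i=$((i + 1))
    done
    printf 'terraform apply -auto-approve\n'
}
```

Usage: taint_cmds 4 prints the commands for review, and taint_cmds 4 | sh runs them.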
@yeoldegrove - Still an error after tainting the netweaver provisioners... Proceeding to destroy, raise the timeout to 30m, and apply... Will update you...
@busetde Just raise it to 120m or something to be on the safe side.
@yeoldegrove - I've changed the timeout and enabled vpc_name, which was previously commented out as '# vpc_name = "slesnetwork"'. Deployment completed successfully...
Will close this issue... The timeout was likely not the cause; I will give it another try later...
Thanks @yeoldegrove
@busetde feel free to raise an issue to raise the timeout when needed and reference this issue.
@yeoldegrove - All good: tested with the timeout of 20 minutes (1200 s). The reason was the wrong configuration of vpc_name and subnet...
Thanks - Budi
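Since the root cause here was a setting that was still commented out in terraform.tfvars, a quick check before running apply might help. The sketch below only tests whether a variable has an uncommented assignment in the file. tfvar_is_set is a hypothetical helper name, and the sketch does not validate the value itself.

```shell
# Sketch: succeed if a variable has an uncommented assignment in a
# tfvars file. tfvar_is_set is a hypothetical helper, not part of the
# project. Arguments: file, variable name.
tfvar_is_set() {
    file=$1
    name=$2
    # a line starting with "#" does not match, so commented-out
    # settings such as '# vpc_name = "slesnetwork"' count as unset
    grep -Eq "^[[:space:]]*${name}[[:space:]]*=" "$file"
}
```

Usage: tfvar_is_set terraform.tfvars vpc_name || echo "vpc_name is not set"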
@yeoldegrove - Apologies for raising a new issue here. I am trying to deploy but got the error below:
Error: remote-exec provisioner error
│
│   with module.netweaver_node.module.netweaver_provision.null_resource.provision[3],
│   on ../generic_modules/salt_provisioner/main.tf line 78, in resource "null_resource" "provision":
│   78: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_638753395.sh": Process exited with status 1
╵
╷
│ Error: remote-exec provisioner error
│
│   with module.hana_node.module.hana_provision.null_resource.provision[1],
│   on ../generic_modules/salt_provisioner/main.tf line 78, in resource "null_resource" "provision":
│   78: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_873561226.sh": Process exited with status 1
╵
╷
│ Error: remote-exec provisioner error
│
│   with module.netweaver_node.module.netweaver_provision.null_resource.provision[0],
│   on ../generic_modules/salt_provisioner/main.tf line 78, in resource "null_resource" "provision":
│   78: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_1573237755.sh": Process exited with status 1
╵
╷
│ Error: remote-exec provisioner error
│
│   with module.netweaver_node.module.netweaver_provision.null_resource.provision[2],
│   on ../generic_modules/salt_provisioner/main.tf line 78, in resource "null_resource" "provision":
│   78: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_278602173.sh": Process exited with status 1
╵
╷
│ Error: remote-exec provisioner error
│
│   with module.netweaver_node.module.netweaver_provision.null_resource.provision[1],
│   on ../generic_modules/salt_provisioner/main.tf line 78, in resource "null_resource" "provision":
│   78: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_1649281273.sh": Process exited with status 1
╵
Could you please kindly advise?
Regards - Budi