ab-mohamed closed this issue 2 years ago
I was just able to deploy a complete NW stack with the latest master code and ha_sap_deployment_repo = "https://download.opensuse.org/repositories/network:/ha-clustering:/sap-deployments:/v7/".
I cannot see any issues with the ha_cluster_exporter binary on my system:
test-netweaver01:~ # ls -la /usr/bin/ha_cluster_exporter
-rwxr-xr-x 1 root root 10313728 Jan 7 00:27 /usr/bin/ha_cluster_exporter
test-netweaver01:~ # file /usr/bin/ha_cluster_exporter
/usr/bin/ha_cluster_exporter: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, stripped
I used NW750 and will try S/4HANA 2020 next, as #804 also shows issues there.
@ab-mohamed Could you specify which S/4HANA version you are using?
Version 2020/2021 will only work with code from https://github.com/SUSE/sapnwbootstrap-formula/pull/92 and https://github.com/SUSE/ha-sap-terraform-deployments/pull/808.
Please use the latest develop branch of https://github.com/SUSE/ha-sap-terraform-deployments together with ha_sap_deployment_repo = "https://download.opensuse.org/repositories/network:/ha-clustering:/sap-deployments:/devel/" and give it a shot. If this works for you, we can forward-port things to the master branch.
BUT, these changes should be totally unrelated to the "monitoring issues" you faced. I could not reproduce them.
@ab-mohamed any comment on the S/4HANA version?
Please reopen if this still happens with the 8.0.0 release.
Used cloud platform: GCP
Used SLES4SAP version: SLES4SAP 15 SP2
Used client machine OS: Google Cloud Shell
Expected behavior vs. observed behavior: The Monitoring Server was not able to show the S/4HANA HA cluster status.
How to reproduce
Using the master branch, start a new S/4HANA deployment.
Troubleshooting Steps
Node List:
Inactive Resources:
Migration Summary:
demo1-monitoring:~ # cat /etc/prometheus/prometheus.yml [...]
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=node_exporter.go:113 collector=thermal_zone
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=node_exporter.go:113 collector=time
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=node_exporter.go:113 collector=timex
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=node_exporter.go:113 collector=udp_queues
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=node_exporter.go:113 collector=uname
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=node_exporter.go:113 collector=vmstat
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=node_exporter.go:113 collector=xfs
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=node_exporter.go:113 collector=zfs
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=node_exporter.go:195 msg="Listening on" address=:9100
Dec 06 13:02:08 demo1-netweaver01 node_exporter[1174]: level=info ts=2021-12-06T13:02:08.287Z caller=tls_config.go:191 msg="TLS is disabled." http2=false
demo1-netweaver01:~ # systemctl status prometheus-sap_host_exporter@HA1_ASCS00.service
● prometheus-sap_host_exporter@HA1_ASCS00.service - Cluster Controlled prometheus-sap_host_exporter@HA1_ASCS00
   Loaded: loaded (/usr/lib/systemd/system/prometheus-sap_host_exporter@.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/prometheus-sap_host_exporter@HA1_ASCS00.service.d
           └─50-pacemaker.conf
   Active: active (running) since Mon 2021-12-06 13:03:23 UTC; 2h 15min ago
     Docs: https://github.com/SUSE/sap_host_exporter
 Main PID: 4296 (sap_host_export)
    Tasks: 6
   CGroup: /system.slice/system-prometheus\x2dsap_host_exporter.slice/prometheus-sap_host_exporter@HA1_ASCS00.service
           └─4296 /usr/bin/sap_host_exporter --config /etc/sap_host_exporter/HA1_ASCS00.yaml
Dec 06 13:03:23 demo1-netweaver01 systemd[1]: Started Cluster Controlled prometheus-sap_host_exporter@HA1_ASCS00.
Dec 06 13:03:24 demo1-netweaver01 sap_host_exporter[4296]: time="2021-12-06T13:03:24Z" level=info msg="Using config file: /etc/sap_host_exporter/HA1_ASCS00.yaml"
Dec 06 13:03:24 demo1-netweaver01 sap_host_exporter[4296]: time="2021-12-06T13:03:24Z" level=info msg="Monitoring SAP Instance SID: HA1, Name: ASCS00, Number: 0, Hostname: sapha1as"
Dec 06 13:03:24 demo1-netweaver01 sap_host_exporter[4296]: time="2021-12-06T13:03:24Z" level=info msg="Start Service collector registered"
Dec 06 13:03:24 demo1-netweaver01 sap_host_exporter[4296]: time="2021-12-06T13:03:24Z" level=info msg="Enqueue Server optional collector registered"
Dec 06 13:03:24 demo1-netweaver01 sap_host_exporter[4296]: time="2021-12-06T13:03:24Z" level=info msg="Serving metrics on sapha1as:9680"
demo1-netweaver01:~ # /usr/bin/ha_cluster_exporter
demo1-netweaver01:~ # echo $?
0
demo1-netweaver01:~ # ss -tupenl | grep -e State -e 9100 -e 9680 -e 9664
Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port
tcp   LISTEN 0      128    10.0.1.34:9680     0.0.0.0:*         users:(("sap_host_export",pid=4296,fd=3)) ino:38372 sk:10 <->
tcp   LISTEN 0      128    *:9100             *:*               users:(("node_exporter",pid=1174,fd=3)) uid:472 ino:25341 sk:16 v6only:0 <->
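The ss output above lists listeners on 9680 (sap_host_exporter) and 9100 (node_exporter), but nothing on 9664, the third port the grep was looking for. A minimal sketch that makes this check explicit is shown here. The function name missing_ports is hypothetical, and treating 9664 as the ha_cluster_exporter port is an assumption based on the grep pattern used above.

```shell
#!/bin/sh
# missing_ports: read `ss -tln`-style output on stdin and print every
# expected port (passed as arguments) that has no LISTEN socket.
# Hypothetical helper for illustration, not part of any exporter package.
missing_ports() {
    listing=$(cat)
    for port in "$@"; do
        # A listening socket shows ":<port>" followed by whitespace
        # in the local address column.
        if ! printf '%s\n' "$listing" | grep -Eq "LISTEN.*:${port}[[:space:]]"; then
            echo "$port"
        fi
    done
}

# Usage on a cluster node (ports are the ones grepped for above):
#   ss -tln | missing_ports 9100 9680 9664
```

Any port printed by the function has no listener, which narrows the problem down to the exporter expected on that port.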
demo1-netweaver01:~ # ls -lh /usr/bin/ha_cluster_exporter
-rwxr-xr-x 1 root root 0 Dec 3 09:16 /usr/bin/ha_cluster_exporter
demo1-netweaver01:~ # cat /usr/bin/ha_cluster_exporter
demo1-netweaver01:~ # echo $?
0
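The listing above shows a 0-byte /usr/bin/ha_cluster_exporter with the executable bit set. A shell typically runs such a file as an empty script, which would explain why starting it printed nothing and returned exit status 0. The following is a minimal sketch for catching this case. The helper name check_binary and its return codes are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# check_binary: succeed only if the path is a non-empty, executable
# regular file. Hypothetical helper; return codes are arbitrary.
check_binary() {
    path=$1
    if [ ! -f "$path" ]; then
        echo "$path: missing" >&2
        return 1
    elif [ ! -s "$path" ]; then
        # -s is false for a 0-byte file, the case seen in this issue.
        echo "$path: empty (0 bytes)" >&2
        return 2
    elif [ ! -x "$path" ]; then
        echo "$path: not executable" >&2
        return 3
    fi
    return 0
}

# Usage on a cluster node; rpm -qf reports the package that owns the file:
#   check_binary /usr/bin/ha_cluster_exporter || rpm -qf /usr/bin/ha_cluster_exporter
```

If the check fails with the "empty" message, reinstalling the owning package from the configured repository is the obvious next step to try.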