zhmcclient / zhmc-prometheus-exporter

A Prometheus exporter for the IBM Z HMC
Apache License 2.0

Relogin HMC when HTTP 403.4 #351

Closed Charles1000Chen closed 1 year ago

Charles1000Chen commented 1 year ago

After an MCL was applied on the HMC, the zhmc exporter keeps reporting the following "HTTP 403.4" error, and no metrics can be retrieved.

HTTP authentication failed: No session id was provided (must login) (HMC operation GET /api/services/metrics/context/1, HTTP status 403.4)

The root cause appears to be that the zhmc exporter did not refresh its session with the HMC after the HMC rebooted for the MCL upgrade.

Analysis from Rene Petry: The reboot during the MCL apply would only trigger a new login requirement as the session got lost for this monitoring ID. So it may be that the password expired shortly before and the code was not able to reconnect to the session because of the reboot, while the session before was still alive and kept active.

@andy-maier Could the zhmc exporter catch the error and relogin to HMC?

andy-maier commented 1 year ago

Charles1000Chen told me privately that the versions used in this case were:

zhmc-prometheus-exporter: 1.4.2 (and the zhmcclient 1.8.1 it requires) have several fixes and improvements in the area of HMC restart, and I tested that it works with a GUI triggered HMC restart.

I'll do another test with the old and new versions and will post the result.

andy-maier commented 1 year ago

I have tried to reproduce the 403.4 error (NO SESSION ID) by restarting the HMC, but all I could achieve was a 403.5 error (INVALID SESSION ID), which had always been handled.

I double-checked the code changes: zhmcclient 1.8.1 is the version that added handling of the 403.4 error, and zhmc-prometheus-exporter 1.4.2 added a retry upon any server auth errors not handled by the zhmcclient library.
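
For readers who want to see the general idea, here is a minimal sketch of a "re-login and retry on 403.4 / 403.5" pattern. It is not the actual code of zhmcclient or the exporter: `AuthError`, `FakeHmc`, and `fetch_with_relogin` are hypothetical names made up for illustration, and the fake HMC only simulates a lost session after a restart.

```python
class AuthError(Exception):
    """Hypothetical stand-in for an HMC authentication failure."""

    def __init__(self, http_status, reason, message):
        super().__init__(message)
        self.http_status = http_status
        self.reason = reason


# Reason codes for HTTP status 403 that are discussed in this thread.
NO_SESSION_ID = 4       # 403.4: no session id was provided (must login)
INVALID_SESSION_ID = 5  # 403.5: the session id is no longer valid


def fetch_with_relogin(fetch, relogin, max_retries=2):
    """Call fetch(); on a recoverable auth error, call relogin() and retry.

    Errors other than 403.4 / 403.5 are re-raised unchanged, and so is the
    last auth error once max_retries re-logins have been used up.
    """
    attempt = 0
    while True:
        try:
            return fetch()
        except AuthError as exc:
            recoverable = (
                exc.http_status == 403
                and exc.reason in (NO_SESSION_ID, INVALID_SESSION_ID)
            )
            if not recoverable or attempt >= max_retries:
                raise
            attempt += 1
            relogin()


class FakeHmc:
    """Hypothetical in-memory HMC that forgets its session on restart."""

    def __init__(self):
        self.session_id = None
        self.logins = 0

    def login(self):
        self.logins += 1
        self.session_id = "session-{}".format(self.logins)

    def restart(self):
        # A restart (e.g. for an MCL apply) drops all sessions.
        self.session_id = None

    def get_metrics(self):
        if self.session_id is None:
            raise AuthError(403, NO_SESSION_ID,
                            "No session id was provided (must login)")
        return {"cpu": 42}


if __name__ == "__main__":
    hmc = FakeHmc()
    hmc.login()
    hmc.restart()  # the session is lost here
    print(fetch_with_relogin(hmc.get_metrics, hmc.login))
    print("logins:", hmc.logins)
```

With the real library, the equivalent of `AuthError` would presumably be the `ServerAuthError` exception raised by zhmcclient, and the re-login step would go through the library's session object. Check the zhmcclient documentation for the exact attributes before adapting this.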

Please use these versions and reopen this ticket or create a new one if you still run into this issue.