bb-Ricardo / check_redfish

A monitoring/inventory plugin to check components and health status of systems which support Redfish. It will also create an inventory of all components of a system.
MIT License

ILO4 max retries exhausted #94

Closed bc-networks closed 1 year ago

bc-networks commented 1 year ago

Hi, I am getting 'max retries exhausted' on a few ILO4 servers, all with firmware 2.80, when checking --storage. I noticed that ILO4 may answer a bit slowly, but after setting the timeout to 120 I got the same result. Other parameters such as --fan and --info run fine, also with --detailed. Do you have any suggestion on how to figure out the problem or get it working?

regards

bb-Ricardo commented 1 year ago

Hi,

you could try to use the -v option and see which request takes so long to return.

bc-networks commented 1 year ago

Hi,

It looks like that. I already restarted the ILO, and in the ILO web interface the storage controller looks normal.

2022-08-16 10:40:05,722 - DEBUG: HTTP REQUEST: GET
        PATH: /redfish/v1/Systems/1/SmartStorage/ArrayControllers?$expand=.
        BODY: None
2022-08-16 10:40:05,722 - INFO: Attempt 1 of /redfish/v1/Systems/1//SmartStorage/ArrayControllers?$expand=.
2022-08-16 10:40:05,785 - DEBUG: https://192.168.112.101:8443 "GET /redfish/v1/Systems/1/SmartStorage/ArrayControllers?$expand=. HTTP/1.1" 308 0
2022-08-16 10:40:05,788 - DEBUG: Starting new HTTPS connection (1): 192.168.10.1:443
2022-08-16 10:40:12,796 - INFO: Retrying /redfish/v1/Systems/1//SmartStorage/ArrayControllers?$expand=. [HTTPSConnectionPool(host='192.168.10.1', port=443): Max retries exceeded with url: /redfish/v1/Systems/1/SmartStorage/ArrayControllers/?$expand=. (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f077f809be0>, 'Connection to 192.168.10.1 timed out. (connect timeout=7)'))]

The problem seems to be that the forwarded IP/port gets corrupted because of the urllib3 connection reset ('DEBUG: Starting new HTTPS connection'). The -t option does not seem to have any effect on this.

regards

bb-Ricardo commented 1 year ago

hmm, did you change the default HTTPS port to 8443?

Somehow the iLO answers with a 308 (https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/308) and then urllib3 tries to connect to port 443. I guess this is the redirect target the iLO returned.
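
One way to confirm this is to look at the Location header of the 308 response and compare its host and port with the address that was originally requested. The following is only a minimal sketch: the helper names `endpoint` and `redirect_changes_endpoint` are made up for illustration and are not part of check_redfish, and the absolute Location value used in the example is an assumption reconstructed from the log above (the thread does not show the actual header).

```python
from urllib.parse import urljoin, urlsplit

# Default ports used when a URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def endpoint(url):
    """Return the (host, port) pair a client would connect to for this URL."""
    parts = urlsplit(url)
    return parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme)


def redirect_changes_endpoint(request_url, location):
    """Resolve a Location header against the request URL.

    Returns (changed, target): changed is True when following the redirect
    would connect to a different host or port than the original request.
    """
    target = urljoin(request_url, location)
    return endpoint(request_url) != endpoint(target), target


if __name__ == "__main__":
    request_url = (
        "https://192.168.112.101:8443"
        "/redfish/v1/Systems/1/SmartStorage/ArrayControllers?$expand=."
    )
    # Assumed header value: an absolute URL with the iLO's own address.
    location = (
        "https://192.168.10.1"
        "/redfish/v1/Systems/1/SmartStorage/ArrayControllers/?$expand=."
    )
    print(redirect_changes_endpoint(request_url, location))
```

To get the real header value, the request can be sent without following redirects, for example with `requests.get(url, allow_redirects=False, verify=False)` and then reading `response.headers.get("Location")`. A relative Location would keep the forwarded address and port, while an absolute one containing the iLO's internal address would bypass the port forwarding.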

bc-networks commented 1 year ago

Hi,

No, the port on the ILO is still 443, and other checks like --fan etc. are working fine. So what's the difference with --storage? Is there a Python or urllib3 timeout that triggers the 308 and the 'wrong' redirect? The same setup works with an ILO5 system without any problems.

bb-Ricardo commented 1 year ago

can you check your environment? do you have a proxy configured?
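
A quick way to check the environment for proxy settings is to list the usual proxy variables in the shell that runs the plugin. This is just a sketch under the assumption that a proxy, if present, is configured through the conventional `http_proxy`, `https_proxy`, `all_proxy`, or `no_proxy` variables (in either case); the function name `proxy_settings` is hypothetical.

```python
import os

# Conventional proxy variable names, compared case-insensitively.
PROXY_VARS = ("http_proxy", "https_proxy", "all_proxy", "no_proxy")


def proxy_settings(environ=None):
    """Return the non-empty proxy-related variables found in the environment."""
    env = os.environ if environ is None else environ
    found = {}
    for name, value in env.items():
        if name.lower() in PROXY_VARS and value:
            found[name] = value
    return found


if __name__ == "__main__":
    print(proxy_settings() or "no proxy variables set")
```

Run it as the same user and from the same environment as the monitoring system, since the variables may differ from those in an interactive login shell.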

bb-Ricardo commented 1 year ago

Closing due to almost 3 months of inactivity.