Open stranljip opened 2 years ago
On the clusters where the CSI driver could no longer create new PVs after the token expired, port 5392 on the storage was not open. This port may need to be open for the driver to refresh the token or trigger a re-login, although I could not find anything about that in the docs. After opening the port, refreshing the token seems to work.
@stranljip thanks for confirming. We are investigating whether this is a bug or an accidental feature.
After opening port 5392 to the storage for all clusters, the driver works as expected. I would prefer that all communication go through port 443 in future versions of Nimble OS, but at the very least the required ports should be documented clearly in SCOD.
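For anyone who wants to verify the same thing on their own clusters, here is a minimal sketch of a TCP reachability check for the two ports mentioned in this thread (443 and 5392). The helper names `port_open` and `check_ports` are made up for illustration. A successful TCP connect only shows that the port is reachable from wherever the script runs; it does not prove that token refresh works.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host: str, ports=(443, 5392), timeout: float = 3.0) -> dict:
    """Check each port and return a mapping of port number to reachability."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Calling `check_ports("<management address of the storage>")` from a node or pod in the affected cluster should return `True` for both ports if the network path is open.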
Setup
Description of the Problem
After the initial deployment of the CSI driver, everything works as expected: new volumes can be created and used. Once either 70 minutes have passed (this appears to be the token's expiration duration, as shown below) or the inactivity timeout has elapsed (if it is set to a value lower than 70 min under NimbleUI Administration -> Security -> Inactivity timeout), it is no longer possible to create new PVs, while the existing PVs continue to work. This behavior is easily reproducible by setting the timeout to something like 5 min and redeploying all CSI driver related pods (maybe not all are necessary, but this worked for us), or by waiting 70 min and then attempting to create a new PV/PVC. After longer investigation it seems that
What we observe
The creation of new PVs fails after either the inactivity timeout is triggered or the token expires. The driver stays in this state until the CSI related pods are redeployed. Creating and subsequently deleting a PV/PVC every ~10 min seems to trigger a request for a new token, so that the system keeps working beyond 70 min.
What we expect
The CSI driver keeps the session alive or triggers a reauthentication after the session has terminated (for whatever reason). This must cover refreshing the auth token as well as obtaining a new token in a new session. It should be clear which port is used for this, and preferably only one port should be used for API access. At the very least, the port requirements must be described in SCOD.
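To make the expectation concrete, here is a rough sketch of the behavior described above: refresh the token shortly before it expires, and log in again once if the storage rejects the token (for example after the inactivity timeout). This is not the driver's actual code. `ReauthSession`, `AuthError`, and the `login` and `request` callables are hypothetical names used only for illustration, and the assumption is that `login` returns a token together with its lifetime in seconds.

```python
import time
from typing import Callable, Optional, Tuple


class AuthError(Exception):
    """Raised by the (hypothetical) request callable when the token is rejected."""


class ReauthSession:
    def __init__(
        self,
        login: Callable[[], Tuple[str, float]],   # returns (token, lifetime in seconds)
        clock: Callable[[], float] = time.monotonic,
        margin: float = 60.0,                     # refresh this many seconds early
    ) -> None:
        self._login = login
        self._clock = clock
        self._margin = margin
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def _ensure_token(self) -> str:
        # Log in if there is no token yet or the current one is about to expire.
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, lifetime = self._login()
            self._expires_at = self._clock() + lifetime
        return self._token

    def call(self, request: Callable[[str], object]) -> object:
        try:
            return request(self._ensure_token())
        except AuthError:
            # The session ended on the storage side: drop the token,
            # log in again and retry exactly once.
            self._token = None
            return request(self._ensure_token())
```

With this shape, a volume creation request issued after 70 min or after an inactivity timeout would trigger a fresh login instead of failing until the pods are redeployed.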
Additional information
Logs: failing state - works-not.txt; working state - works.txt
HPE Support case number
04941823
Authentication Token - expiration duration of 4200 sec/70 min, deduced from expiry_time and creation_time
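The deduction is a simple subtraction. The sketch below assumes both fields are timestamps in seconds; the values used are placeholders chosen to match the 4200 sec difference, not the values from the actual token.

```python
def token_lifetime_seconds(creation_time: int, expiry_time: int) -> int:
    """Return the token lifetime in seconds, assuming both fields are in seconds."""
    if expiry_time < creation_time:
        raise ValueError("expiry_time must not be earlier than creation_time")
    return expiry_time - creation_time


# Placeholder timestamps (not taken from the real token).
creation_time = 1_700_000_000
expiry_time = 1_700_004_200

lifetime = token_lifetime_seconds(creation_time, expiry_time)
minutes = lifetime / 60
print(f"{lifetime} sec = {minutes:.0f} min")
```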