hpe-storage / csi-driver

A Container Storage Interface (CSI) driver from HPE
https://scod.hpedev.io
Apache License 2.0

CSI driver needs port 443 and 5392 in order to allow refreshing the token/relogin - documentation/consolidation needed #308

Open stranljip opened 2 years ago

stranljip commented 2 years ago

Setup

Description of the Problem

After the initial deployment of the CSI driver, everything works as expected: new volumes can be created and used. Once either 70 minutes have passed (this appears to be the expiration duration of the token, as shown below) or the inactivity timeout has elapsed (if it is set to a value lower than 70 minutes in the NimbleUI under Administration -> Security -> Inactivity timeout), it is no longer possible to create new PVs, while the existing PVs continue to work. This behavior is easily reproducible by setting the timeout to something like 5 minutes and redeploying all CSI-driver-related pods (maybe not all are necessary, but this worked for us), or by waiting 70 minutes and then attempting to create a new PV/PVC. After longer investigation it seems that

What we observe

The creation of new PVs fails after either the inactivity timeout is triggered or the token expires. The driver stays in this state until the CSI-related pods are redeployed. Creating and then deleting a PV/PVC every ~10 minutes seems to trigger the request for a new token, so the system keeps working for longer than 70 minutes.

What we expect

The CSI driver keeps the session alive or triggers a reauthentication after the session has terminated (for whatever reason). This must cover refreshing the auth token as well as obtaining a new token in a new session. It should be clear which port is used for this, and preferably only one port should be used for API access. At the very least, the port requirements must be described in SCOD.
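The expected behavior can be illustrated with a minimal sketch. It does not show how the CSI driver is implemented: `SessionExpired`, `ReauthenticatingClient`, `login`, and `send` are hypothetical names introduced only for this example. The idea is that a request rejected because of an expired token or a terminated session triggers one fresh login followed by one retry.

```python
class SessionExpired(Exception):
    """Hypothetical error: the backend rejected an expired or unknown token."""


class ReauthenticatingClient:
    """Illustrative wrapper that logs in again when the session has expired.

    login: callable() -> token               (assumption, not a real driver API)
    send:  callable(token, request) -> reply (assumption, raises SessionExpired)
    """

    def __init__(self, login, send):
        self._login = login
        self._send = send
        self._token = None

    def request(self, req):
        # Log in lazily on first use.
        if self._token is None:
            self._token = self._login()
        try:
            return self._send(self._token, req)
        except SessionExpired:
            # Token expired or inactivity timeout hit: get a new token, retry once.
            self._token = self._login()
            return self._send(self._token, req)
```

A second `SessionExpired` on the retry is deliberately not caught, so a persistent failure (for example a blocked port) surfaces to the caller instead of looping.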

Additional information

Logs:
Logs of the failing state - works-not.txt
Logs of the working state - works.txt

HPE Support case number 04941823

Authentication token - expiration duration of 4200 s (70 min), deduced from expiry_time minus creation_time

{
   "data":[
      {
         "id":"19532e249f862567d3000000000000000000007ad4",
         "session_token":"****",
         "username":"general1",
         "app_name":"",
         "source_ip":"127.0.0.1",
         "creation_time":1646892911,
         "last_modified":1646895311,
         "expiry_time":1646897111
      }
   ],
   "start_row":0,
   "end_row":0,
   "total_rows":0
}
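The 70-minute figure can be checked with a short calculation. This is only a sketch of the arithmetic, using the three timestamps from the record above (Unix epoch seconds); the function name `token_lifetime_seconds` is made up for this example.

```python
# Timestamps copied from the session token record above (Unix epoch seconds).
token = {
    "creation_time": 1646892911,
    "last_modified": 1646895311,
    "expiry_time": 1646897111,
}


def token_lifetime_seconds(record):
    """Return the token lifetime: expiry_time minus creation_time."""
    return record["expiry_time"] - record["creation_time"]


lifetime = token_lifetime_seconds(token)
print(f"{lifetime} s = {lifetime / 60:.0f} min")  # prints "4200 s = 70 min"
```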
stranljip commented 2 years ago

On the clusters where the CSI driver could no longer create new PVs after the token had expired, port 5392 of the storage was not open. It might be that this port needs to be open in order to refresh the token or trigger a relogin, although I could not find anything in the docs about that. After opening the port, refreshing the token seems to work.
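Whether both ports are reachable from a cluster node can be checked with a plain TCP connection test. The sketch below uses only the Python standard library and assumes that a successful TCP connect is a sufficient indication that the port is open. It does not exercise the storage API itself, and the helper names are made up for this example.

```python
import socket

# Ports named in this issue: 443 and 5392 on the storage array.
REQUIRED_PORTS = (443, 5392)


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports=REQUIRED_PORTS, timeout=3.0):
    """Map each port to True (reachable) or False (not reachable)."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

Calling `check_ports` with the management address of the storage array returns a dictionary such as `{443: True, 5392: False}` when only port 443 is reachable.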

datamattsson commented 2 years ago

@stranljip thanks for confirming. We are investigating whether this is a bug or an accidental feature.

stranljip commented 2 years ago

After opening port 5392 from all clusters to the storage, the driver works as expected. I would prefer that all communication go through port 443 in future versions of Nimble OS, but at the very least the required ports should be documented clearly in SCOD.