dell / csm

Dell Container Storage Modules (CSM)
Apache License 2.0

[QUESTION]: CSI reverseproxy not installed properly #1429

Closed anandhg02 closed 1 week ago

anandhg02 commented 2 months ago

I am deploying the CSM Operator on source and target OCP 4.14 clusters, each connected to a replicated (SRDF/S) PowerMax array at its respective site.

Following the CSM Replication Module documentation, step "7. Install the CSI driver for your chosen storage platform on the source cluster", I ran oc create -f storage_csm_powermaxv291.yaml on the source cluster.

On the source site, all pods (csipowermax-reverseproxy, powermax-controller, powermax-node) started normally. On the target site, the csipowermax-reverseproxy pod was not created, and the powermax-controller and powermax-node pods are failing with the error "CSI reverseproxy service host or port not found, CSI reverseproxy not installed properly" in the driver container.

My understanding is that I should install the CSI driver only on the source cluster and that the required pods are created automatically on the target cluster. On the target cluster, the controller and node pods were created, but the reverseproxy pod was not.

Pods @ Source Cluster:
NAME                                       READY   STATUS    RESTARTS        AGE
csipowermax-reverseproxy-b6b9c9495-txpwr   1/1     Running   0               4h38m
powermax-controller-58f94798cf-9nm4t       6/6     Running   0               4h38m
powermax-controller-58f94798cf-k88ph       6/6     Running   0               4h38m
powermax-node-dg6j4                        2/2     Running   1 (4h37m ago)   4h38m
powermax-node-tkbq7                        2/2     Running   1 (4h37m ago)   4h38m
powermax-node-zskw5                        2/2     Running   1 (4h37m ago)   4h38m

Pods @ Target Cluster:
NAME                                   READY   STATUS             RESTARTS          AGE
powermax-controller-58f94798cf-h6xmt   1/6     CrashLoopBackOff   275 (91s ago)     4h40m
powermax-controller-58f94798cf-jp54n   1/6     CrashLoopBackOff   275 (101s ago)    4h40m
powermax-node-6mn65                    0/2     CrashLoopBackOff   113 (102s ago)    4h40m
powermax-node-tknjm                    0/2     CrashLoopBackOff   113 (2m53s ago)   4h40m
powermax-node-vvt74                    0/2     CrashLoopBackOff   113 (117s ago)    4h40m
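For anyone comparing the two sites, the pod listings can be checked mechanically. The following is a minimal sketch, not part of CSM: it assumes the default column layout of oc/kubectl get pods (NAME, READY, STATUS first), and the function names and the "reverseproxy" name hint are hypothetical choices made for illustration.

```python
# Sketch: flag unhealthy pods and a missing reverseproxy pod in a
# `kubectl get pods` / `oc get pods` text listing.
# Assumption: NAME, READY and STATUS are the first three columns.

def parse_pods(listing: str) -> list:
    """Return one dict per pod row, skipping the header line."""
    pods = []
    for line in listing.strip().splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        ready, total = fields[1].split("/")
        pods.append({
            "name": fields[0],
            "ready": int(ready),
            "total": int(total),
            "status": fields[2],
        })
    return pods


def diagnose(listing: str, proxy_hint: str = "reverseproxy") -> dict:
    """Report whether a proxy pod is present and which pods are unhealthy."""
    pods = parse_pods(listing)
    return {
        "has_proxy_pod": any(proxy_hint in p["name"] for p in pods),
        "unhealthy": [
            p["name"] for p in pods
            if p["status"] != "Running" or p["ready"] < p["total"]
        ],
    }


# Target-cluster listing copied from this issue.
TARGET_LISTING = """
NAME READY STATUS RESTARTS AGE
powermax-controller-58f94798cf-h6xmt 1/6 CrashLoopBackOff 275 (91s ago) 4h40m
powermax-controller-58f94798cf-jp54n 1/6 CrashLoopBackOff 275 (101s ago) 4h40m
powermax-node-6mn65 0/2 CrashLoopBackOff 113 (102s ago) 4h40m
powermax-node-tknjm 0/2 CrashLoopBackOff 113 (2m53s ago) 4h40m
powermax-node-vvt74 0/2 CrashLoopBackOff 113 (117s ago) 4h40m
"""

if __name__ == "__main__":
    print(diagnose(TARGET_LISTING))
```

Run against the target listing, this reports no reverseproxy pod and marks all five pods as unhealthy, which matches the symptom described above.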

donatwork commented 2 months ago

Did you set DeployAsSidecar to true or false? Please try setting DeployAsSidecar to true.

anandhg02 commented 2 months ago

> Did you set DeployAsSidecar to true or false? Please try setting DeployAsSidecar to true.

Where should this "DeployAsSidecar" parameter be set? In the csireverseproxy config.yaml ConfigMap? I don't see this parameter listed in the sample config.yaml either. I do see the statement below in config.yaml. Is this the parameter to update?

mode: StandAlone # Mode for the reverseproxy, should not be changed

As I mentioned earlier, the csipowermax-reverseproxy pod is installed successfully in the Prod OCP cluster, but the DR OCP cluster does not have this pod.

donatwork commented 2 months ago

The setting is here. There may be a problem with deployment of the sidecar on the remote cluster. We will investigate. Thanks.

donatwork commented 3 weeks ago

Is it possible that step 8 is required? Are the remote host entries in DNS? Without seeing the logs of the target pods I cannot determine what the problem is. The logs from the target namespaces would also be useful.
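A quick way to answer the DNS question is to try resolving each remote host from a machine on the cluster network. The following is a minimal sketch using only the Python standard library; the function name is hypothetical and the hostnames passed in are placeholders, not values from this issue.

```python
# Sketch: check whether a list of hostnames resolves through the local resolver.
import socket


def check_dns(hostnames):
    """Map each hostname to True if it resolves, False otherwise."""
    results = {}
    for host in hostnames:
        try:
            socket.getaddrinfo(host, None)
            results[host] = True
        except socket.gaierror:
            # Raised when the name cannot be resolved.
            results[host] = False
    return results


if __name__ == "__main__":
    # Replace with the FQDNs of the remote cluster and array hosts (placeholder below).
    print(check_dns(["localhost"]))
```

This only confirms name resolution from wherever the script runs. It does not show what the driver pods themselves can resolve, so the pod logs are still needed.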

anandhg02 commented 3 weeks ago

Hi Don, DNS and FQDNs are all configured properly in this environment. Because of urgent project timelines, I went ahead with a Helm-based installation. I am still curious: has the Operator-based installation been fully tested, with the reverseproxy installing successfully?

The documentation should probably also be updated to describe the process flow when installing the CSI driver using the Operator. My understanding is that with the Operator you install at one site only, and the installation is then performed at the target site automatically.

donatwork commented 3 weeks ago

Hi Anandh, I'm glad that you were able to work around the issue. We will investigate as soon as we can. Yes, when using the Operator, the driver should also be installed on the remote site as long as the kubeconfig is configured via repctl. Thanks.

jooseppi-luna commented 1 week ago

@anandhg02 -- based on our investigation, we believe this error might have been due to the fact that the operator is deploying a reverseproxy pod on the target cluster, but not the associated service (kubectl get service -n <driver namespace>). The PR linked above has documentation updates that give a workaround. If you do try operator install again, definitely reference the above and let us know if it resolves your issue! We will be working to make this installation smoother in the future.
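The service check mentioned above can be scripted. The following is a minimal sketch, assuming kubectl is on the PATH and that the reverseproxy Service name contains "reverseproxy" (an assumption based on the pod name, not confirmed in this thread). The function names and the "powermax" namespace are hypothetical.

```python
# Sketch: look for a reverseproxy Service in the driver namespace.
import json
import subprocess


def find_proxy_services(services_json: str, hint: str = "reverseproxy") -> list:
    """Return names of Services whose name contains `hint`."""
    items = json.loads(services_json).get("items", [])
    return [i["metadata"]["name"] for i in items if hint in i["metadata"]["name"]]


def fetch_services(namespace: str) -> str:
    """Return the JSON output of `kubectl get service -n <namespace> -o json`."""
    completed = subprocess.run(
        ["kubectl", "get", "service", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout


if __name__ == "__main__":
    # "powermax" is a placeholder; use the namespace the driver is installed in.
    names = find_proxy_services(fetch_services("powermax"))
    print(names if names else "no reverseproxy Service found")
```

An empty result on the target cluster would be consistent with the diagnosis above: the driver containers cannot find the reverseproxy service host or port because the Service object was never created.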

anandhg02 commented 1 week ago

Thank you, @jooseppi-luna. I'll follow the workaround in any future deployment.

I hope the issue will be fixed in an upcoming CSM release.