IBM / ibm-spectrum-scale-csi

The IBM Spectrum Scale Container Storage Interface (CSI) project enables container orchestrators, such as Kubernetes and OpenShift, to manage the life-cycle of persistent storage.
Apache License 2.0

Failed to authenticate to GUI server and errors in operator log #1040

Closed superleo closed 8 months ago

superleo commented 1 year ago

Describe the bug

Failed to authenticate to GUI server and errors in operator log

How to Reproduce?

  1. A manual curl request to the URL succeeds:
# curl --insecure -u 'user:password' -X GET https://192.168.3.211:30443/scalemgmt/v2/cluster
{
  "cluster" : {
    "clusterSummary" : {
      "clusterId" : 1052530068846563007,
      "clusterName" : "master1.cluster.local",
      "primaryServer" : "master1.cluster.local",
      "rcpPath" : "/usr/bin/scp",
      "rcpSudoWrapper" : false,
      "repositoryType" : "CCR",
      "rshPath" : "/usr/bin/ssh",
      "rshSudoWrapper" : false,
      "uidDomain" : "master1.cluster.local"
    }
  },
  "status" : {
    "code" : 200,
    "message" : "The request finished successfully."
  }
}
  2. I checked the secret by decoding it with base64 and it is correct. I also checked the YAML, which sets SSL to false.
  3. The operator log always shows the authentication request failing:
2023-10-08T08:17:30.113Z    INFO    csiscaleoperator_controller.checkPrerequisite   Secret resource gpfs101sceret found.
2023-10-08T08:17:30.113Z    INFO    csiscaleoperator_controller.Reconcile   Pre-requisite check passed.
2023-10-08T08:17:30.113Z    INFO    csiscaleoperator_controller.ValidateCRParams    Validating the Spectrum Scale CSI configurations of the resource CSIScaleOperator/ibm-spectrum-scale-csi
2023-10-08T08:17:30.113Z    INFO    csiscaleoperator_controller.Reconcile   The Spectrum Scale CSI configurations are validated successfully
2023-10-08T08:17:30.113Z    INFO    csiscaleoperator_controller.handleSpectrumScaleConnectors   Checking spectrum scale connectors
2023-10-08T08:17:30.113Z    INFO    csiscaleoperator_controller.newSpectrumScaleConnector   Creating new SpectrumScaleConnector for cluster with    {"ID": "1052530068846563007"}
2023-10-08T08:17:30.113Z    INFO    csiscaleoperator_controller.newSpectrumScaleConnector   Created Spectrum Scale connector without SSL mode for guiHost(s)
E1008 08:18:30.113987       1 rest_v2.go:1024] [] rest_v2 doHTTP: Error in authentication request on endpoint https://192.168.3.211:30443/: Get "https://192.168.3.211:30443/scalemgmt/v2/cluster": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
E1008 08:18:30.114047       1 rest_v2.go:193] [] Unable to get cluster ID: Get "https://192.168.3.211:30443/scalemgmt/v2/cluster": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2023-10-08T08:18:30.114Z    ERROR   csiscaleoperator_controller.handleSpectrumScaleConnectors   Failed to connect to the GUI of the cluster with ID: 1052530068846563007    {"error": "Get \"https://192.168.3.211:30443/scalemgmt/v2/cluster\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
github.com/IBM/ibm-spectrum-scale-csi/operator/controllers.(*CSIScaleOperatorReconciler).Reconcile
    /workspace/controllers/csiscaleoperator_controller.go:296
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2023-10-08T08:18:30.114Z    ERROR   csiscaleoperator_controller.Reconcile   Error in getting connectors {"error": "Get \"https://192.168.3.211:30443/scalemgmt/v2/cluster\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2023-10-08T08:18:30.114Z    INFO    csiscaleoperator_controller.SetStatus   Assigning values to status sub-resource object.
2023-10-08T08:18:30.114Z    DEBUG   csiscaleoperator_controller.SetStatus   Setting status of CSIScaleOperator is successful
2023-10-08T08:18:30.114Z    DEBUG   events  Warning {"object": {"kind":"CSIScaleOperator","namespace":"ibm-spectrum-scale-csi-driver","name":"ibm-spectrum-scale-csi","uid":"3d2cf54e-2113-428c-a048-8a34944b99bb","apiVersion":"csi.ibm.com/v1","resourceVersion":"14348088"}, "reason": "GUIConnFailed", "message": "Failed to connect to the GUI of the cluster with ID: 1052530068846563007"}
2023-10-08T08:18:30.131Z    DEBUG   csiscaleoperator_controller.Reconcile   Updated resource status.    {"Status": {"versions":[{"name":"ibm-spectrum-scale-csi","version":"2.9.0"}],"conditions":[{"type":"Success","status":"False","lastTransitionTime":"2023-10-08T08:18:30Z","reason":"GUIConnFailed","message":"Failed to connect to the GUI of the cluster with ID: 1052530068846563007"}]}}
2023-10-08T08:18:30.131Z    ERROR   controller.csiscaleoperator Reconciler error    {"reconciler group": "csi.ibm.com", "reconciler kind": "CSIScaleOperator", "name": "ibm-spectrum-scale-csi", "namespace": "ibm-spectrum-scale-csi-driver", "error": "Get \"https://192.168.3.211:30443/scalemgmt/v2/cluster\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227

deeghuge commented 1 year ago

Hi @superleo, the most likely issue is that the GUI host is not reachable from the pod. Please check whether network communication from the pod to the GUI host is working.

superleo commented 1 year ago

> Hi @superleo, the most likely issue is that the GUI host is not reachable from the pod. Please check whether network communication from the pod to the GUI host is working.

Thank you @deeghuge. I pinged the pod IP, and the network from the host to the operator pod is fine. From the log file, it also looks like an "authentication problem", not a network problem.
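
A note on telling the two causes apart: the operator errors quoted in the report end in "context deadline exceeded (Client.Timeout exceeded while awaiting headers)", which is what Go's net/http client reports when no response arrives in time. Rejected credentials would normally come back as an HTTP 401 or 403 status instead. The Go program below is a minimal sketch, not the operator's code, that repeats the curl request with a timeout and labels the outcome. The `diagnose` helper, its messages, and the 10-second timeout are illustrative assumptions.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"net"
	"net/http"
	"os"
	"time"
)

// diagnose (hypothetical helper) maps the outcome of one HTTP request
// to a coarse label: timeout, transport error, rejected credentials, or ok.
func diagnose(status int, err error) string {
	var netErr net.Error
	switch {
	case err != nil && errors.As(err, &netErr) && netErr.Timeout():
		return "timeout: no response; check host, port, and firewall"
	case err != nil:
		return "transport error: " + err.Error()
	case status == http.StatusUnauthorized || status == http.StatusForbidden:
		return "credentials rejected by the server"
	case status == http.StatusOK:
		return "ok"
	default:
		return fmt.Sprintf("unexpected HTTP status %d", status)
	}
}

func main() {
	if len(os.Args) != 4 {
		fmt.Fprintln(os.Stderr, "usage: guicheck <url> <user> <password>")
		os.Exit(2)
	}
	client := &http.Client{
		Timeout: 10 * time.Second, // assumed value, chosen for illustration
		Transport: &http.Transport{
			// Same effect as curl --insecure; acceptable for diagnosis only.
			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		},
	}
	req, err := http.NewRequest(http.MethodGet, os.Args[1], nil)
	if err != nil {
		fmt.Fprintln(os.Stderr, "bad URL:", err)
		os.Exit(2)
	}
	req.SetBasicAuth(os.Args[2], os.Args[3])

	resp, err := client.Do(req)
	status := 0
	if err == nil {
		status = resp.StatusCode
		resp.Body.Close()
	}
	fmt.Println(diagnose(status, err))
}
```

The result is only comparable to the operator's view if the program runs from inside a pod in the driver namespace rather than from the host, since the two can follow different network paths.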

superleo commented 1 year ago

I created a dnsutils pod in the namespace and could ping the GUI server's host IP from it as well:

root@master1:/home/mccxadmin# kubectl get pods -n ibm-spectrum-scale-csi-driver
NAME                                              READY   STATUS    RESTARTS   AGE
dnsutils                                          1/1     Running   0          11s
ibm-spectrum-scale-csi-operator-bddd6d8f7-wjhp7   1/1     Running   0          27h
root@master1:/home/mccxadmin# kubectl exec -i -t dnsutils -n ibm-spectrum-scale-csi-driver -- ping 192.168.3.211
PING 192.168.3.211 (192.168.3.211) 56(84) bytes of data.
64 bytes from 192.168.3.211: icmp_seq=1 ttl=64 time=0.167 ms
64 bytes from 192.168.3.211: icmp_seq=2 ttl=64 time=0.090 ms
64 bytes from 192.168.3.211: icmp_seq=3 ttl=64 time=0.097 ms
64 bytes from 192.168.3.211: icmp_seq=4 ttl=64 time=0.088 ms
64 bytes from 192.168.3.211: icmp_seq=5 ttl=64 time=0.095 ms
64 bytes from 192.168.3.211: icmp_seq=6 ttl=64 time=0.315 ms

superleo commented 1 year ago

I noticed that the host is listening on port 47443. After I changed the guiport in the CSI driver YAML file to 47443, it works now. I don't know why, though: I cannot find any configuration related to port 47443.
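
Since ping only exercises ICMP, a successful ping says nothing about whether a given TCP port accepts connections. The Go program below is a minimal sketch of a per-port check that could be run from a pod to compare the two ports mentioned in this thread. The `probe` function, its result labels, and the 3-second `defaultTimeout` are illustrative assumptions, not part of the CSI driver.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
	"time"
)

// defaultTimeout is an assumed value, chosen for illustration.
const defaultTimeout = 3 * time.Second

// probe (hypothetical helper) attempts one TCP connection to addr
// ("host:port") and returns "open", "refused", "timeout", or "error: ...".
func probe(addr string, timeout time.Duration) string {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err == nil {
		conn.Close()
		return "open"
	}
	var netErr net.Error
	if errors.As(err, &netErr) && netErr.Timeout() {
		return "timeout"
	}
	if errors.Is(err, syscall.ECONNREFUSED) {
		return "refused"
	}
	return "error: " + err.Error()
}

func main() {
	// Defaults are the host and the two ports mentioned in this thread;
	// pass other host:port pairs as arguments to override them.
	targets := []string{"192.168.3.211:30443", "192.168.3.211:47443"}
	if len(os.Args) > 1 {
		targets = os.Args[1:]
	}
	for _, t := range targets {
		fmt.Printf("%s -> %s\n", t, probe(t, defaultTimeout))
	}
}
```

A "timeout" on a port usually points to packets being dropped on the way, a "refused" to nothing listening on that port, and "open" only confirms that a TCP connection can be made, not that the REST API behind it answers.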

deeghuge commented 10 months ago

@superleo, good to see that things are working for you. Regarding "I cannot find any configuration related with 47443 port": do you mean that there is no documentation on how and where to change the port, or something else?

deeghuge commented 8 months ago

Closing since the problem was resolved. Please reopen if there are any other concerns in the same area.