ThinkParQ / beegfs-csi-driver

The BeeGFS Container Storage Interface (CSI) driver provides high performing and scalable storage for workloads running in Kubernetes. 📦 🐝
Apache License 2.0

base64 for `ConnAuthConfig.ConnAuth` #13

Closed zz913922 closed 1 year ago

zz913922 commented 1 year ago

Our conn_auth is random binary bytes, and I couldn't find a way to put it in the csi-beegfs-connauth.yaml file. Would it be possible to support base64 encoding for this field?

e.g.

// ConnAuthConfig associates a ConnAuth with a SysMgmtdHost.
type ConnAuthConfig struct {
        SysMgmtdHost string `json:"sysMgmtdHost"`
        ConnAuth     string `json:"connAuth"`

        // Add new field for configuration
        Encoding string `json:"encoding"`
}

The Encoding field could be either raw or base64.
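As a minimal sketch of how such a field might be consumed (this is not the driver's code; `connAuthBytes` is a hypothetical helper, and treating an empty `Encoding` as raw is an assumption made here to keep existing configurations working):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// ConnAuthConfig mirrors the struct proposed above, including the suggested Encoding field.
type ConnAuthConfig struct {
	SysMgmtdHost string `json:"sysMgmtdHost"`
	ConnAuth     string `json:"connAuth"`
	Encoding     string `json:"encoding"`
}

// connAuthBytes is a hypothetical helper (not part of the driver). It returns
// the exact bytes that would be written to a connAuthFile.
func connAuthBytes(c ConnAuthConfig) ([]byte, error) {
	switch c.Encoding {
	case "", "raw": // assumption: a missing encoding means the value is used as-is
		return []byte(c.ConnAuth), nil
	case "base64":
		return base64.StdEncoding.DecodeString(c.ConnAuth)
	default:
		return nil, fmt.Errorf("unsupported encoding %q for %s", c.Encoding, c.SysMgmtdHost)
	}
}

func main() {
	// "AP+AQQ==" is a made-up example secret containing non-text bytes.
	c := ConnAuthConfig{SysMgmtdHost: "10.11.12.90", ConnAuth: "AP+AQQ==", Encoding: "base64"}
	secret, err := connAuthBytes(c)
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", secret) // hex dump of the decoded secret
}
```

With this shape, entries that omit the field keep their current meaning, and only entries that opt in to base64 are decoded.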

zz913922 commented 1 year ago

I have multiple sysMgmtdHost entries, which is why I can't just use the default template configuration file.

ejweber commented 1 year ago

Thanks for opening this issue @zz913922!

I discussed this with the team this morning and we agree that it would be a good addition. While we generally use UTF-encoded connAuthFiles (written out using a Linux text editor, for example), we recognize that the BeeGFS docs recommend random bytes as you describe.

I played around for a little bit this afternoon, thinking that you could use the !!binary YAML tag to accomplish your goal. Since Go strings are just immutable byte slices, if we could correctly unmarshal the data into a Go string, we would write it out correctly by default. Unfortunately, the YAML parser we use does not support the !!binary tag and will not handle this correctly.
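To illustrate the point about Go strings, here is a small sketch using only the standard library (`decodeToString` is an illustrative name, not a driver function, and the secret is made up). It shows that a string can carry bytes that are not valid UTF-8 and hand them back unchanged:

```go
package main

import (
	"encoding/base64"
	"fmt"
	"unicode/utf8"
)

// decodeToString is an illustrative helper. Converting []byte to string copies
// the bytes verbatim; no character-set conversion takes place.
func decodeToString(encoded string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	s, err := decodeToString("AP+AQQ==") // made-up binary secret
	if err != nil {
		panic(err)
	}
	// The string holds all decoded bytes even though it is not valid UTF-8.
	fmt.Println(len(s), utf8.ValidString(s))
	// Re-encoding the string's bytes yields the original base64 text.
	fmt.Println(base64.StdEncoding.EncodeToString([]byte(s)))
}
```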

We probably should have required a base64-encoded value here to begin with. Now that we have a history of accepting UTF-encoded values (our unmarshaler assumes it), your suggestion is a good one for maintaining backwards compatibility. This change will not make the v1.4.0 release (expected in about a week), but it is one of the first things we will look at doing for the follow-up release.

zz913922 commented 1 year ago

@ejweber Thanks for your swift response! This feature would absolutely help us a lot 👏

titansmc commented 1 year ago

Hi, is there an option to pass a connAuthFile to the deployment so that I can manually maintain my binary secret file? The issue is that we already deployed a binary secret on our BeeGFS infrastructure and now the k8s CSI driver is failing. What workaround can I use?

ckrizek commented 1 year ago

Hi @titansmc, I’ve done a little bit of experimenting and think there is a workaround until we get these changes in. I tested deploying with the file location instead of utilizing the csi-beegfs-connauth.yaml and our workflows appear to work. I did the following.

  1. I created a connAuthFile using the method outlined in BeeGFS Authentication, copied it to every node running BeeGFS services or BeeGFS clients, and placed the path in all connAuthFile fields of the BeeGFS service's .conf files.

  2. In csi-beegfs-config.yaml I placed the following to specify the path for the connAuthFile.

fileSystemSpecificConfigs:
  - sysMgmtdHost: systemMgmtdHost-IP
    config:
      beegfsClientConf:
        connAuthFile: "/etc/beegfs/connAuthFile"

  3. I left the csi-beegfs-connauth.yaml blank.

I haven't tested all driver functionality for this, but I hope this is something you can utilize while we implement this fix.
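For anyone who still needs to create the secret from step 1, the following is a rough sketch of generating a random-byte connAuthFile in Go. The 128-byte length, the temporary output directory, and the helper name `newSecret` are assumptions for illustration; follow the BeeGFS Authentication documentation for the recommended procedure and the final location.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"os"
	"path/filepath"
)

// newSecret returns n cryptographically random bytes.
func newSecret(n int) ([]byte, error) {
	secret := make([]byte, n)
	if _, err := rand.Read(secret); err != nil {
		return nil, err
	}
	return secret, nil
}

func main() {
	secret, err := newSecret(128) // assumption: 128 bytes
	if err != nil {
		panic(err)
	}
	// Write to a temporary directory here; on a real node the file would be
	// copied to the path referenced by connAuthFile.
	dir, err := os.MkdirTemp("", "connauth")
	if err != nil {
		panic(err)
	}
	path := filepath.Join(dir, "connAuthFile")
	// 0400: readable by the owner only.
	if err := os.WriteFile(path, secret, 0o400); err != nil {
		panic(err)
	}
	fmt.Printf("wrote %d random bytes to %s\n", len(secret), path)
}
```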

titansmc commented 1 year ago

Thanks. I tried what you suggested and it didn't work.

rpc error: code = Internal desc = beegfs-ctl failed with stdOut:  and stdErr: \nUnrecoverable error: No connAuthFile configured. Using BeeGFS without connection authentication is considered insecure and is not recommended. If you really want or need to run BeeGFS without connection authentication, please set connDisableAuthentication to true.

I placed the connAuthFile on all the nodes, but I cannot check whether it is being mounted inside the CSI pod. If I describe the pod, I still see this:

  config-dir:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      csi-beegfs-config-m9m7tb9hmg
    Optional:  false
  connauth-dir:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  csi-beegfs-connauth-2bcf9thg99
    Optional:    false

This is the configmap being mounted:

09:41 # kubectl get configmaps -n kube-system csi-beegfs-config-m9m7tb9hmg -oyaml
apiVersion: v1
data:
  csi-beegfs-config.yaml: |
    # Copyright 2021 NetApp, Inc. All Rights Reserved.
    # Licensed under the Apache License, Version 2.0.

    # Use this file as instructed in the General Configuration section of /docs/deployment.md. See
    # /deploy/k8s/examples/csi-beegfs-config.yaml for a worst-case example of what to put in this file. Kustomize will
    # automatically transform this file into a correct ConfigMap readable by the deployed driver. If this file is left
    # unmodified, the driver will deploy correctly with no custom configuration.
    fileSystemSpecificConfigs:
      - sysMgmtdHost: 10.11.12.90
        config:
          beegfsClientConf:
            connAuthFile: "/etc/beegfs/beegfs.secret"

and the connauth-dir is just a Secret containing a fully commented-out file.

What do you mean by "placed the path in all connAuthFile fields of the BeeGFS service's .conf files" in point number 1? Do you mean configuring the BeeGFS servers? If so, that is already done and working. What I am doing now is connecting Kubernetes to the already working BeeGFS cluster.

titansmc commented 1 year ago

So this is the file inside the container:

[root@ops-k1n27 ~]# cat /proc/3947/root/csi/config/csi-beegfs-config.yaml
# Copyright 2021 NetApp, Inc. All Rights Reserved.
# Licensed under the Apache License, Version 2.0.

# Use this file as instructed in the General Configuration section of /docs/deployment.md. See
# /deploy/k8s/examples/csi-beegfs-config.yaml for a worst-case example of what to put in this file. Kustomize will
# automatically transform this file into a correct ConfigMap readable by the deployed driver. If this file is left
# unmodified, the driver will deploy correctly with no custom configuration.
fileSystemSpecificConfigs:
  - sysMgmtdHost: 10.11.12.90
    config:
      beegfsClientConf:
        connAuthFile: "/etc/beegfs/beegfs.secret"

But the problem is that /etc/beegfs/beegfs.secret doesn't get mounted inside it. Should I make it available inside the container? If so, how should I do it?

ckrizek commented 1 year ago

@titansmc Sorry that didn't work for you. Let me do a little more experimenting, discuss this with the team, and get back to you. Sorry about the delay.

ejweber commented 1 year ago

Hello @titansmc. I think @ckrizek's suggestion sounds solid, so I'm a bit surprised it doesn't work. He and I will talk through it a bit more to understand the potential differences between his test environment and yours.

For some clarification, the driver executes beegfs-ctl in such a way that it essentially operates outside the driver container (this is part of the reason why beegfs-utils must be installed on all nodes). At the point where things are failing for you (assuming this is in volume creation), we should be attempting to execute something like "beegfs-ctl --cfgFile=/var/lib/kubelet/plugins/beegfs.csi.netapp.com/__pvc- --unmounted --createdir ..." essentially directly on the host. The cfgFile at "/var/lib/kubelet..." should contain an absolute path to the connAuthFile you want to use, and everything should just "work" without any extra mounts inside the driver container.

For us to pursue this further, it would probably be helpful to have some more logs. I'm assuming this is failing during volume creation, so the controller service logs are potentially interesting. At the default log level (3), we should at least get a message like "Executing command..." followed by the exact beegfs-ctl command that is being executed.

Can you provide the driver version and beegfs-client version you are using?

Please also confirm that the contents of csi-beegfs-connauth.yaml are empty for the overlay you are using. Specifying a connAuth in this file will cause the driver to write out a connAuthFile and override the connAuthFile path you are attempting to use for this workaround.

titansmc commented 1 year ago

The csi-beegfs-connauth.yaml file is commented out. I have manually edited /etc/beegfs/beegfs-client.conf and /etc/beegfs/beegfs-helperd.conf and set connAuthFile = /etc/beegfs/beegfs.secret on every host; the secret is also present on every host. I am not using BeeGFS for volume creation, only for mounting an existing PV/PVC that I created manually.

16:03 # kubectl get pv beegfs-scratch-pv -oyaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2021-11-23T10:51:21Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: beegfs-scratch-pv
  resourceVersion: "489706205"
  uid: 44d8d2a6-4743-4a86-b6e7-8a796261442b
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 1238489897526886400m
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: beegfs-scratch-pvc
    namespace: jupyterhub
    resourceVersion: "489706203"
    uid: 2d93ddec-cfd2-4029-b34b-859a0a34c749
  csi:
    driver: beegfs.csi.netapp.com
    volumeHandle: beegfs://10.11.12.90/
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
status:
  phase: Bound

16:03 # kubectl get pvc -n jupyterhub beegfs-scratch-pvc -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
  creationTimestamp: "2021-11-23T10:51:25Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: beegfs-scratch-pvc
  namespace: jupyterhub
  resourceVersion: "489706251"
  uid: 2d93ddec-cfd2-4029-b34b-859a0a34c749
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1238489897526886400m
  storageClassName: ""
  volumeMode: Filesystem
  volumeName: beegfs-scratch-pv
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 1238489897526886400m
  phase: Bound

This is what I see in the controller logs:

I0113 12:00:12.601035       1 config.go:63]  "msg"="Raw configuration parsed" "goroutine"="main" "parsePath"="/csi/config/csi-beegfs-config.yaml" "rawConfig"={"config":{},"fileSystemSpecificConfigs":[{"sysMgmtdHost":"10.11.12.90","config":{"beegfsClientConf":{"connAuthFile":"/etc/beegfs/beegfs.secret"}}}]}
I0113 12:00:12.601618       1 config.go:187]  "msg"="WARNING: Unsupported beegfs configuration option found and left in config" "goroutine"="main" "unsupportedOption"="connAuthFile" "unsupportedValue"="/etc/beegfs/beegfs.secret"

This is my layout:

✔ ~/k8s/ops-cluster/storage/beegfs-csi-driver [v1.4.0|…3] 
16:09 # cat deploy/k8s/overlays/its-ops/csi-beegfs-config.yaml 
# Copyright 2021 NetApp, Inc. All Rights Reserved.
# Licensed under the Apache License, Version 2.0.

# Use this file as instructed in the General Configuration section of /docs/deployment.md. See
# /deploy/k8s/examples/csi-beegfs-config.yaml for a worst-case example of what to put in this file. Kustomize will
# automatically transform this file into a correct ConfigMap readable by the deployed driver. If this file is left
# unmodified, the driver will deploy correctly with no custom configuration.
fileSystemSpecificConfigs:
  - sysMgmtdHost: 10.11.12.90
    config:
      beegfsClientConf:
        connAuthFile: "/etc/beegfs/beegfs.secret"
✔ ~/k8s/ops-cluster/storage/beegfs-csi-driver [v1.4.0|…3]

I am currently running CentOS 7 with BeeGFS client v7.2.8.

ejweber commented 1 year ago

Thanks @titansmc. Some of that is definitely helpful.

I was not aware you are using the static provisioning workflow (simply trying to mount pre-existing BeeGFS directories). In light of this, any suggestions I've given that specifically pertain to the controller service are likely erroneous.

The warning you see ("WARNING: Unsupported beegfs configuration option found and left in config" "goroutine"="main" "unsupportedOption"="connAuthFile" "unsupportedValue"="/etc/beegfs/beegfs.secret") is to be expected, and it is likely near the top of your node service logs as well. The driver typically overwrites the value of connAuthFile in the beegfs-client.conf it creates, so we consider setting that value in csi-beegfs-config.yaml to be "unsupported" (though it should still work the way you are attempting to do it).

@ckrizek will do another recreate following the static provisioning workflow to ensure he can get it working in the way he originally suggested. In the meantime, there is a debugging workflow we'd follow on the driver development side that I can suggest if you're interested.

The driver node service writes a beegfs-client.conf (as well as other BeeGFS configuration files) for each PersistentVolume to a directory on the Kubernetes node's file system. Depending on your version of Kubernetes, you can find these files at /var/lib/kubelet/plugins/kubernetes.io/csi/... While the driver is failing to mount the BeeGFS file system, you should be able to examine the actual PersistentVolume-specific beegfs-client.conf it is using. You can even try to use it to mount the BeeGFS file system yourself (or use beegfs-ctl) if you have sufficient privileges.

mount -t beegfs -ocfgFile=<absolute_path_to_cfgFile> beegfs_nodev /mnt/beegfs
# or
beegfs-ctl --cfgFile=<absolute_path_to_cfgFile> --getentryinfo /

The major questions are, "Does this beegfs-client.conf file correctly point to /etc/beegfs/beegfs.secret?" and "Can this beegfs-client.conf file be used to mount BeeGFS outside of the driver attempting to do it itself?"
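The first question can also be checked with a small script. The sketch below is not a driver tool; `connAuthFileFrom` is a hypothetical helper, and it assumes the usual BeeGFS "key = value" layout with "#" comments. Pass it the path of the PersistentVolume-specific beegfs-client.conf; without an argument it runs on an inline sample.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// connAuthFileFrom scans "key = value" text and returns the value of the first
// connAuthFile entry, ignoring blank lines and "#" comments.
func connAuthFileFrom(conf string) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(conf))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if found && strings.TrimSpace(key) == "connAuthFile" {
			return strings.TrimSpace(value), true
		}
	}
	return "", false
}

func main() {
	// Inline sample, used only when no file path is given.
	conf := "# sample\nconnAuthFile = /etc/beegfs/beegfs.secret\n"
	if len(os.Args) > 1 {
		data, err := os.ReadFile(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		conf = string(data)
	}
	path, ok := connAuthFileFrom(conf)
	if !ok || path == "" {
		fmt.Println("connAuthFile is not set")
		return
	}
	// Run this on the host, since that is where beegfs-ctl resolves the path.
	if _, err := os.Stat(path); err != nil {
		fmt.Printf("connAuthFile = %s, but it cannot be accessed: %v\n", path, err)
		return
	}
	fmt.Printf("connAuthFile = %s (exists)\n", path)
}
```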