ceph / ceph-csi

CSI driver for Ceph
Apache License 2.0

Helm chart can't transmit the config changes #2713

Closed: humpalu closed this issue 2 years ago

humpalu commented 2 years ago

Describe the bug

The Helm values are not transmitted to the pod.

Environment details

Steps to reproduce

Steps to reproduce the behavior:

helm install -f values_csi_cephfs.yaml --namespace "ceph-csi-cephfs" "ceph-csi-cephfs" ceph-csi/ceph-csi-cephfs

Actual results

The mounts are stuck in a pending state. For troubleshooting, I logged into one of the provisioner containers with the following command:

kubectl exec -i -t -n ceph-csi-cephfs ceph-csi-cephfs-provisioner-74fff76cc6-j62cb --container csi-cephfsplugin -- /bin/bash

I was surprised that the following files were not updated:

/etc/ceph/ceph.conf
/etc/ceph/ceph.client.admin.keyring

After I updated the files manually, the following command successfully mounted the remote share:

ceph-fuse -m 10.xx.xx.xx:6789 /mnt/cephfs/application1/ --no-mon-config
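
As a cross-check, the ceph.conf that the chart rendered into its ConfigMap can be printed and compared with the file inside the container (a sketch; ceph-config is the chart's default ConfigMap name):

kubectl get cm ceph-config -n ceph-csi-cephfs -o jsonpath='{.data.ceph\.conf}'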

Here is my values_csi_cephfs.yaml

---
rbac:
  # Specifies whether RBAC resources should be created
  create: true

serviceAccounts:
  nodeplugin:
    # Specifies whether a ServiceAccount should be created
    create: true
    # The name of the ServiceAccount to use.
    # If not set and create is true, a name is generated using the fullname
    name:
  provisioner:
    # Specifies whether a ServiceAccount should be created
    create: true
    # The name of the ServiceAccount to use.
    # If not set and create is true, a name is generated using the fullname
    name:

# Configuration for the CSI to connect to the cluster
# Ref: https://github.com/ceph/ceph-csi/blob/devel/examples/README.md
# Example:
# csiConfig:
#   - clusterID: "<cluster-id>"
#     monitors:
#       - "<MONValue1>"
#       - "<MONValue2>"
#     cephFS:
#       subvolumeGroup: "csi"
#csiConfig: []

csiConfig:
  - clusterID: "518b8dff-0cfe-4758-8aea-8ccf523bb89f1"
    monitors:
      - "10.xx.xx.yy:3300"
      - "10.xx.xx.zz:3300"
      - "10.xx.xx.xx:3300"
    cephFS:
      subvolumeGroup: "TestVol"

# Set logging level for csi containers.
# Supported values from 0 to 5. 0 for general useful logs,
# 5 for trace level verbosity.
logLevel: 5

nodeplugin:
  name: nodeplugin
  # if you are using ceph-fuse client set this value to OnDelete
  updateStrategy: RollingUpdate

  # set user created priorityclassName for csi plugin pods. default is
  # system-node-critical which is highest priority
  priorityClassName: system-node-critical

  httpMetrics:
    # Metrics only available for cephcsi/cephcsi => 1.2.0
    # Specifies whether http metrics should be exposed
    enabled: true
    # The port of the container to expose the metrics
    containerPort: 8081

    service:
      # Specifies whether a service should be created for the metrics
      enabled: true
      # The port to use for the service
      servicePort: 8080
      type: ClusterIP

      # Annotations for the service
      # Example:
      # annotations:
      #   prometheus.io/scrape: "true"
      #   prometheus.io/port: "9080"
      annotations: {}

      clusterIP: ""

      ## List of IP addresses at which the stats-exporter service is available
      ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
      ##
      externalIPs: []

      loadBalancerIP: ""
      loadBalancerSourceRanges: []

  profiling:
    enabled: false

  registrar:
    image:
      repository: k8s.gcr.io/sig-storage/csi-node-driver-registrar
      tag: v2.3.0
      pullPolicy: IfNotPresent
    resources: {}

  plugin:
    image:
      repository: quay.io/cephcsi/cephcsi
      tag: canary
      pullPolicy: IfNotPresent
    resources: {}

  nodeSelector: {}

  tolerations: []

  affinity: {}

  # Set to true to enable Ceph Kernel clients
  # on kernel < 4.17 which support quotas
  # forcecephkernelclient: true

  # If true, create & use Pod Security Policy resources
  # https://kubernetes.io/docs/concepts/policy/pod-security-policy/
  podSecurityPolicy:
    enabled: false

provisioner:
  name: provisioner
  replicaCount: 3
  strategy:
    # RollingUpdate strategy replaces old pods with new ones gradually,
    # without incurring downtime.
    type: RollingUpdate
    rollingUpdate:
      # maxUnavailable is the maximum number of pods that can be
      # unavailable during the update process.
      maxUnavailable: 50%
  # Timeout for waiting for creation or deletion of a volume
  timeout: 60s

  # set user created priorityclassName for csi provisioner pods. default is
  # system-cluster-critical which is less priority than system-node-critical
  priorityClassName: system-cluster-critical

  httpMetrics:
    # Metrics only available for cephcsi/cephcsi => 1.2.0
    # Specifies whether http metrics should be exposed
    enabled: true
    # The port of the container to expose the metrics
    containerPort: 8081

    service:
      # Specifies whether a service should be created for the metrics
      enabled: true
      # The port to use for the service
      servicePort: 8080
      type: ClusterIP

      # Annotations for the service
      # Example:
      # annotations:
      #   prometheus.io/scrape: "true"
      #   prometheus.io/port: "9080"
      annotations: {}

      clusterIP: ""

      ## List of IP addresses at which the stats-exporter service is available
      ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
      ##
      externalIPs: []

      loadBalancerIP: ""
      loadBalancerSourceRanges: []

  profiling:
    enabled: false

  provisioner:
    image:
      repository: k8s.gcr.io/sig-storage/csi-provisioner
      tag: v3.0.0
      pullPolicy: IfNotPresent
    resources: {}

  attacher:
    name: attacher
    enabled: true
    image:
      repository: k8s.gcr.io/sig-storage/csi-attacher
      tag: v3.3.0
      pullPolicy: IfNotPresent
    resources: {}

  resizer:
    name: resizer
    enabled: true
    image:
      repository: k8s.gcr.io/sig-storage/csi-resizer
      tag: v1.3.0
      pullPolicy: IfNotPresent
    resources: {}

  snapshotter:
    image:
      repository: k8s.gcr.io/sig-storage/csi-snapshotter
      tag: v4.2.0
      pullPolicy: IfNotPresent
    resources: {}

  nodeSelector: {}

  tolerations: []

  affinity: {}

  # If true, create & use Pod Security Policy resources
  # https://kubernetes.io/docs/concepts/policy/pod-security-policy/
  podSecurityPolicy:
    enabled: false

topology:
  # Specifies whether topology based provisioning support should
  # be exposed by CSI
  enabled: false
  # domainLabels define which node labels to use as domains
  # for CSI nodeplugins to advertise their domains
  # NOTE: the value here serves as an example and needs to be
  # updated with node labels that define domains of interest
  domainLabels:
    - failure-domain/region
    - failure-domain/zone

storageClass:
  # Specifies whether the Storage class should be created
  create: true
  name: csi-cephfs-sc
  # Annotations for the storage class
  # Example:
  # annotations:
  #   storageclass.kubernetes.io/is-default-class: "true"
  annotations: {}

  # String representing a Ceph cluster to provision storage from.
  # Should be unique across all Ceph clusters in use for provisioning,
  # cannot be greater than 36 bytes in length, and should remain immutable for
  # the lifetime of the StorageClass in use.
  clusterID: 518b8dff-0cfe-4758-8aea-8ccf523bb89f1
  # (required) CephFS filesystem name into which the volume shall be created
  # eg: fsName: myfs
  fsName: cephfs
# 518b8dff-0cfe-4758-8aea-8ccf523bb89f
  # (optional) Ceph pool into which volume data shall be stored
  pool: cephfs_data
  # For eg:
  # pool: "replicapool"
  #  pool: ""
  # (optional) Comma separated string of Ceph-fuse mount options.
  # For eg:
  # fuseMountOptions: debug
  #fuseMountOptions: ""
  # (optional) Comma separated string of Cephfs kernel mount options.
  # Check man mount.ceph for mount options. For eg:
  # kernelMountOptions: readdir_max_bytes=1048576,norbytes
 # kernelMountOptions: ""
  # (optional) The driver can use either ceph-fuse (fuse) or
  # ceph kernelclient (kernel).
  # If omitted, default volume mounter will be used - this is
  # determined by probing for ceph-fuse and mount.ceph
  # mounter: kernel
  mounter: fuse
  # (optional) Prefix to use for naming subvolumes.
  # If omitted, defaults to "csi-vol-".
  # volumeNamePrefix: "foo-bar-"
#  volumeNamePrefix: ""
  # The secrets have to contain user and/or Ceph admin credentials.
  provisionerSecret: csi-cephfs-secret
  # If the Namespaces are not specified, the secrets are assumed to
  # be in the Release namespace.
  provisionerSecretNamespace: ""
  controllerExpandSecret: csi-cephfs-secret
  controllerExpandSecretNamespace: ""
  nodeStageSecret: csi-cephfs-secret
  nodeStageSecretNamespace: ""
  reclaimPolicy: Delete
  allowVolumeExpansion: true
  mountOptions: []
  # Mount Options
  # Example:
  # mountOptions:
  #   - discard

secret:
  # Specifies whether the secret should be created
  create: true
  name: csi-cephfs-secret
  # Key values correspond to a user name and its key, as defined in the
  # ceph cluster. User ID should have required access to the 'pool'
  # specified in the storage class
  adminID: admin
  adminKey: "XXX"
# This is a sample configmap that helps define a Ceph configuration as required
# by the CSI plugins.
# Sample ceph.conf available at
# https://github.com/ceph/ceph/blob/master/src/sample.ceph.conf Detailed
# documentation is available at
# https://docs.ceph.com/en/latest/rados/configuration/ceph-conf/
cephconf: |
  [client.rgw.ceprgw01]
  rgw dns name = objects.dev.xx
  rgw num rados handles = 1
  rgw realm = xx
  rgw thread pool size = 2048
  rgw zone = moi
  rgw zonegroup = poc

  [client.rgw.ceprgw02]
  rgw dns name = objects.dev.xx
  rgw num rados handles = 1
  rgw realm = xx
  rgw thread pool size = 2048
  rgw zone = moi
  rgw zonegroup = poc
  # Please do not change this file directly since it is managed by Ansible and will be overwritten
  [global]
  cluster network = xx.xx.xx.xx/28,yy.yy.yy.yy/28
  fsid = 518b8dff-0cfe-4758-8aea-8ccf523bb89f
  mon host = [v2:1xx.xx.xx.xx:3300,v1:xx.xx.xx.xx:6789],[v2:xx.xx.xx.yy:3300,v1:xx.xx.xx.yy:6789],[v2:xx.xx.xx.zz:3300,v1:xx.xx.xx.zz:6789]
  mon initial members = cemont01,cemont02,cemont03
  mon max pg per osd = 400
  mon osd down out interval = 28800
  mon pg warn max object skew = -1
  osd pool default crush rule = -1
  osd pool default min size = 2
  osd pool default size = 3
  public network = xxx
  rgw dynamic resharding = False
  rocksdb cache size = 2147483648

  [mds]
  mds cache memory limit = 109951162778

  [mds.cepmds01]
  mds standby for rank = 0
  mds standby replay = True

  [mds.cepmds02]
  mds standby for rank = 0
  mds standby replay = True

  [osd]
  bluestore cache autotune = 0
  bluestore cache kv ratio = 0.2
  bluestore cache meta ratio = 0.8
  bluestore cache size ssd = 8G
  bluestore csum type = none
  bluestore extent map shard max size = 200
  bluestore extent map shard min size = 50
  bluestore extent map shard target size = 100
  bluestore min alloc size ssd = 4096
  bluestore rocksdb options = compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB
  ms async max op threads = 10
  ms async op threads = 6
  osd client watch timeout = 15
  osd deep scrub interval = 604800
  osd heartbeat grace = 20
  osd heartbeat interval = 5
  osd memory target = 12884901888
  osd op num threads per shard hdd = 8
  osd op num threads per shard ssd = 8
  osd scrub begin hour = 21
  osd scrub end hour = 6
  osd scrub max interval = 604800
  osd scrub min interval = 259200

#########################################################
# Variables for 'internal' use please use with caution! #
#########################################################

# The filename of the provisioner socket
provisionerSocketFile: csi-provisioner.sock
# The filename of the plugin socket
pluginSocketFile: csi.sock
# kubelet working directory,can be set using `--root-dir` when starting kubelet.
kubeletDir: /var/lib/kubelet
# Name of the csi-driver
driverName: cephfs.csi.ceph.com
# Name of the configmap used for state
configMapName: ceph-csi-config
# Key to use in the Configmap if not config.json
# configMapKey:
# Use an externally provided configmap
externallyManagedConfigmap: false
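
To confirm that the cephconf block above actually reaches the rendered manifests, the chart output can also be inspected locally before installing (a sketch using standard Helm flags; release and chart names taken from the install command above):

helm template ceph-csi-cephfs ceph-csi/ceph-csi-cephfs --namespace ceph-csi-cephfs -f values_csi_cephfs.yaml | grep -A 5 'ceph.conf:'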

Error in the provisioner pod when I try to create the PVC. A standard pod restart doesn't help:


E1215 11:06:17.457773       1 controller.go:956] error syncing claim "8d8e71bc-9040-4355-a9b1-74d6990262f5": failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-8d8e71bc-9040-4355-a9b1-74d6990262f5 already exists
I1215 11:06:17.457801       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"csi-vol-cephfs", UID:"8d8e71bc-9040-4355-a9b1-74d6990262f5", APIVersion:"v1", ResourceVersion:"34543832", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-8d8e71bc-9040-4355-a9b1-74d6990262f5 already exists

Expected behavior

PVC creation works.

Logs

If the issue is in PVC creation, deletion, or cloning, please attach complete logs of the containers below.
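
For example, the relevant logs could be collected like this (a sketch; the pod name is a placeholder, csi-cephfsplugin is the container name used above, and csi-provisioner is the external-provisioner sidecar in the same pod):

kubectl logs -n ceph-csi-cephfs <provisioner-pod> -c csi-provisioner
kubectl logs -n ceph-csi-cephfs <provisioner-pod> -c csi-cephfsplugin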

Madhu-1 commented 2 years ago

@tothger can you please exec into the csi-cephfsplugin container and paste the /etc/ceph/ceph.conf contents?
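
For example (the pod name is just a placeholder):

kubectl exec -it -n ceph-csi-cephfs <provisioner-pod> -c csi-cephfsplugin -- cat /etc/ceph/ceph.conf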

Madhu-1 commented 2 years ago

10.xx.xx.yy:3300

Instead of the v2 port (3300), can you check whether using port 6789 (the v1 port) helps?
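
The monitors list in values_csi_cephfs.yaml would then look roughly like this (a sketch; addresses anonymized as in the report above):

csiConfig:
  - clusterID: "518b8dff-0cfe-4758-8aea-8ccf523bb89f1"
    monitors:
      - "10.xx.xx.yy:6789"
      - "10.xx.xx.zz:6789"
      - "10.xx.xx.xx:6789"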

humpalu commented 2 years ago

@Madhu-1 Here is the /etc/ceph/ceph.conf content from the csi-cephfsplugin container:

# Workaround for http://tracker.ceph.com/issues/23446
fuse_set_user_groups = false

# ceph-fuse which uses libfuse2 by default has write buffer size of 2KiB
# adding 'fuse_big_writes = true' option by default to override this limit
# see https://github.com/ceph/ceph-csi/issues/1928
fuse_big_writes = true



- No, if I change the port to 6789 it still does not work.
- I still don't understand it: if I manually change the /etc/ceph/ceph.conf content in the csi-cephfsplugin container (as described above), the fuse mount works from inside the container (with the admin keyring added as well, of course).
github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

Madhu-1 commented 2 years ago

@tothger I have verified with the local canary helm charts that it is working fine. Please use the 3.5.0 helm charts, as we don't have support for a custom ceph.conf in the 3.4.0 release.
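
Moving the existing release to the 3.5.0 chart might look roughly like this (a sketch; release and namespace names taken from the install command above, --version selects the chart version):

helm repo update
helm upgrade --install ceph-csi-cephfs ceph-csi/ceph-csi-cephfs --namespace ceph-csi-cephfs -f values_csi_cephfs.yaml --version 3.5.0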

[🎩︎]mrajanna@fedora ceph-csi $]kubectl get ns
NAME              STATUS   AGE
ceph-csi-cephfs   Active   82s
default           Active   2m7s
kube-node-lease   Active   2m9s
kube-public       Active   2m9s
kube-system       Active   2m9s
[🎩︎]mrajanna@fedora ceph-csi $]kubectl get po -nceph-csi-cephfs
NAME                                                      READY   STATUS              RESTARTS   AGE
ceph-csi-cephfs-1642755305-nodeplugin-5zn7m               3/3     Running             0          73s
ceph-csi-cephfs-1642755305-provisioner-66fb946f78-4xwp4   0/6     Pending             0          73s
ceph-csi-cephfs-1642755305-provisioner-66fb946f78-kcdlk   0/6     ContainerCreating   0          73s
ceph-csi-cephfs-1642755305-provisioner-66fb946f78-wj6gn   0/6     Pending             0          73s
[🎩︎]mrajanna@fedora ceph-csi $]kubectl exec -it ceph-csi-cephfs-1642755305-nodeplugin-5zn7m -c csi-cephfsplugin -nceph-csi-cephfs -- sh
sh-4.4# cat /etc/ceph/ceph.conf 
[client.rgw.ceprgw01]
rgw dns name = objects.dev.xx
rgw num rados handles = 1
rgw realm = xx
rgw thread pool size = 2048
rgw zone = moi
rgw zonegroup = poc

[client.rgw.ceprgw02]
rgw dns name = objects.dev.xx
rgw num rados handles = 1
rgw realm = xx
rgw thread pool size = 2048
rgw zone = moi
rgw zonegroup = poc
# Please do not change this file directly since it is managed by Ansible and will be overwritten
[global]
cluster network = xx.xx.xx.xx/28,yy.yy.yy.yy/28
fsid = 518b8dff-0cfe-4758-8aea-8ccf523bb89f
mon host = [v2:1xx.xx.xx.xx:3300,v1:xx.xx.xx.xx:6789],[v2:xx.xx.xx.yy:3300,v1:xx.xx.xx.yy:6789],[v2:xx.xx.xx.zz:3300,v1:xx.xx.xx.zz:6789]
mon initial members = cemont01,cemont02,cemont03
mon max pg per osd = 400
mon osd down out interval = 28800
mon pg warn max object skew = -1
osd pool default crush rule = -1
osd pool default min size = 2
osd pool default size = 3
public network = xxx
rgw dynamic resharding = False
rocksdb cache size = 2147483648

[mds]
mds cache memory limit = 109951162778

[mds.cepmds01]
mds standby for rank = 0
mds standby replay = True

[mds.cepmds02]
mds standby for rank = 0
mds standby replay = True

[osd]
bluestore cache autotune = 0
bluestore cache kv ratio = 0.2
bluestore cache meta ratio = 0.8
bluestore cache size ssd = 8G
bluestore csum type = none
bluestore extent map shard max size = 200
bluestore extent map shard min size = 50
bluestore extent map shard target size = 100
bluestore min alloc size ssd = 4096
bluestore rocksdb options = compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB
ms async max op threads = 10
ms async op threads = 6
osd client watch timeout = 15
osd deep scrub interval = 604800
osd heartbeat grace = 20
osd heartbeat interval = 5
osd memory target = 12884901888
osd op num threads per shard hdd = 8
osd op num threads per shard ssd = 8
osd scrub begin hour = 21
osd scrub end hour = 6
osd scrub max interval = 604800
osd scrub min interval = 259200
sh-4.4# 
sh-4.4# 
sh-4.4# exit
exit
[🎩︎]mrajanna@fedora ceph-csi $]kubectl get cm ceph-cong ^C
[🎩︎]mrajanna@fedora ceph-csi $]kubectl get cm -nceph-csi-cephfs
NAME               DATA   AGE
ceph-config        2      2m17s
ceph-csi-config    1      2m17s
kube-root-ca.crt   1      2m29s
[🎩︎]mrajanna@fedora ceph-csi $]kubectl get cm ceph-config -nceph-csi-cephfs -oyaml
apiVersion: v1
data:
  ceph.conf: |
    [client.rgw.ceprgw01]
    rgw dns name = objects.dev.xx
    rgw num rados handles = 1
    rgw realm = xx
    rgw thread pool size = 2048
    rgw zone = moi
    rgw zonegroup = poc

    [client.rgw.ceprgw02]
    rgw dns name = objects.dev.xx
    rgw num rados handles = 1
    rgw realm = xx
    rgw thread pool size = 2048
    rgw zone = moi
    rgw zonegroup = poc
    # Please do not change this file directly since it is managed by Ansible and will be overwritten
    [global]
    cluster network = xx.xx.xx.xx/28,yy.yy.yy.yy/28
    fsid = 518b8dff-0cfe-4758-8aea-8ccf523bb89f
    mon host = [v2:1xx.xx.xx.xx:3300,v1:xx.xx.xx.xx:6789],[v2:xx.xx.xx.yy:3300,v1:xx.xx.xx.yy:6789],[v2:xx.xx.xx.zz:3300,v1:xx.xx.xx.zz:6789]
    mon initial members = cemont01,cemont02,cemont03
    mon max pg per osd = 400
    mon osd down out interval = 28800
    mon pg warn max object skew = -1
    osd pool default crush rule = -1
    osd pool default min size = 2
    osd pool default size = 3
    public network = xxx
    rgw dynamic resharding = False
    rocksdb cache size = 2147483648

    [mds]
    mds cache memory limit = 109951162778

    [mds.cepmds01]
    mds standby for rank = 0
    mds standby replay = True

    [mds.cepmds02]
    mds standby for rank = 0
    mds standby replay = True

    [osd]
    bluestore cache autotune = 0
    bluestore cache kv ratio = 0.2
    bluestore cache meta ratio = 0.8
    bluestore cache size ssd = 8G
    bluestore csum type = none
    bluestore extent map shard max size = 200
    bluestore extent map shard min size = 50
    bluestore extent map shard target size = 100
    bluestore min alloc size ssd = 4096
    bluestore rocksdb options = compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB
    ms async max op threads = 10
    ms async op threads = 6
    osd client watch timeout = 15
    osd deep scrub interval = 604800
    osd heartbeat grace = 20
    osd heartbeat interval = 5
    osd memory target = 12884901888
    osd op num threads per shard hdd = 8
    osd op num threads per shard ssd = 8
    osd scrub begin hour = 21
    osd scrub end hour = 6
    osd scrub max interval = 604800
    osd scrub min interval = 259200
  keyring: ""
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: ceph-csi-cephfs-1642755305
    meta.helm.sh/release-namespace: ceph-csi-cephfs
  creationTimestamp: "2022-01-21T08:55:05Z"
  labels:
    app: ceph-csi-cephfs
    app.kubernetes.io/managed-by: Helm
    chart: ceph-csi-cephfs-3-canary
    component: nodeplugin
    heritage: Helm
    release: ceph-csi-cephfs-1642755305
  name: ceph-config
  namespace: ceph-csi-cephfs
  resourceVersion: "525"
  uid: 07bf3299-07aa-4960-a0fa-1de1908c2562