dell / csm

Dell Container Storage Modules (CSM)
Apache License 2.0

[BUG][cert-csi] : externalAccess doesn't take effect on nfs ephemeral export #945

Closed vincent1chen closed 9 months ago

vincent1chen commented 1 year ago

Bug Description

The NFS Ephemeral Inline Volume test fails. With the resource file below, the pod always stays stuck in ContainerCreating status.

kubectl get pods

NAME                       READY   STATUS              RESTARTS   AGE
powerstore-inline-volume   0/1     ContainerCreating   0          55s

Here is the pod event log:

Events:
  Type     Reason       Age                  From               Message
  Normal   Scheduled    9m25s                default-scheduler  Successfully assigned default/powerstore-inline-volume to worker-0
  Warning  FailedMount  61s (x12 over 9m19s) kubelet            MountVolume.SetUp failed for volume "volume" : rpc error: code = Internal desc = inline ephemeral node stage failed
  Warning  FailedMount  32s (x4 over 7m23s)  kubelet            Unable to attach or mount volumes: unmounted volumes=[volume], unattached volumes=[volume kube-api-access-rknn7]: timed out waiting for the condition
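When triaging events like these, it can help to pull the gRPC status code and description out of the kubelet message. The following is a minimal sketch, assuming the message keeps the `rpc error: code = ... desc = ...` form shown in the events above; the function name `parse_rpc_error` is made up for illustration and is not part of any Dell or Kubernetes tooling.

```python
import re

# Matches the "rpc error: code = <Code> desc = <text>" form seen in the events above.
RPC_ERROR = re.compile(r"rpc error: code = (?P<code>\w+) desc = (?P<desc>.+)$")


def parse_rpc_error(message):
    """Return (code, desc) from a kubelet event message, or None if no gRPC status is present."""
    match = RPC_ERROR.search(message)
    if match is None:
        return None
    return match.group("code"), match.group("desc").strip()


message = (
    'MountVolume.SetUp failed for volume "volume" : rpc error: '
    "code = Internal desc = inline ephemeral node stage failed"
)
print(parse_rpc_error(message))
```

Here the code is `Internal` and the description is the driver's own text, which suggests looking at the node driver log rather than at the kubelet.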

On further checking, the issue is that the externalAccess entry is not added to the ephemeral NFS export.


The controller node environment includes "X_CSI_POWERSTORE_EXTERNAL_ACCESS".


simple_ephemeal_nfs.yaml:

kind: Pod
apiVersion: v1
metadata:
  name: powerstore-inline-volume
spec:
  containers:

I also created a normal NFS export and attached it to a pod. In that case the externalAccess entry is added to the export.


simple_nfs.yaml (attached as screenshots)

Logs

.

Screenshots

No response

Additional Environment Information

No response

Steps to Reproduce

execute kubectl create -f tests/simple/simple_ephemeal_nfs.yaml

Expected Behavior

The pod should be created successfully.

CSM Driver(s)

CSI powerstore 2.7

Installation Type

helm 3.0

Container Storage Modules Enabled

No response

Container Orchestrator

K8s 1.24.4

Operating System

debian 11

csmbot commented 1 year ago

@vincent1chen: Thank you for submitting this issue!

The issue is currently awaiting triage. Please make sure you have given us as much context as possible.

If the maintainers determine this is a relevant issue, they will remove the needs-triage label and assign an appropriate priority label.


We want your feedback! If you have any questions or suggestions regarding our contributing process/workflow, please reach out to us at container.storage.modules@dell.com.

vincent1chen commented 1 year ago

As requested, I downloaded the driver logs from both the controller and worker nodes. I also downloaded the logs from the provisioner.

Ephemeral NFS export created:
172.16.100.169:/ephemeral-csi-30ac1a9b17bf74e0a4cac2ddd48d7101dacbe3ad367746a5da5dd812c5f39514

controller2_driver.log controller_driver.log worker_driver.log

controller2_provisioner.log controller_provisioner.log

AkshaySainiDell commented 1 year ago

@vincent1chen Thanks for sharing the logs. It appears that there are two issues at stake here:

  1. External access feature not working with NFS Ephemeral
  2. Pod stuck in ContainerCreating with "inline ephemeral node stage failed" error message in Events

I could replicate issue 1 in my environment and I am currently working on finding the root cause. Issue 2, on the other hand, seems to be related to the environment, as the pod went into Running state in my environment. I'll review the logs and get back to you with more details.

vincent1chen commented 1 year ago

As for issue 2, I guess it is caused by issue 1. If I manually add the externalAccess subnet to the ephemeral NFS export, the pod goes into Running state.
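The link between the two issues can be sketched as a host access check. This is only an illustrative model, assuming the array admits an NFS client when its source address matches a host or subnet on the export, and assuming the worker reaches the array from an address outside that list (for example behind NAT). The addresses `10.0.0.5` and `10.0.0.0/24` and the function `client_allowed` are hypothetical; this is not the driver's or the array's actual code.

```python
import ipaddress


def client_allowed(client_ip, export_hosts):
    """Return True if client_ip matches any host or subnet in the export's access list."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(h, strict=False) for h in export_hosts)


# Hypothetical values, for illustration only.
node_ip = "192.168.20.81"        # worker-0 address reported in this thread
seen_by_array = "10.0.0.5"       # hypothetical source address the array sees
external_access = "10.0.0.0/24"  # hypothetical externalAccess subnet

ephemeral_hosts = [node_ip]                 # export without the externalAccess entry
normal_hosts = [node_ip, external_access]   # export with the externalAccess entry

print(client_allowed(seen_by_array, ephemeral_hosts))
print(client_allowed(seen_by_array, normal_hosts))
```

Under these assumptions the first check fails and the second passes, which matches the observation that the mount only succeeds once the subnet is added to the ephemeral export.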


kubectl describe pod powerstore-inline-volume

Name:             powerstore-inline-volume
Namespace:        default
Priority:         0
Service Account:  default
Node:             worker-0/192.168.20.81
Start Time:       Wed, 16 Aug 2023 10:51:30 -0400
Labels:
Annotations:      cni.projectcalico.org/containerID: fd4bf9adf276aba3edc7f66fec6cbd3fb60a9253d72f96cceb0ccf626b9dca45
                  cni.projectcalico.org/podIP: 172.16.43.31/32
                  cni.projectcalico.org/podIPs: 172.16.43.31/32
                  k8s.v1.cni.cncf.io/network-status: [{ "name": "chain", "ips": [ "172.16.43.31" ], "default": true, "dns": {} }]
                  k8s.v1.cni.cncf.io/networks-status: [{ "name": "chain", "ips": [ "172.16.43.31" ], "default": true, "dns": {} }]
Status:           Running
IP:               172.16.43.31
IPs:
  IP:  172.16.43.31
Containers:
  test-container:
    Container ID:   containerd://dc5d03ea175cf9fc9bf34f82f97933b50889338e6ab796bc93e75064f82db99c
    Image:          registry.wr.lab/centos/centos:latest
    Image ID:       @.**@.:dbbacecc49b088458781c16f3775f2a2ec7521079034a7ba499c8b0bb7f86875>
    Port:
    Host Port:
    Command:
      sleep
      3600
    State:          Running
      Started:      Wed, 16 Aug 2023 11:28:30 -0400
    Ready:          True
    Restart Count:  0
    Environment:
    Mounts:
      /data from volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2nb2t (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  volume:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            csi-powerstore.dellemc.com
    FSType:            nfs
    ReadOnly:          false
    VolumeAttributes:  nasName=WRCSI
                       nfsAcls=0777
                       size=40Gi
  kube-api-access-2nb2t:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:        BestEffort
Node-Selectors:
Tolerations:      node.kubernetes.io/not-ready:NoExecute op=Exists for 30s
                  node.kubernetes.io/unreachable:NoExecute op=Exists for 30s
Events:
  Type     Reason          Age                   From               Message
  Normal   Scheduled       38m                   default-scheduler  Successfully assigned default/powerstore-inline-volume to worker-0
  Warning  FailedMount     31m (x2 over 34m)     kubelet            Unable to attach or mount volumes: unmounted volumes=[volume], unattached volumes=[kube-api-access-2nb2t volume]: timed out waiting for the condition
  Warning  FailedMount     7m31s (x23 over 38m)  kubelet            MountVolume.SetUp failed for volume "volume" : rpc error: code = Internal desc = inline ephemeral node stage failed
  Warning  FailedMount     2m23s (x14 over 36m)  kubelet            Unable to attach or mount volumes: unmounted volumes=[volume], unattached volumes=[volume kube-api-access-2nb2t]: timed out waiting for the condition
  Normal   AddedInterface  84s                   multus             Add eth0 [172.16.43.31/32] from chain
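For anyone trying to reproduce this, the describe output above is enough to approximate the pod spec. The sketch below rebuilds a manifest from the reported fields only (container name, image, command, mount path, CSI driver, fsType and volume attributes). It is a reconstruction and not the contents of the test file from the repository, which may contain fields that do not appear in the describe output.

```python
import json

# Reconstructed from the `kubectl describe pod` output above; not the original test file.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "powerstore-inline-volume"},
    "spec": {
        "containers": [
            {
                "name": "test-container",
                "image": "registry.wr.lab/centos/centos:latest",
                "command": ["sleep", "3600"],
                "volumeMounts": [{"name": "volume", "mountPath": "/data"}],
            }
        ],
        "volumes": [
            {
                "name": "volume",
                # Inline ephemeral CSI volume: created and deleted together with the pod.
                "csi": {
                    "driver": "csi-powerstore.dellemc.com",
                    "fsType": "nfs",
                    "volumeAttributes": {
                        "nasName": "WRCSI",
                        "nfsAcls": "0777",
                        "size": "40Gi",
                    },
                },
            }
        ],
    },
}

# kubectl accepts JSON manifests as well as YAML.
print(json.dumps(pod, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl create -f`, after replacing the image and `nasName` with values valid for the target environment.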


vincent1chen commented 1 year ago

controller1_provisioner_0817.log node_driver_0817.log

@AkshaySainiDell here are the logs from creating a normal NFS export.

nfs created: 172.16.100.169:/wrcsivol-a85a7b9aaa [Uploading controller0_driver_0817.log…]()


controller1_provisioner_0817.log controller0_provisioner_0817.log

vincent1chen commented 1 year ago

controller1_driver_0817.log node_driver_0817.log

AkshaySainiDell commented 9 months ago

The External Access feature was not added to NFS ephemeral volumes because they follow the Pod's lifetime: they are created and deleted along with the Pod.