aws-samples / comfyui-on-eks

ComfyUI on AWS
MIT No Attribution

ComfyUI pod cannot mount PV error #12

Open lgb861213 opened 2 months ago

lgb861213 commented 2 months ago

When I deploy the ComfyUI deployment, the pod never reaches the Running state. The events log is as follows:

11m Warning FailedMount pod/comfyui-774b78d8c4-ss44v MountVolume.SetUp failed for volume "comfyui-outputs-pv" : rpc error: code = Internal desc = Could not mount "comfyui-outputs-112233445566-us-east-1" at "/var/lib/kubelet/pods/83046055-0ec9-40ec-8448-67afd9608fa0/volumes/kubernetes.io~csi/comfyui-outputs-pv/mount": Could not check if "/var/lib/kubelet/pods/83046055-0ec9-40ec-8448-67afd9608fa0/volumes/kubernetes.io~csi/comfyui-outputs-pv/mount" is a mount point: stat /var/lib/kubelet/pods/83046055-0ec9-40ec-8448-67afd9608fa0/volumes/kubernetes.io~csi/comfyui-outputs-pv/mount: no such file or directory, Failed to read /host/proc/mounts: open /host/proc/mounts: invalid argument

17m Warning FailedMount pod/comfyui-774b78d8c4-ss44v MountVolume.SetUp failed for volume "comfyui-inputs-pv" : rpc error: code = Internal desc = Could not mount "comfyui-inputs-112233445566-us-east-1" at "/var/lib/kubelet/pods/83046055-0ec9-40ec-8448-67afd9608fa0/volumes/kubernetes.io~csi/comfyui-inputs-pv/mount": Could not check if "/var/lib/kubelet/pods/83046055-0ec9-40ec-8448-67afd9608fa0/volumes/kubernetes.io~csi/comfyui-inputs-pv/mount" is a mount point: stat /var/lib/kubelet/pods/83046055-0ec9-40ec-8448-67afd9608fa0/volumes/kubernetes.io~csi/comfyui-inputs-pv/mount: no such file or directory, Failed to read /host/proc/mounts: open /host/proc/mounts: invalid argument

27m Normal Scheduled pod/comfyui-774b78d8c4-ss44v Successfully assigned default/comfyui-774b78d8c4-ss44v to ip-10-2-178-17.ec2.internal

27m Normal Nominated pod/comfyui-774b78d8c4-ss44v Pod should schedule on: nodeclaim/karpenter-nodepool-xlmf5

27m Normal DisruptionBlocked nodeclaim/karpenter-nodepool-ht9tg Cannot disrupt NodeClaim: Nominated for a pending pod

27m Normal DisruptionBlocked node/ip-10-2-178-17.ec2.internal Cannot disrupt Node: Nominated for a pending pod

27m Warning FailedDraining node/ip-10-2-166-48.ec2.internal Failed to drain node, 12 pods are waiting to be evicted

27m Normal DisruptionTerminating node/ip-10-2-166-48.ec2.internal Disrupting Node: Emptiness/Delete

27m Normal DisruptionTerminating nodeclaim/karpenter-nodepool-hbv2q Disrupting NodeClaim: Emptiness/Delete

26m Warning Failed pod/comfyui-774b78d8c4-n42vk Failed to pull image "112233445566.dkr.ecr.us-east-1.amazonaws.com/comfyui-images:latest": rpc error: code = Unavailable desc = error reading from server: EOF
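For anyone triaging a similar log, here is a minimal sketch of how the FailedMount warnings could be pulled out of a list of event lines and grouped by pod. It assumes each line follows the AGE TYPE REASON OBJECT MESSAGE column order seen in the events above; the function name `failed_mounts` and the shortened sample messages are hypothetical and only for illustration.

```python
import re
from collections import defaultdict

# Matches the volume name inside a kubelet "MountVolume.SetUp failed" message.
VOLUME_RE = re.compile(r'MountVolume\.SetUp failed for volume "([^"]+)"')


def failed_mounts(event_lines):
    """Group FailedMount warnings by object, returning {object: [volume, ...]}.

    Assumes each line is: AGE TYPE REASON OBJECT MESSAGE (hypothetical helper).
    """
    result = defaultdict(list)
    for line in event_lines:
        # Split off the first four columns; the message keeps its spaces.
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue
        _age, event_type, reason, obj, message = parts
        if event_type != "Warning" or reason != "FailedMount":
            continue
        match = VOLUME_RE.search(message)
        if match and match.group(1) not in result[obj]:
            result[obj].append(match.group(1))
    return dict(result)


# Sample lines taken from the events above, with the messages shortened.
sample = [
    '11m Warning FailedMount pod/comfyui-774b78d8c4-ss44v '
    'MountVolume.SetUp failed for volume "comfyui-outputs-pv" : rpc error: code = Internal',
    '17m Warning FailedMount pod/comfyui-774b78d8c4-ss44v '
    'MountVolume.SetUp failed for volume "comfyui-inputs-pv" : rpc error: code = Internal',
    '27m Normal Scheduled pod/comfyui-774b78d8c4-ss44v '
    'Successfully assigned default/comfyui-774b78d8c4-ss44v to ip-10-2-178-17.ec2.internal',
]

for pod, volumes in failed_mounts(sample).items():
    print(pod, volumes)
```

In this log both affected volumes belong to the same pod, which suggests the problem lies with the CSI driver on that node rather than with a single PV definition.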

I checked https://github.com/awslabs/mountpoint-s3-csi-driver/issues/174, which indicates that this is a bug in the S3 CSI add-on when it is used with Karpenter. After I deployed the k8s-device-plugin and restarted the S3 CSI driver, the problem was resolved.