hanselblack opened 3 months ago
I am not able to reproduce this with basic mounting on Bottlerocket. Any more logs or information about your configuration would be helpful. I'm interested in how you are actually deploying this and the timing between events. Given that the mount does succeed and is functional, it seems like this could just be a timing issue if the PV is trying to mount while the driver is still coming up, but that is speculation.
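One way to check the timing hypothesis is to compare the timestamp of the mount warning against when the CSI node pod on that node became ready. This is a sketch; the `app=s3-csi-node` label and `s3-plugin` container name assume the EKS add-on's default install and may differ in your cluster:

```shell
# Events in the workload namespace, oldest first; look for the mount warning.
kubectl get events -n default --sort-by=.lastTimestamp | grep -i mount

# When did the driver's node pod on the affected node start / become ready?
# (label and container name are assumptions based on the default add-on install)
kubectl get pods -n kube-system -l app=s3-csi-node -o wide
kubectl logs -n kube-system -l app=s3-csi-node -c s3-plugin --tail=100
```

If the warning predates the driver pod's ready time on that node, that would point to the startup race described above.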
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: xxx-pv
  namespace: default
spec:
  capacity:
    storage: 1200Gi
  accessModes:
    - ReadWriteMany
  mountOptions:
    - allow-overwrite
    - region ap-southeast-1
    - max-threads 16
  csi:
    driver: s3.csi.aws.com
    volumeHandle: s3-csi-driver-volume-output
    volumeAttributes:
      bucketName: xxx
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: xxx-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1200Gi
  volumeName: xxx-pv
---
apiVersion: batch/v1
kind: Job
metadata:
  name: xxx-job
  namespace: default
spec:
  template:
    metadata:
      labels:
        app: xxx-job
    spec:
      nodeSelector:
        type: gpu
      containers:
        - name: xxx
          image: # AWS ECR image URI
          imagePullPolicy: Always
          command: ["/bin/sh", "-c"]
          args:
            - cp -r /tmp/mount/xxx /usr/src/app/;
          resources:
            limits:
              memory: 10000Mi
              nvidia.com/gpu: 1
            requests:
              memory: 10000Mi
              cpu: 4000m
              nvidia.com/gpu: 1
          volumeMounts:
            - name: persistent-storage-data
              mountPath: /tmp/mount
      volumes:
        - name: persistent-storage-data
          persistentVolumeClaim:
            claimName: xxx-pvc
```
The above is the manifest for the deployment.
The nodes are scaled up through Karpenter, using `spec.amiFamily: Bottlerocket`, and run with GPUs.
The driver is installed via the EKS add-on, and the kube-system namespace is on a Fargate profile.
Yeah, it could be a timing issue. Oddly, I didn't have this issue on AL2.
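For what it's worth, since reads do work despite the warning, a quick way to confirm the mount actually came up is to check it from inside the job's pod. `<pod-name>` below is a placeholder for the job's pod:

```shell
# Verify the FUSE mount is present and readable inside the container.
# <pod-name> is a placeholder; find it with: kubectl get pods -n default
kubectl exec -n default <pod-name> -- mount | grep /tmp/mount
kubectl exec -n default <pod-name> -- ls /tmp/mount
```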
/kind bug

**What happened?**

When using the Bottlerocket AMI with a Karpenter NodeClass, describing the pod shows the following warning in the events:

This error does not appear when using the AL2 AMI. However, even with the warning, I am still able to read data from the S3 mountpoint.

**What you expected to happen?**

No warning messages.

**How to reproduce it (as minimally and precisely as possible)?**

**Anything else we need to know?:**

**Environment**

- Kubernetes version (use `kubectl version`): v1.28