Closed: AlexandreBrown closed this issue 1 year ago
Hi AlexandreBrown, does your OIDC provider have the alpha.eksctl.io/cluster-name tag, and is there anything interesting in the EFS CSI driver logs or in CloudTrail (related to EFS)? We also recommend following the manual steps for Terraform.
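One way to check for that tag programmatically is sketched below. It assumes boto3 credentials with IAM read access and the standard `aws` partition. The helper names (`oidc_provider_arn`, `has_cluster_tag`, `fetch_provider_tags`) are hypothetical and not part of the repo; the only AWS call used is IAM's `list_open_id_connect_provider_tags`.

```python
from typing import Dict, List

CLUSTER_TAG_KEY = "alpha.eksctl.io/cluster-name"


def oidc_provider_arn(account_id: str, issuer_url: str) -> str:
    """Build the IAM OIDC provider ARN from the cluster's OIDC issuer URL.

    The issuer looks like https://oidc.eks.<region>.amazonaws.com/id/<ID>;
    the ARN reuses the same host and path without the scheme.
    Assumes the standard "aws" partition.
    """
    host_and_path = issuer_url.split("://", 1)[-1]
    return f"arn:aws:iam::{account_id}:oidc-provider/{host_and_path}"


def has_cluster_tag(tags: List[Dict[str, str]], cluster_name: str) -> bool:
    """Return True if the tag list carries the eksctl cluster-name tag."""
    return any(
        tag.get("Key") == CLUSTER_TAG_KEY and tag.get("Value") == cluster_name
        for tag in tags
    )


def fetch_provider_tags(provider_arn: str) -> List[Dict[str, str]]:
    """Fetch the first page of tags attached to the OIDC provider."""
    import boto3  # imported lazily so the pure helpers work without AWS access

    iam = boto3.client("iam")
    response = iam.list_open_id_connect_provider_tags(
        OpenIDConnectProviderArn=provider_arn
    )
    return response.get("Tags", [])
```

With real values, `has_cluster_tag(fetch_provider_tags(arn), cluster_name)` would report whether the tag is present. Providers with many tags may need pagination, which this sketch skips.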
@ryansteakley My OIDC provider (created by the Terraform deployment, I suppose) has the following tag:
I added the tag:
But it did not change anything:
Events:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  WaitForFirstConsumer  41s                persistentvolume-controller  waiting for first consumer to be created before binding
  Normal  ExternalProvisioning  13s (x3 over 39s)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "efs.csi.aws.com" or manually created by system administrator
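The ExternalProvisioning message names the provisioner the claim is waiting on, which suggests whose controller logs to inspect. A minimal sketch for pulling that name out of an event message follows; the function name `waiting_on_provisioner` is hypothetical, and the pattern assumes the message wording shown above.

```python
import re
from typing import Optional

# Matches the quoted provisioner name in an ExternalProvisioning event message.
_PROVISIONER_RE = re.compile(r'external provisioner "([^"]+)"')


def waiting_on_provisioner(event_message: str) -> Optional[str]:
    """Return the provisioner named in the message, or None if none is named."""
    match = _PROVISIONER_RE.search(event_message)
    return match.group(1) if match else None
```

For the events above, the second message yields `efs.csi.aws.com`, so the EFS CSI controller is the component that has not yet created the volume.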
> We also recommend following the manual steps for terraform.
I modified the auto script to keep only the parts that create the file system (the steps that match the manual steps).
I'm not sure why that would not work.
import argparse
import boto3
import subprocess
import string
import random
import yaml
from shutil import which
from time import sleep


def main():
    header()
    verify_prerequisites()
    setup_efs_file_system()
    setup_efs_provisioning()
    footer()

...
@ryansteakley As I understand it, the doc says we have to skip all of step 1 (both 1.1 and 1.2), since the text is placed under step 1.
Is this correct, or did it mean to say only skip 1.1?
@ryansteakley After further testing, it looks like the only way I could get EFS to work was to use the auto script (no manual steps and no skipping of the CSI driver install).
Maybe the driver installed by Terraform is not being used or detected? It works with the auto script (untouched from the repo), but it does not work when I do all the steps except the driver install.
The following worked (snippet of my Dockerfile):
RUN OIDC_ID=$(aws eks describe-cluster --name $CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | cut -d "/" -f5) \
    && AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text) \
    && aws iam tag-open-id-connect-provider \
        --open-id-connect-provider-arn "arn:aws:iam::$AWS_ACCOUNT_ID:oidc-provider/oidc.eks.$CLUSTER_REGION.amazonaws.com/id/$OIDC_ID" \
        --tags Key="alpha.eksctl.io/cluster-name",Value="${CLUSTER_NAME}" \
    && python utils/auto-efs-setup.py \
        --region $CLUSTER_REGION \
        --cluster $CLUSTER_NAME \
        --efs_file_system_name $EFS_FILE_SYSTEM_NAME \
        --efs_security_group_name $EFS_SECURITY_GROUP_NAME \
        --efs_throughput_mode elastic \
    && kubectl patch storageclass gp2 -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}' \
    && kubectl patch storageclass efs-sc -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
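The two `kubectl patch` calls at the end switch the default storage class from gp2 to efs-sc. A rough Python equivalent is sketched below, assuming `kubectl` is on the PATH and already pointed at the cluster. The function names are hypothetical and not taken from the auto script.

```python
import json
import subprocess
from typing import List

DEFAULT_CLASS_ANNOTATION = "storageclass.kubernetes.io/is-default-class"


def default_class_patch(is_default: bool) -> str:
    """Build the JSON patch body that sets or clears the default-class annotation."""
    value = "true" if is_default else "false"
    return json.dumps({"metadata": {"annotations": {DEFAULT_CLASS_ANNOTATION: value}}})


def patch_command(storage_class: str, is_default: bool) -> List[str]:
    """Build the kubectl argument list for patching one storage class."""
    return [
        "kubectl", "patch", "storageclass", storage_class,
        "-p", default_class_patch(is_default),
    ]


def switch_default(old: str, new: str) -> None:
    """Clear the default flag on `old` first, then set it on `new`."""
    for name, flag in ((old, False), (new, True)):
        subprocess.run(patch_command(name, flag), check=True)
```

Calling `switch_default("gp2", "efs-sc")` would mirror the last two lines of the Dockerfile snippet. Clearing the old default first avoids a moment where two classes are both marked as default.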
Describe the bug
If we follow the doc and use the manual steps (or, in my case, a modified auto EFS script that only creates the file system and the storage class), EFS creation succeeds, but when creating a test notebook the volume stays in the Pending state forever.
Steps To Reproduce
Deploy EFS using the auto setup script trimmed to the equivalent of the manual steps for the Terraform deployment:
Environment
1.25
1.25
v1.7.0
v1.7.0-aws-b1.0.0
cognito-rds-s3
Screenshots