Closed. JoeTice closed this issue 1 year ago.
An EFS (Elastic File System) Hosted Persistent Volume refers to a network-attached storage (NAS) solution provided by Amazon Web Services (AWS) for use with containerized workloads running on their Kubernetes-based platform, Amazon Elastic Kubernetes Service (EKS). A persistent volume is a storage resource that can outlive the life of a container or pod, providing a way to retain and share data between different containers, even if they are running on separate instances.
EFS is a managed file storage service that can automatically scale up and down according to the needs of the applications using it, providing high performance and durability. In the context of Kubernetes, an EFS Hosted Persistent Volume can be utilized as a backing store for Kubernetes Persistent Volumes (PV) and Persistent Volume Claims (PVC), enabling the sharing of data between multiple pods or containers across different nodes in a Kubernetes cluster.
To set up an EFS Hosted Persistent Volume, you would typically follow these steps:
1. Create an Amazon EFS file system in your AWS account.
2. Configure the necessary IAM roles and security groups to allow the EKS worker nodes to access the EFS file system.
3. Create a Kubernetes StorageClass that uses the EFS CSI (Container Storage Interface) driver, which enables EFS integration with Kubernetes.
4. Create a Kubernetes Persistent Volume (PV) that references the EFS file system and the StorageClass you created in step 3.
5. Create a Kubernetes Persistent Volume Claim (PVC) that binds to the Persistent Volume created in step 4.
6. Finally, deploy your application with a pod specification that includes the PVC as a mounted volume.

EFS Hosted Persistent Volumes provide a flexible and scalable storage solution for containerized workloads on AWS, making them an attractive choice for various use cases, such as content management systems, data analytics, and machine learning applications.
https://console.amazonaws-us-gov.com/efs/home?region=us-gov-west-1#/get-started
Task:
I tried creating an EFS but got the following error:
User: arn:aws-us-gov:iam::008577686731:user/Holden.Hinkle is not authorized to perform: elasticfilesystem:TagResource on the specified resource.
I filed a support request - https://dsva.slack.com/archives/CBU0KDSB1/p1682708370672949
Install the EFS CSI driver:

```shell
kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.0"
```

URL: https://github.com/kubernetes-sigs/aws-efs-csi-driver/tree/master/deploy/kubernetes/overlays/stable
Create a file named `efs-sc.yaml` with the following content:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
```
Related example: https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/master/examples/kubernetes/volume_path/specs/example.yaml
Apply the StorageClass:

```shell
kubectl apply -f efs-sc.yaml
```

`kubectl apply` command docs - https://jamesdefabia.github.io/docs/user-guide/kubectl/kubectl_apply/
Create a file named `efs-pvc.yaml` with the following content:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
```

(Note: `volumeHandle: <your_efs_file_system_id>::/` is not a valid PersistentVolumeClaim field; with the EFS CSI driver it belongs in the `csi` block of a statically provisioned PersistentVolume spec.)
Apply the PVC:

```shell
kubectl apply -f efs-pvc.yaml
```
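For reference, in the static-provisioning flow the EFS file system ID goes on a PersistentVolume rather than the PVC. A sketch (the placeholder ID is from the original text, and the exact `volumeHandle` format should be checked against the EFS CSI driver docs):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: <your_efs_file_system_id>
```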
Task:
a. Add a volume entry in the `spec.template.spec.volumes` section, referencing the previously created PVC:

```yaml
volumes:
  - name: efs-volume
    persistentVolumeClaim:
      claimName: efs-pvc
```
b. Mount the volume to your application container by adding a `volumeMounts` entry in the `spec.template.spec.containers` section:

```yaml
volumeMounts:
  - name: efs-volume
    mountPath: /efs
```
Here's the updated Dockerfile
- https://github.com/department-of-veterans-affairs/vets-website/blob/main/src/platform/utilities/preview-environment/Dockerfile
```dockerfile
FROM public.ecr.aws/bitnami/node:14.15.5

# Install NFS utilities for mounting EFS volumes
RUN apt-get update && apt-get install -y nfs-common

RUN mkdir -p /website
WORKDIR /website

# Create a directory for the EFS volume mount
RUN mkdir -p /efs

# Clone vagov-content
RUN git clone --depth 1 https://github.com/department-of-veterans-affairs/vagov-content

# Clone vets-json-schema
RUN git clone --depth 1 https://github.com/department-of-veterans-affairs/vets-json-schema.git

# Clone veteran-facing-services-tools
RUN git clone --depth 1 https://github.com/department-of-veterans-affairs/veteran-facing-services-tools

# Clone content-build
RUN git clone --depth 1 https://github.com/department-of-veterans-affairs/content-build.git

# Set up a working directory for vets-website
RUN mkdir -p /website/vets-website
WORKDIR /website/vets-website

# Copy vets-website files into the Docker image
COPY . .

# Copy the startup script into place
COPY src/platform/utilities/preview-environment/start.sh .
RUN chmod +x start.sh

# Expose ports
EXPOSE 3001
EXPOSE 3002

ARG AWS_URL
ENV AWS_URL $AWS_URL

# Execute the startup script on container start.
# Note: exec-form CMD does not expand environment variables, so "$AWS_URL" is
# passed to start.sh as a literal string; start.sh should read AWS_URL from the
# environment (set via ENV above) rather than relying on this argument.
ENTRYPOINT ["./start.sh"]
CMD ["$AWS_URL"]
```
Task:
If I'm understanding this question correctly, we can just create a GitHub Action that would run every time the main branch of the content build repo is updated, run the build process, and then store the build in an EFS instance using whatever file directory or filename naming convention we decide on.
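A minimal sketch of such a workflow, assuming a self-hosted runner inside the VPC with the EFS file system mounted at /mnt/efs (the mount path, build commands, and per-commit directory convention are all placeholders, not decided details):

```yaml
name: build-content-to-efs
on:
  push:
    branches: [main]
jobs:
  build-and-store:
    # A self-hosted runner in the VPC so the EFS mount targets are reachable
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v3
      - name: Run the content build
        run: yarn install && yarn build
      - name: Copy build output into EFS under a per-commit directory
        run: |
          mkdir -p /mnt/efs/content-build/${GITHUB_SHA}
          cp -r build/* /mnt/efs/content-build/${GITHUB_SHA}/
```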
Task:
To mount a specific subdirectory from an EFS volume containing different versions of the content-build, you can modify the Kubernetes deployment YAML file and use the subPath field in the volumeMounts section. Here's a step-by-step process:
Ensure that your EFS volume contains the different versions of the content-build in separate subdirectories, for example `/efs/v1`, `/efs/v2`, etc.
Update the Kubernetes deployment YAML file for your application:
a. Add a volume entry in the `spec.template.spec.volumes` section, referencing the previously created PVC:

```yaml
volumes:
  - name: efs-volume
    persistentVolumeClaim:
      claimName: efs-pvc
```
b. Mount the desired subdirectory from the volume to your application container by adding a `volumeMounts` entry in the `spec.template.spec.containers` section. Use the `subPath` field to specify the subdirectory that contains the desired version of the content-build:

```yaml
volumeMounts:
  - name: efs-volume
    mountPath: /content-build
    subPath: v1
```
Replace `v1` in `subPath: v1` with the desired version's subdirectory name.
Modify your application code, configuration files, or environment variables to reference the `/content-build` directory instead of the previous `content-build` directory. This way, your application will use the desired version of the content-build from the mounted EFS subdirectory.
If you need to switch between different versions of the content-build at runtime, you can use environment variables or ConfigMaps in your Kubernetes deployment to dynamically set the `subPath` value. This would involve using a template engine or pre-processing the deployment YAML file before applying it.
Please note that this approach assumes you have a separate Kubernetes deployment for each desired version of the content-build. Alternatively, you could use a single deployment with multiple replicas, where each replica mounts a different version of the content-build. However, this might require additional logic in your application to manage and route traffic between the different versions.
By using the subPath field in your Kubernetes deployment, you can mount specific subdirectories from your EFS volume and effectively switch between different versions of the content-build.
Elaborating on Step 4:
In step 4, I mentioned using environment variables or ConfigMaps to dynamically set the `subPath` value in your Kubernetes deployment. This would allow you to easily switch between different versions of the content-build at runtime. I'll provide two examples: one using environment variables and the other using ConfigMaps.
Using environment variables
In your Kubernetes deployment YAML file, define an environment variable for your application container that holds the desired content-build version:
```yaml
env:
  - name: CONTENT_BUILD_VERSION
    value: "v1"
```
Use a template engine like Kustomize or Helm to process your deployment YAML file before applying it. These tools allow you to replace variables in the YAML file with actual values during deployment.
In your deployment YAML file, replace the subPath field value with a placeholder that represents the environment variable:
```yaml
subPath: ${CONTENT_BUILD_VERSION}
```
When deploying your application, use the template engine to replace the placeholder with the actual value of the environment variable. This way, you can easily switch between content-build versions by changing the environment variable value and re-deploying your application.
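Worth noting as an alternative: on Kubernetes 1.17 and later, the built-in `subPathExpr` field expands a container environment variable at mount time, which avoids the template pre-processing step entirely. A sketch using the same environment variable name as above:

```yaml
env:
  - name: CONTENT_BUILD_VERSION
    value: "v1"
volumeMounts:
  - name: efs-volume
    mountPath: /content-build
    subPathExpr: $(CONTENT_BUILD_VERSION)
```

Note that `subPath` and `subPathExpr` are mutually exclusive on a single volume mount.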
Using ConfigMaps
Create a ConfigMap that contains the desired content-build version as a key-value pair:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: content-build-version
data:
  version: "v1"
```
In your Kubernetes deployment YAML file, reference the ConfigMap value as an environment variable for your application container:
```yaml
env:
  - name: CONTENT_BUILD_VERSION
    valueFrom:
      configMapKeyRef:
        name: content-build-version
        key: version
```
Similar to the environment variable approach, use a template engine to process your deployment YAML file and replace the subPath field value with a placeholder that represents the environment variable:
```yaml
subPath: ${CONTENT_BUILD_VERSION}
```
When deploying your application, use the template engine to replace the placeholder with the actual value from the ConfigMap. To switch between content-build versions, simply update the ConfigMap value and re-deploy your application.
Both of these methods allow you to dynamically set the subPath value in your Kubernetes deployment to mount specific subdirectories from your EFS volume containing different versions of the content-build. By changing the environment variable or ConfigMap value, you can switch between versions without modifying the deployment YAML file directly.
Task:
We can create a GitHub Action that would run once a day and delete directories/files that are older than, say, 30 days.
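A hedged sketch of what that scheduled Action could look like (the runner placement, the /mnt/efs mount path, and the directory layout are assumptions):

```yaml
name: efs-cleanup
on:
  schedule:
    - cron: "0 0 * * *"   # daily at midnight UTC
jobs:
  cleanup:
    # A self-hosted runner with the EFS volume mounted at /mnt/efs (assumption)
    runs-on: self-hosted
    steps:
      - name: Delete files older than 30 days and prune empty directories
        run: |
          find /mnt/efs/content-build -type f -mtime +30 -exec rm -f {} \;
          find /mnt/efs/content-build -mindepth 1 -type d -empty -delete
```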
A couple of other ways to do this:
Amazon EFS does not provide built-in functionality to automatically delete files or directories based on their age. To achieve this, you'll need to implement a custom solution, such as running a cron job on an EC2 instance or using a scheduled AWS Lambda function to periodically clean up old files and directories.
Using a cron job on an EC2 instance

Create a cleanup script, e.g. `cleanup.sh`:

```shell
#!/bin/bash
find /path/to/efs-mount-point -type f -mtime +30 -exec rm -f {} \;
find /path/to/efs-mount-point -type d -empty -delete
```
Replace `/path/to/efs-mount-point` with the actual mount point of your EFS volume on the EC2 instance.
Make the script executable: `chmod +x cleanup.sh`
Schedule the script to run periodically (e.g., daily) using cron:

```shell
crontab -e
```

Add the following line to the cron table:

```
0 0 * * * /path/to/cleanup.sh
```
This will execute the script every day at midnight.
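To sanity-check the `find` expressions before pointing them at real data, here's a self-contained demo against a throwaway directory (assumes GNU `find` and `touch`, as on Amazon Linux; the paths are purely illustrative):

```shell
#!/bin/bash
# Demo of the age-based cleanup logic against a temporary directory.
demo=$(mktemp -d)
mkdir -p "$demo/old_dir" "$demo/new_dir"
touch -d "40 days ago" "$demo/old_dir/stale.json"   # backdated past the 30-day cutoff
touch "$demo/new_dir/fresh.json"                    # modified just now

# Same expressions as cleanup.sh: delete files older than 30 days,
# then prune directories left empty.
find "$demo" -type f -mtime +30 -exec rm -f {} \;
find "$demo" -mindepth 1 -type d -empty -delete

ls "$demo"   # only new_dir remains
```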
Using a scheduled AWS Lambda function
Create a Lambda function using a runtime like Python or Node.js.
Use the AWS SDK to interact with your EFS file system. You'll need to configure your Lambda function to access the EFS file system by creating an EFS access point and mounting it to the Lambda function.
Implement a script in your Lambda function that scans the EFS file system, identifies files and directories older than 30 days, and deletes them.
Create an Amazon EventBridge (formerly CloudWatch Events) rule to trigger your Lambda function periodically (e.g., daily).
Either of these methods can help you automatically delete files and directories older than 30 days in your EFS file system. The choice depends on your preference, operational requirements, and the environment in which your EFS is being used.
You can create your own directories and store JSON blobs or other files within those directories when using Amazon EFS. EFS behaves like a network file system (NFS) that can be mounted to multiple instances or containers simultaneously, allowing you to read and write files as you would on any other file system.
To create directories and store JSON blobs in each directory, follow these steps:
Mount the EFS volume to your instance or container, as described in the previous answers.
Once the EFS volume is mounted, you can use standard file system commands and programming libraries to create directories and to read, write, and delete files. For example, if you mounted the EFS volume at `/efs`, you could create a directory called `my_directory` and store a JSON blob in it as follows:

```shell
mkdir /efs/my_directory
echo '{ "key": "value" }' > /efs/my_directory/my_blob.json
```
You can also interact with the EFS volume programmatically using your preferred programming language's file I/O libraries, such as Node.js's `fs` module.
The EFS volume is shared across all instances or containers that have it mounted, so changes made to the files and directories will be visible to all of them. This can be beneficial for sharing configurations or data across your application, but you should also be aware of potential concurrency issues when multiple processes are accessing and modifying the same files simultaneously.
Thanks for looking into all these aspects. I haven't digested all the analysis yet, but I think this goes a long way to helping us understand the capabilities/limitations and in being able to describe the implementation details for each aspect of what we need. Thanks!
Description
The objective of this discovery phase is to research and analyze the requirements and technical aspects needed to create an EFS hosted persistent volume for preview environments. This will involve understanding the process of implementing an EFS volume in EKS, identifying the changes required in the Docker image and vets-website configuration, and evaluating the benefits of utilizing a persistent volume, such as reduced startup time and supporting tailored static content in preview environments.
Tasks
- `vagov-staging` VPC where preview environments live.
- `content-build` when they become too stale

Acceptance Criteria
- `content-build` when they are too stale has been explored and documented.