aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECS] [request]: Fargate EFS volume mount working example in CDK #1090

Open sky4git opened 4 years ago

sky4git commented 4 years ago

Tell us about your request
Provide a working example of an EFS volume mount on Fargate platform 1.4.0 in which a container running as a non-root user can actually write data to EFS.

Which service(s) is this request for?
Fargate, ECS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
My problem is mostly similar to #863. I am running two containers in a task: Apache (alpine) and PHP-FPM. I am not using bind mounts/mount points in my task definition. The Dockerfile has a VOLUME instruction which maps to /usr/local/apache2/htdocs.

My task is able to mount the EFS volume successfully. The Apache container runs as the www-data user. Apache and PHP-FPM via proxy are working fine. When Apache wants to write or create a new file, it is able to do that. However, the EFS volume size never increases from the default 6 KiB. This tells me that all the new files are written to temporary disk space and not to EFS.

From the solution to issue #863, I thought that defining the VOLUME in the Dockerfile would actually make files get written to the EFS volume, and Apache would be able to read/write/execute files from there. Obviously, that is not happening.

I also thought that if my script created a new file, it would go to the EFS volume, as it is mounted successfully with an access point. That is not happening either.

I have gone through all the parts of this blog post, but it didn't work: https://aws.amazon.com/blogs/containers/developers-guide-to-using-amazon-efs-with-amazon-ecs-and-aws-fargate-part-1/

Not sure what I am missing.

So, can someone provide a working example of a simple Apache web server mount working with EFS + Fargate, in CDK?

Are you currently working around this issue? How are you currently solving this problem?
I have tried an EFS mount with an access point, with IAM authorization plus an access point, without an access point, with chowning /usr/local/apache2/htdocs to www-data:www-data, and without chowning it.

None of the above has worked.

Additional context Anything else we should know?
Security groups are fine: a temporary EC2 instance created with the same security group is able to mount EFS and write a file to it successfully. The task role's IAM permissions allow "elasticfilesystem:ClientWrite", "elasticfilesystem:ClientRootAccess", and "elasticfilesystem:ClientMount".

The EFS volume has created a /usr/local/apache2/htdocs directory, which I can confirm, but there are no files in it, as I mentioned.


mreferre commented 4 years ago

This is not a direct answer to your question, but I am wondering if you have seen this blog post and this blog post?

sky4git commented 4 years ago

I have followed those blog posts. My observations so far are:

Scenario 1: the VOLUME path is declared in the Dockerfile and a mount point is declared in the task definition for the same path. In this case:

Scenario 2: the VOLUME path is declared in the Dockerfile and a mount point is not declared in the task definition. In this case:

Expectations:

Files in the Dockerfile-declared VOLUME directory should come up in the EFS volume. Files created by the PHP script should also come up in the EFS volume. Files created in entrypoint.sh should appear in the EFS volume.

mreferre commented 4 years ago

Mh... I am slightly confused about why you have mount points declared in the Dockerfile. They are not required for ECS/Fargate to mount EFS volumes inside the tasks (also, I would not know what the ramifications of adding that volume in both places would be). Is there a specific reason why you are adding the mount in the Dockerfile? At a high level the flow should be (summary of my blog):

This should completely bypass the POSIX permissions and mask the user in the container with the custom UID/GID you configure the access point for.
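
In CDK terms, a minimal sketch of that flow might look like the following. To be clear, this is an untested illustration rather than a confirmed working example: it assumes the aws-cdk-lib v2 module layout, uid/gid 33 for the www-data user in the httpd image, and illustrative construct names throughout.

import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as efs from 'aws-cdk-lib/aws-efs';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'EfsFargateDemo');

const vpc = new ec2.Vpc(stack, 'Vpc', { maxAzs: 2 });
const cluster = new ecs.Cluster(stack, 'Cluster', { vpc });
const fileSystem = new efs.FileSystem(stack, 'Fs', { vpc });

// The access point masks whatever user the container runs as with a fixed
// POSIX uid/gid, so a non-root user (33 = www-data in httpd) can write.
const accessPoint = fileSystem.addAccessPoint('Ap', {
  path: '/htdocs',
  createAcl: { ownerUid: '33', ownerGid: '33', permissions: '755' },
  posixUser: { uid: '33', gid: '33' },
});

const taskDef = new ecs.FargateTaskDefinition(stack, 'TaskDef');
taskDef.addVolume({
  name: 'htdocs',
  efsVolumeConfiguration: {
    fileSystemId: fileSystem.fileSystemId,
    transitEncryption: 'ENABLED',
    authorizationConfig: { accessPointId: accessPoint.accessPointId, iam: 'ENABLED' },
  },
});

// Task-role permissions matching the IAM authorization enabled above.
fileSystem.grant(taskDef.taskRole,
  'elasticfilesystem:ClientMount', 'elasticfilesystem:ClientWrite');

const web = taskDef.addContainer('web', {
  image: ecs.ContainerImage.fromRegistry('httpd:2.4-alpine'),
  logging: ecs.LogDrivers.awsLogs({ streamPrefix: 'web' }),
});
web.addMountPoints({
  containerPath: '/usr/local/apache2/htdocs',
  sourceVolume: 'htdocs',
  readOnly: false,
});

const service = new ecs.FargateService(stack, 'Service', {
  cluster,
  taskDefinition: taskDef,
  platformVersion: ecs.FargatePlatformVersion.VERSION1_4, // EFS needs >= 1.4.0
});

// Open NFS (port 2049) from the service's tasks to the file system.
fileSystem.connections.allowDefaultPortFrom(service);

Note that this sketch avoids the VOLUME instruction entirely and relies only on the task-definition mount point; whether and how VOLUME interacts with the EFS mount is exactly the open question in this thread.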

sky4git commented 4 years ago

The mount point is declared in the task definition, not in the Dockerfile. The VOLUME is declared in the Dockerfile.

Everything else is done as you have said. EFS is mounted and accessible, as entrypoint.sh can create files in it.

The issue is that when we use a mount point it overwrites or hides the VOLUME directory content. To write to the EFS volume, files must be created from entrypoint.sh.

Again, the expectation is:

Files in the Dockerfile-declared VOLUME directory should come up in the EFS volume, either with a mount point declared in the task definition or without it. But this seems to be a limitation of Fargate platform version 1.4.0, while it was working on 1.3.0.

mreferre commented 4 years ago

Can I see your Dockerfile and your Task Definition file? You can either post them here or send them to me (mreferre at amazon dot com).

sky4git commented 4 years ago

@mreferre I have sent them to you by email.

YustinS commented 4 years ago

I can confirm I am running into the same issue, using the attached Dockerfile, and it has really stumped me for some time, since the Docker documentation makes it clear a mount should have the existing data copied into it when the container starts.

When spinning up locally, it correctly copies the data that was in the /app directory onto the created volume, both when explicitly passed the volume details and when it creates an anonymous volume. (For testing you can expose 8080 and go to localhost:8080/admin to see it loads as expected; plain localhost:8080 will 503 on you.) Dockerfile.basic.txt

I am using Terraform in my situation; however, hopefully I can illustrate the mount as well: terraform-definition.txt

The expectation would be that, similar to working on this locally, the data that was in the mounted directory should be copied into the EFS mount, but instead it clears everything and makes the application 404, as there is no longer any content for it to load. If that option isn't supported, it would be good to have it clarified in the documentation, as, as was mentioned, this goes against what Docker's documentation claims (https://docs.docker.com/engine/reference/builder/#volume).

(Also, yes, for what it is worth, this is a bit of a toy example and there are workarounds; it's just surprising the way EFS, Fargate and Docker are interacting.)

mreferre commented 4 years ago

Sorry for the delay here. We were discussing this internally. The Dockerfiles and IaC files from both @YustinS (attached below) and @sky4git (provided offline) have a common pattern, which I will try to describe below.

I won't go into all the tech details here, but essentially what's happening behind the scenes is that this setup will try to mount two separate volumes (one empty and initialized by the task, and one with content, initialized by the container inside the task) to the very same mount point /my/folder.

This is not a scenario we anticipated with the EFS/ECS integration (and I am frankly not sure it would make sense as a hydration mechanism for EFS). I am no WordPress guru but, speaking to people who are more experienced on that front, it looks like a proper pattern would be to decouple the core WP code, which could live with the container (and be lifecycled with the container), from the content of WP (e.g. wp-content), which could be mapped to the EFS volume. What remains to be determined is how to hydrate the EFS content for the first setup (probably a pipeline that includes populating the EFS volume with the wp-content content at WP setup time would be the best route?).

mreferre commented 4 years ago

If we can settle on what a proper architecture would look like for a scenario like this, we can then provide an example of how to set up that scenario.

YustinS commented 4 years ago

Thanks @mreferre.

I think you summarized it very well, and to a degree you are correct: if we can decouple the data that preexists from the directories that will be filled with data as the application runs, it simplifies things, which I have managed in a secondary situation using the EFS mount on Fargate. That is best practice anyway, but I'm sure we can all relate to how that ends up reflecting reality.

Unfortunately, as my example somewhat illustrates, some containers do rely on preexisting configuration that is mounted through into the volume (i.e. initial config files), and that is a harder thing to solve. The current workaround is either to create some logic so the Docker container populates the EFS volume if it is empty (which can introduce issues around startup order), or, as mentioned, to somehow create the EFS volume, populate it, and then spin up the Fargate instance.

Both of those options unfortunately add a lot of complexity, so for the time being it seems like EC2-based ECS tasks will be better suited: when mounting EFS through the EC2 instance and then using host mappings into ECS, the expected behaviour exists, i.e. populating the volume based on data that was inside the container image (and I can validate this; I have a whole Dockerized Jenkins environment that uses this behaviour).

UPDATE: So I think I finally tracked this back to why I was confused, and I'll mention it for those passing through. The term volume is very specific inside Docker; in particular, as mentioned above, including a VOLUME command inside a Dockerfile has a very specific outcome, essentially forcing that path to always have an external source holding the data (either a volume or a mount). If you don't specifically mount a volume to that VOLUME directory, an anonymous volume will be created and the data on that path will be copied into it (obviously not persistent, as it is kept local).

The problem seems to arise around the use of volumes at runtime. This is where some funny stuff comes up around binds and mounts, which is quite a fun rabbit hole to fall down, but I'll leave that for others to discover. Now, as per https://docs.docker.com/storage/#tips-for-using-bind-mounts-or-volumes, volumes, if they are used and empty, should have the data propagated into the newly created volume.

So it appears that my issue is a disconnect between the usage of the terms "EFS volume" and "Docker volume". For whatever reason, how the volume is being attached to Fargate is causing it not to trigger the empty-volume copy action, instead acting like it is non-empty or a bind mount, masking the directory that it is mounted to, which leads to the above discussion.

This is great for the more general case where the data is put into the EFS volume after the container initializes (such as the above-mentioned WordPress case), but doesn't work in situations where the container is expecting content at the VOLUME location (and, by virtue of that, copied to a volume) to be there for the container to run, such as the one I provided with Craft, which needs a number of the files pre-seeded, but also needs them persisted.

mreferre commented 4 years ago

This is where some funny stuff comes up around binds and mounts, which is quite a fun rabbit hole to fall down

I couldn't agree more :)

On a more serious note, we are discussing this issue internally.

mreferre commented 4 years ago

@YustinS one thing I am still trying to get my head around is how you decouple (in your current workflow):

This is great for the more general case where the data is put into the EFS volume after the container initializes (such as the above-mentioned WordPress case), but doesn't work in situations where the container is expecting content at the VOLUME location (and, by virtue of that, copied to a volume) to be there for the container to run, such as the one I provided with Craft, which needs a number of the files pre-seeded, but also needs them persisted.

How do you deal with the fact that 1) they need to be there before the container starts but 2) they need to persist (assuming across container re-deployments)?

Do you have some sort of if-then-else logic somewhere that says "if the bits are already there, do nothing; otherwise create them"? I am asking because this is EXACTLY the same approach I have used in my fake app in this blog post, where I do this:

CONFIG_FILE="/server/config.json"

# generate a unique random ID
RANDOM_ID=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)

# write the config only on first start; later starts find the file already on EFS
if [ ! -f "$CONFIG_FILE" ]; then echo "$RANDOM_ID" > "$CONFIG_FILE"; fi

Basically this means that when the application starts for the first time it checks on EFS (mounted on /server) whether the file exists. If it doesn't, it means this is the first deployment; if it exists, the app has already been initialized and the config can't be re-created.

This is an (admittedly stupid) example of something that needs to be decoupled from the container life-cycle (i.e. because you can have many of them, because they restart, because they get updated, etc.) and needs to be persisted (i.e. because if I lose my config files I have lost my (fake) application).

I am wondering how an approach that auto-hydrates this content at every container start (like the Docker volume feature we have discussed) would work if the content isn't ephemeral, needs to persist, and can't be re-created from scratch every time a container spins up.

The easiest way I thought this could be implemented was a setup workflow that includes a step to populate the EFS share "once" and then never runs again (there would then only be a workflow/pipeline that updates the code/containers, which are completely decoupled from this persistent/shared/durable content). It could perhaps be a CDK construct that does this? Or a step in a setup pipeline? TBD.

The other option would be to include the if-then-else logic I had in my fake app, so that only the first container ever to start would populate the volume, and that branch of the program would never run again once it has run the first time (but if you are doing this for everything that isn't a fake app like mine, it would be complex and very expensive for what it delivers).

All this really to ask: how would the Docker volume hydration feature work in a scenario like this? We may try to explore a way that would allow you to "sync your EFS volume on start with the content of your Docker volume", but that would end up running every time and not just "once" (as it should).

Thoughts?

YustinS commented 4 years ago

I'll be straight up: I've yet to figure it out, at least in a way that would suit programmatic deployments such as Terraform or the CDK. I've tried a few things, such as using a "seed" task to get the data onto EFS and then firing up the service, but none of them have really worked in a way that didn't end up essentially replicating images, or leave other issues such as ownership being misconfigured. I also tried some if/else logic, but that quickly spiraled into a tangled mess, and it also starts running down the path of having to fight race conditions, or tuning timeouts to get the needed data all hydrated and chowned correctly.

It's something I really want to keep poking at, as I feel like there has to be something to really work off of, but the fallback I can see (and have yet to validate) is to use EC2-based ECS with EFS mounted through as a full bind, and have the UserData do the hydration step for me (either spinning up the container and pulling the data directly, or using an S3 bucket to simply house it in the interim), since adding too much complexity to the container seems like defeating the purpose. It definitely wouldn't be pretty, but by virtue of the way UserData operates it at least wouldn't allow launching of tasks until it completed (in theory).

sky4git commented 4 years ago

I have tried configuring WordPress with Git and CodePipeline. The Apache alpine container in the Docker image has a copy of the WordPress Git repo in the /usr/local/apache2/htdocs folder. I also have a PHP-FPM container, which needs to know where the PHP files are so it can execute them.

I've tried the following:

Scenario 1:

  1. Docker image built with the WordPress code in the /usr/local/apache2/htdocs folder
  2. EFS volume with an access point on the "/uploads" directory path
  3. Container mountPoint on the "/usr/local/apache2/htdocs/wp-content/uploads" folder

Expectation:

  1. The uploads directory will hold the media files uploaded in WordPress.

Result:

  1. The uploads directory never becomes writable for WordPress to upload its documents, even if the access point permission is "777" (see the CDK sketch after the recommendations below).

Scenario 2:

  1. Docker image built with the WordPress code in the /usr/local/apache2/htdocs folder
  2. EFS volume with an access point on the "/usr/local/apache2/htdocs/wp-content/uploads" directory path
  3. Container mountPoint on the "/usr/local/apache2/htdocs/wp-content/uploads" folder

Expectation:

  1. The exact path might work as the access point path.

Result:

  1. An access point path can't go more than 4 levels down the directory tree; it is not allowed (a limitation).

Scenario 3: (still trying)

  1. Docker image built with just Apache
  2. The EFS volume has to be populated either from CodeBuild, OR
  3. the container start script checks whether it is running the latest committed code; if not,
  4. it replaces the code on the EFS volume with the new code, based on checking the Git commit id

Result:

  1. The container script takes very long to replace the code and never finished the whole process. I do not believe this implementation can ever work reliably, so it cannot be used in a production environment.
  2. CodeBuild only works with EFS if it is mounted through a private subnet, while in my scenario both the container and EFS are on public subnets.

Recommendations:

The EFS volume just needs to work like a Docker volume, as it did on Fargate platform version 1.3.0.

  1. Customers should not need to define the access point in their CloudFormation template when using Fargate 1.4.0; it should be taken care of by the Fargate platform. Then it would not create permission issues while working with the EFS volume.
  2. Let the VOLUME declaration supersede the mount point declaration. I think in most cases, when a user declares a VOLUME, they expect that content to be available on the VOLUME path whether or not a mount point is declared.
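
Coming back to Scenario 1 above, a hedged CDK sketch of what that mount is attempting (untested; it assumes a fileSystem, taskDef and web container wired up as in the earlier sketch, and again uid/gid 33 for www-data; one known gotcha is that createAcl only applies if the access point path does not yet exist, so a directory created earlier with the wrong ownership keeps it, as a later comment in this thread also found):

// Access point rooted at /uploads; ownership is forced to www-data so
// PHP running as uid 33 can write, regardless of the directory mode bits.
const uploadsAp = fileSystem.addAccessPoint('UploadsAp', {
  path: '/uploads',
  createAcl: { ownerUid: '33', ownerGid: '33', permissions: '775' },
  posixUser: { uid: '33', gid: '33' },
});

taskDef.addVolume({
  name: 'uploads',
  efsVolumeConfiguration: {
    fileSystemId: fileSystem.fileSystemId,
    transitEncryption: 'ENABLED',
    authorizationConfig: { accessPointId: uploadsAp.accessPointId, iam: 'ENABLED' },
  },
});

// Mount only the uploads subtree; themes/plugins stay baked into the image.
web.addMountPoints({
  containerPath: '/usr/local/apache2/htdocs/wp-content/uploads',
  sourceVolume: 'uploads',
  readOnly: false,
});
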
mreferre commented 4 years ago

@YustinS Thanks.

...but the fallback I can see (and have yet to validate) is to use EC2-based ECS with EFS mounted through as a full bind, and have the UserData do the hydration step for me (either spinning up the container and pulling the data directly, or using an S3 bucket to simply house it in the interim)

Can you walk me through the workflow there? E.g. mount the EFS volume on the EC2 instance, mount the container directory as a bind mount, create logic to hydrate if the EFS volume is empty, otherwise use what's on the volume?

@sky4git one more question. Your workflow makes sense at first startup of the container (i.e. it populates/hydrates the EFS volume). What is your expectation for the content of that shared EFS directory when you lifecycle your containers (i.e. when they just restart, or you deploy a new version of the image)? I assume part of that EFS folder will have changed since the first deployment (i.e. wp-content hosts new files that have been uploaded at run-time as you use WordPress?). Do you expect the container content to be obfuscated by the EFS volume content? Or do you want to nuke the EFS content again and copy back the container content? (I assume the former, but I am asking.)

sky4git commented 4 years ago

Case 1:

In my workflow, when using WordPress with Git, I create a new image of the container on each code commit. In this case, I only want to mount EFS to the wp-content/uploads directory. With WordPress, the wp-content directory still has other subdirectories such as /themes, /plugins, /mu-plugins. These directories remain part of the container image; only the /uploads directory, which holds the uploaded media files, is mounted with EFS. However, though it sounds simple, even with an access point with permission "777", the /uploads folder never became writable for my container. Not sure why.
So the expectation is that the mounted EFS directory needs to be writable by container scripts (even if the container runs as a non-root user), in this case PHP. (Maybe I need a tutorial.)

Case 2:

Second case: all the code lives on the EFS volume. In this case, when a Git commit happens, I need to replace the existing content on the EFS volume with the new content from the commit. This means all the content of the EFS volume first needs to be deleted and then repopulated with the new content. This workflow is problematic; in particular, deleting and repopulating takes quite long. In my case I tried rsync, and it takes around 7 minutes through a CodeBuild EFS mount. With a container entrypoint script it will take even longer, I guess, if the container size is small. Because I can't make Case 1 work, Case 2 is the only option for me now.

In a Git-based scenario, the EFS volume only needs to hold non-committable data; repopulating the EFS volume on each Git commit is not suitable.
In a non-Git scenario, populating the EFS volume would ideally happen through the VOLUME instruction of the Dockerfile, not through an entrypoint script. Entrypoint-script population is a workaround.

I hope my explanation helps.

YustinS commented 4 years ago

Hi @mreferre, essentially what I have reverted to is a few steps inside the UserData to achieve the goal I required. In the situation I described, I was using the Dockerfile to generate the data into the volume (which was then masked and not usable), so fortunately I could deconstruct the original Dockerfile a bit and move some logic over to the UserData. Hopefully this example is useful.

user-data.txt

Essentially, when the server initially boots, we go through the regular motions of checking for the EFS mount (using regular NFS mounting), then we check whether the data has been hydrated onto the EFS mount or not. Once this check and the UserData have completed, when the container starts we simply use bind mounting to pass that EFS mount through to the container, which can then use the data that was seeded. The other advantage, of course, is that we can manipulate the data if it needs any changes.
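
As a rough illustration of that boot sequence in CDK (again an untested sketch: the file system id, mount path, and myapp image are placeholders, and it only mirrors the attached user-data.txt in spirit):

import * as ec2 from 'aws-cdk-lib/aws-ec2';

const userData = ec2.UserData.forLinux();
userData.addCommands(
  'yum install -y amazon-efs-utils',
  'mkdir -p /mnt/efs',
  'mount -t efs fs-12345678:/ /mnt/efs', // placeholder file system id
  // Hydrate only if the share is still empty, so reboots do not clobber data.
  'if [ -z "$(ls -A /mnt/efs)" ]; then',
  '  docker run --rm -v /mnt/efs:/seed --entrypoint sh myapp -c "cp -a /app/. /seed/"',
  'fi',
);
// The ECS-optimized instances carrying this UserData then expose /mnt/efs
// to tasks as a host path volume / bind mount in the task definition.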

By no measure would I call this workaround pretty, but in theory the same thing could be done to hydrate a volume, if we were using the same image, by bind mounting an alternate directory in the container (say /mnt for argument's sake) and then using a command to move data from the intended volume (/app) to the alternate directory (/mnt). Then, when ECS fires up the task, it should mount the container at the normal volume location (/app) and the data should be there. In that case the command becomes something like the following (unvalidated, of course) if we use the same logic as the attached UserData:

docker run -v ${efs_mountpoint}:/mnt --entrypoint sh container -c 'cp -a /app/. /mnt/'

rkajbaf commented 3 years ago

I have a similar issue to @sky4git's and can't, for the life of me, make my EFS volume writable. For context, I am trying to mount the EFS volume to a Fargate task containing a single Docker image, atmoz/sftp, which is a simple SFTP server.

Locally it works as expected, but on AWS, using a task definition, I can only read my volume.

EDIT: It appears that if you ever create a root folder with bad permissions when creating your access point, the only recourse I found was to trash the whole EFS volume and start over, which worked.

BahodurSaidov commented 3 years ago

Any updates? If this problem still exists, then I don't see any proper benefit of EFS as persistent storage...