Closed petderek closed 4 years ago
There are two ways we handle EFS tasks:
Agent has logic to fall back to the local driver if the plugin isn't bundled with ecs-init. We may need to update this to also check that the plugin is observable by Docker. In other words:
# current behavior
if plugin_available:
    use_plugin
# proposed behavior
if plugin_available and plugin_registered_with_docker:
    use_plugin
Behavior may be subtly different in upstart vs systemd -- so we'll need to investigate this further.
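The proposed check can be sketched in shell as follows. This only illustrates the decision logic and is not the agent's actual code; the function name `choose_volume_driver` and its two true/false arguments are made up for this example.

```shell
# Hypothetical sketch of the proposed fallback logic (not the agent's real code).
# $1: "true" if the plugin is bundled with ecs-init
# $2: "true" if Docker can actually see the plugin
choose_volume_driver() {
    plugin_available="$1"
    plugin_registered_with_docker="$2"
    if [ "$plugin_available" = "true" ] && [ "$plugin_registered_with_docker" = "true" ]; then
        echo "amazon-ecs-volume-plugin"
    else
        # Fall back to Docker's built-in local driver.
        echo "local"
    fi
}
```

With this sketch, `choose_volume_driver true false` prints `local`, which is the case the current check gets wrong.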
I think the problem is that the agent checks this env and uses the volume plugin if it contains "efsAuth", which ecs-init always specifies regardless of whether the volume plugin service is enabled to start by default.
Can this be fixed by embedding a config on the AMI and only specifying "efsAuth" when that config is true, similar to what has been done for GPU support (https://github.com/aws/amazon-ecs-init/blob/6316c16de28dd7236fb520d307bd6086fc8b64a5/ecs-init/docker/docker.go#L312)?
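A rough sketch of that idea, written in shell for brevity rather than in ecs-init's own language: advertise "efsAuth" only when an AMI-level flag is true. The function name, the flag argument, and the JSON-list output format are all assumptions made for illustration.

```shell
# Hypothetical: print the capability list that would be passed to the agent.
# $1: "true" when the AMI-level config says the volume plugin is enabled.
volume_plugin_capabilities() {
    if [ "$1" = "true" ]; then
        echo '["efsAuth"]'
    else
        # Plugin not enabled on this AMI: advertise nothing.
        echo '[]'
    fi
}
```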
This issue looks to be isolated to EFS preview customers. The GA version of EFS was released in version 20200319 (v1.38.0) of the ECS-optimized AMIs.
Customers who are using an ECS-optimized AMI earlier than version 20200319 (v1.38.0) should upgrade the AMI version to 20200319 (v1.38.0) or later to take advantage of the configuration required by GA EFS support on ECS, which includes enabling the amazon-ecs-volume-plugin by default. For more information, see Amazon ECS Optimized AMI Versions.
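As a quick illustration of the cutoff, the AMI date stamps can be compared as plain integers. The helper name `ami_needs_upgrade` is made up, and how you obtain the date stamp for a given instance is left out of this sketch.

```shell
# Date stamp of the first ECS-optimized AMI with GA EFS support.
GA_AMI_VERSION=20200319

# Succeeds (exit 0) when the given AMI date stamp predates the GA release.
# $1: AMI date stamp, e.g. 20191212
ami_needs_upgrade() {
    [ "$1" -lt "$GA_AMI_VERSION" ]
}
```

For example, `ami_needs_upgrade 20191212 && echo "upgrade recommended"` prints the message, while the same call with `20200319` prints nothing.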
I'm testing the upgrade path for customers using AL2 20191212 (v1.35.0) ECS Optimized with EFS preview. This should work for AL2 AMIs 20191212 (v1.35.0) and later.
First we need to enable Docker via extras and update it:
sudo amazon-linux-extras enable docker
sudo yum clean metadata
sudo yum install docker
This will update Docker to v19.03.6-ce. Next we'll do:
sudo yum update
This will install the latest agent, 1.39.0, but we need to restart the agent for this to take effect:
sudo systemctl stop ecs
sudo systemctl start ecs
Next we need to install amazon-efs-utils (note this package is installed by default for AMIs 1.36.0 and later):
sudo yum install amazon-efs-utils
And we need to enable the plugin:
sudo systemctl enable --now amazon-ecs-volume-plugin
Finally, we'll restart Docker:
sudo systemctl restart docker
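Before testing, it may help to confirm that all three services touched by the steps above are running. The following is a minimal sketch; the function name `check_efs_units` is made up, and the unit names are the ones used in the commands above.

```shell
# Report whether each service involved in the upgrade is running.
# Returns non-zero if any unit is not active.
check_efs_units() {
    rc=0
    for unit in docker ecs amazon-ecs-volume-plugin; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit: active"
        else
            echo "$unit: NOT active"
            rc=1
        fi
    done
    return "$rc"
}
```

Run `check_efs_units` on the instance; any line reading "NOT active" points at the step that needs to be repeated.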
You can test the install with an EFS filesystem in the same subnet as your upgraded instance:
docker volume create --name <volume_name> -d amazon-ecs-volume-plugin --opt o=tls,ro --opt type=efs --opt device=<efs_filesystem_id>
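If you want the test to clean up after itself, the create command can be wrapped with an inspect and a remove. This is only a sketch: the function name and the volume name `efs-smoke-test` are arbitrary choices for this example, and the filesystem ID is whatever you pass in.

```shell
# Create a throwaway EFS-backed volume, confirm Docker sees it, then remove it.
# $1: EFS filesystem ID (placeholder, supplied by the caller)
efs_volume_smoke_test() {
    fs_id="$1"
    vol="efs-smoke-test"   # arbitrary name chosen for this sketch
    docker volume create --name "$vol" -d amazon-ecs-volume-plugin \
        --opt o=tls,ro --opt type=efs --opt device="$fs_id" >/dev/null || return 1
    docker volume inspect "$vol" >/dev/null || return 1
    docker volume rm "$vol" >/dev/null
}
```

Call it as `efs_volume_smoke_test <efs_filesystem_id>`; a zero exit status means the plugin created and removed the volume.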
This should work for AL1 AMIs 20191212 (v1.35.0) and later.
First we'll do a yum update, which will upgrade Docker to v19.03.6-ce and the agent to version 1.39.0:
sudo yum update
Next we need to install amazon-efs-utils (note this package is installed by default for AMIs 1.36.0 and later):
sudo yum install amazon-efs-utils
Next we'll need to reboot the instance so the plugin can be started up via upstart's init:
sudo shutdown -r now
After the instance reboots, you can test the install with an EFS filesystem in the same subnet as your upgraded instance:
docker volume create --name <volume_name> -d amazon-ecs-volume-plugin --opt o=tls,ro --opt type=efs --opt device=<efs_filesystem_id>
The AWS docs have been updated to reflect simplified versions of the above instructions:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/efs-volumes.html
Summary
Customers using Preview support for EFS on ECS can experience issues with the amazon-ecs-volume-plugin not starting up when they perform an agent upgrade.
Description
It looks like this can happen with the following steps:
Suggestions
If you're observing this, here are some things you can try to remedy the situation:
systemctl enable --now amazon-ecs-volume-plugin
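The same remedy can be made safe to re-run by checking the unit's state first. A minimal sketch for systemd-based (AL2) instances, with a made-up function name:

```shell
# Enable and start the volume plugin only if it is not already running.
ensure_volume_plugin() {
    unit=amazon-ecs-volume-plugin
    if systemctl is-active --quiet "$unit"; then
        echo "$unit already running"
    else
        sudo systemctl enable --now "$unit"
    fi
}
```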