Open mreferre opened 3 years ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Not to be closed
This issue has been automatically marked as not stale anymore due to the recent activity.
I am also experiencing this problem when I run the docker compose up command from this step of the AWS Docker Workshop Tutorial:
https://docker.awsworkshop.io/31_docker_ecs_integration/10_migrate_to_ecs.html
My Docker CLI version (running on Amazon Linux 2, NOT Ubuntu as the workshop recommends):
$ docker version
Client:
 Cloud integration: 1.0.17
 Version: 20.10.7
 API version: 1.41
 Go version: go1.15.14
 Git commit: f0df350
 Built: Tue Sep 28 19:55:50 2021
 OS/Arch: linux/amd64
 Context: ecs
 Experimental: true
Server:
 Engine:
  Version: 20.10.7
  API version: 1.41 (minimum version 1.12)
  Go version: go1.15.14
  Git commit: b0f5bc3
  Built: Tue Sep 28 19:56:28 2021
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.4.6
  GitCommit: d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc:
  Version: 1.0.0
  GitCommit: 84113eef6fc27af1b01b3181f31bbaf708715301
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
I have the same issue. I'm running the same version as you, but on Ubuntu:
$ docker version
Client: Docker Engine - Community
 Cloud integration: 1.0.17
 Version: 20.10.9
 API version: 1.41
 Go version: go1.16.8
 Git commit: c2ea9bc
 Built: Mon Oct 4 16:08:29 2021
 OS/Arch: linux/amd64
 Context: myecs
 Experimental: true
Server: Docker Engine - Community
 Engine:
  Version: 20.10.9
  API version: 1.41 (minimum version 1.12)
  Go version: go1.16.8
  Git commit: 79ea9d3
  Built: Mon Oct 4 16:06:37 2021
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.4.11
  GitCommit: 5b46e404f6b9f661a205e28d59c982d3634148f8
 runc:
  Version: 1.0.2
  GitCommit: v1.0.2-0-g52b36a2
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
So, is the difference the operating system?
UPDATE: Same issue running Docker on Amazon Linux 2. It appears that CloudFormation does not create the security group for EFS: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/tutorial-efs-volumes.html#efs-security-group
I have never used this on a linux host and I have only used it from Mac (using Docker Desktop). I doubt the problem is the Linux version (because I have seen it occurring on MacOS, albeit randomly).
So on Windows or Mac it works?
UPDATE: Yes, on Windows it works!
I am on a Mac and I have seen this error message randomly 3 or 4 times out of dozens (hundreds?) of docker compose up runs.
I see the same error using AWS CloudFormation in the "AWS::ECS::TaskDefinition" resource template.
I've just experienced this when trying to use it within WSL2 (Ubuntu 20.04) on Windows 10. So far, I've run docker compose up twice and it has failed twice.
My compose file is:
version: "3"
x-aws-vpc: "vpc-xxxxxxxx"
services:
  sonarqube:
    image: sonarqube:9.1-community
    depends_on:
      - db
    volumes:
      - sonarqube_data:/opt/sonarqube/data
      - sonarqube_extensions:/opt/sonarqube/extensions
      - sonarqube_logs:/opt/sonarqube/logs
    ports:
      - target: 9000
        x-aws-protocol: http
  db:
    image: postgres:12
    volumes:
      - postgresql:/var/lib/postgresql
      - postgresql_data:/var/lib/postgresql/data
volumes:
  sonarqube_data:
  sonarqube_extensions:
  sonarqube_logs:
  postgresql:
  postgresql_data:
The error I'm getting is:
DbService TaskFailedToStart: ResourceInitializationError: failed to invoke EFS utils commands to set up EFS volumes: stderr: b'mount.nfs4: Connection reset by peer' : unsuccessful EFS utils command execution; code: 32
I've found this forum posting: https://forums.aws.amazon.com/thread.jspa?threadID=321135, and I'm wondering if I need to do any configuration to my VPC or similar in order to allow EFS to work?
This is the first time I've tried using docker compose with ECS, so I'm not sure how much it should be doing completely automatically, which is why I'm adding to this existing issue. Happy to go elsewhere if this isn't the same problem.
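For anyone collecting these failures from logs, here is a minimal Python sketch that recognises the error line quoted above and splits it into its parts. The function and type names are made up for illustration, and the pattern is derived only from the message as it appears in this thread, so other variants of the message may not match.

```python
import re
from typing import NamedTuple, Optional


class EfsMountFailure(NamedTuple):
    """Parts of an ECS 'failed to invoke EFS utils commands' message."""
    service: str    # CloudFormation logical name, e.g. "DbService"
    stderr: str     # what mount.nfs4 printed
    exit_code: int  # exit status reported by the EFS utils command


# Pattern built from the message quoted in this thread (an assumption:
# ECS may word other failures differently).
_PATTERN = re.compile(
    r"^(?P<service>\S+) TaskFailedToStart: ResourceInitializationError: "
    r"failed to invoke EFS utils commands to set up EFS volumes: "
    r"stderr: b'(?P<stderr>[^']*)' : "
    r"unsuccessful EFS utils command execution; code: (?P<code>\d+)$"
)


def parse_efs_mount_failure(line: str) -> Optional[EfsMountFailure]:
    """Return the parsed failure, or None if the line is something else."""
    match = _PATTERN.match(line.strip())
    if match is None:
        return None
    return EfsMountFailure(match["service"], match["stderr"], int(match["code"]))
```

With the message above, this yields the service name DbService, the stderr text "mount.nfs4: Connection reset by peer", and exit code 32.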
I experience this issue as well during experimenting with docker compose and ECS.
What's frustrating is that things would be working fine for a while and then suddenly this issue would appear seemingly at random. To resolve the problem I have to delete my EFS volume completely and let docker compose recreate it... which isn't ideal.
I found some solutions saying to allow inbound 2049 (NFS) in the security group, but that seems irrelevant when my inbound security group is already allowing all traffic from all ports from within the security group.
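To double-check the "my security group already allows everything from within the group" assumption, a small Python sketch like the following can be run over the ingress rules of the group attached to the EFS mount targets. It assumes the rules are dictionaries shaped like the `IpPermissions` entries returned by EC2 DescribeSecurityGroups; the function name is hypothetical, and CIDR-based sources are deliberately ignored.

```python
from typing import Any, Dict, List

NFS_PORT = 2049  # port used by EFS mount targets


def allows_nfs_from_group(ip_permissions: List[Dict[str, Any]],
                          source_group_id: str) -> bool:
    """Return True if any ingress rule admits TCP 2049 from source_group_id.

    ip_permissions is assumed to follow the EC2 DescribeSecurityGroups
    'IpPermissions' layout. Only security-group sources are checked.
    """
    for rule in ip_permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            port_ok = True  # "-1" means all protocols and all ports
        elif protocol == "tcp":
            port_ok = rule.get("FromPort", 0) <= NFS_PORT <= rule.get("ToPort", 65535)
        else:
            continue  # udp, icmp, etc. cannot carry the NFS mount
        if not port_ok:
            continue
        pairs = rule.get("UserIdGroupPairs", [])
        if any(pair.get("GroupId") == source_group_id for pair in pairs):
            return True
    return False
```

If this returns True for the group shared by the tasks and the mount targets, the ingress side at least is unlikely to explain the "Connection reset by peer" error, which matches what is described above.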
Deleting the EFS and typing docker compose up worked for me.
If anyone is getting the "failed to invoke EFS utils commands" error when attempting to reuse existing EFS volumes I've found a solution to this issue in this thread https://github.com/docker/compose-cli/issues/1739.
It's a bit hacky and requires some hard coding, but it solved a lot of EFS related issues for me.
I'm experiencing this too.
I'm letting docker compose make the volumes rather than using external ones, and I'm not overriding any of the automatically created resources.
It seems to fail the vast majority of the time, but not always. Around 1 in 10 docker compose up runs succeed.
It sounds very similar to this unresolved issue in efs-utils: https://github.com/aws/efs-utils/issues/32
Also hitting this issue, and I'm at < 10% success rate with a docker compose up working with EFS. I've tried letting compose create the volumes and I've tried with existing EFS volumes I've created, and both have failed with this same error message.
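Since the failure is reported here as intermittent, one stop-gap is to simply retry. Below is a minimal Python sketch of a retry loop with exponential backoff, plus the arithmetic for how far retries get you at a given success rate. Everything in it is hypothetical (names, defaults, and the assumption that attempts are independent, which the reports in this thread do not establish); it is not a fix for the underlying problem.

```python
import time
from typing import Callable


def retry_with_backoff(attempt: Callable[[], bool],
                       max_attempts: int = 5,
                       base_delay: float = 30.0,
                       sleep: Callable[[float], None] = time.sleep) -> int:
    """Call attempt() until it returns True.

    Returns the 1-based number of the successful attempt, or 0 if every
    attempt failed. The delay doubles after each failure.
    """
    for n in range(1, max_attempts + 1):
        if attempt():
            return n
        if n < max_attempts:
            sleep(base_delay * 2 ** (n - 1))
    return 0


def chance_of_any_success(p: float, attempts: int) -> float:
    """Probability of at least one success, ASSUMING independent attempts."""
    return 1.0 - (1.0 - p) ** attempts


# Hypothetical usage (assumes a failed deployment makes the CLI exit non-zero):
# import subprocess
# retry_with_backoff(lambda: subprocess.run(["docker", "compose", "up"]).returncode == 0)
```

At the roughly 10% success rate mentioned above, five independent attempts would succeed only about 41% of the time, so retrying alone is a weak workaround.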
Have also been fighting this issue for days. There seems to be no reliable solution to it, other than giving up on ECS altogether and migrating to Kubernetes.
@Eternal21 Yes, my conclusion after discovering this and other breaking bugs was the Compose ECS context isn't ready yet. My solution was to switch to using AWS Copilot, which has its own quirks but seems to be better supported. It doesn't directly support Compose files unfortunately, but the API is quite simple and familiar.
@FraserThompson A shame really, but when something seems too good to be true, it usually is. Seems to be the case with being able to re-use compose files with ECS. I'm sure it works fine for some pet projects (as long as you are not using volumes apparently), but this is definitely not production ready.
I released ECS Compose-X weeks before both the ECS plugin and copilot came out. I thought indeed I'd be able to retire the project and just focus on other areas but from the issues in here and the fact that copilot created its own syntax instead of sticking to compose, I did not.
I have been using x-efs/volumes in compose-x for a while now, tested with both the ECS-optimized AMI and Fargate, and haven't had many issues at all with mounting to the access points.
Also, the fact that this plugin treats all top-level volumes as EFS endpoints is, well, not great, considering you can have shared ephemeral docker volumes for many use-cases. I do that all the time to load up configuration that was stored in S3/SSM etc. and leverage that for FireLens.
Would you guys (I suppose @Eternal21 and @FraserThompson ) share your compose files and indicate whether you expect to connect to an existing EFS, a new one, or if the volume is not supposed to be on EFS ? Just trying to capture use-cases wherever possible, as I don't intend to stop working on ECS Compose-X anytime soon. Use it for prod at work, so got to keep it interesting :)
@JohnPreston
Here's the smallest compose file I can reproduce the issue with. Happens regardless of whether I try to connect to a volume that already exists on EFS or one that needs to be created:
services:
  mysql-db:
    image: mysql:8.0.25
    volumes:
      - data:/var/lib/myslq
    environment:
      MYSQL_ROOT_PASSWORD: my_root_pass
      MYSQL_DATABASE: my_db
      MYSQL_USER: my_user
      MYSQL_PASSWORD: my_pass
volumes:
  data:
EDIT: I just had another go at it today, using the sample above, and this time I can only reproduce the issue when the volume already exists (in other words, it works the first time you compose up the above, but not afterwards if you compose down and then try to compose up again). Possibly related to issue #1739 brought up earlier in the thread, although the error I'm getting is still:
MysqldbService TaskFailedToStart: ResourceInitializationError: failed to invoke EFS utils commands to set up EFS volumes: stderr: b'mount.nfs4: Connection reset by peer' : unsuccessful EFS utils command execution; code: 32
C:\...>ecs-minimal>docker compose up
level=warning msg="services.scale: unsupported attribute"
[+] Running 18/18
- ecs-minimal DeleteComplete 293.1s
- MysqldbTaskExecutionRole DeleteComplete 241.1s
- DataAccessPoint DeleteComplete 249.1s
- LogGroup DeleteComplete 241.1s
- DefaultNetwork DeleteComplete 257.1s
- Cluster DeleteComplete 239.1s
- CloudMap DeleteComplete 286.1s
- DataNFSMountTargetOnSubnet1 DeleteComplete 246.0s
- DataNFSMountTargetOnSubnet2 DeleteComplete 246.0s
- DataNFSMountTargetOnSubnetba3 DeleteComplete 245.0s
- DataNFSMountTargetOnSubnet4 DeleteComplete 246.0s
- DefaultNetworkIngress DeleteComplete 183.1s
- DataNFSMountTargetOnSubnet5 DeleteComplete 245.0s
- DataNFSMountTargetOnSubnet6 DeleteComplete 246.0s
- MysqldbTaskRole DeleteComplete 231.1s
- MysqldbTaskDefinition DeleteComplete 209.0s
- MysqldbServiceDiscoveryEntry DeleteComplete 187.0s
- MysqldbService DeleteComplete
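Because the failure above shows up only when the volume already exists, and deleting the EFS file system was reported earlier in the thread as a workaround, it can help to identify which file system belongs to a given compose project and volume. The Python sketch below filters a list shaped like the `FileSystems` array from the EFS DescribeFileSystems response. The tag keys are an assumption (they are passed as parameters so they can be changed), and the function name is hypothetical.

```python
from typing import Any, Dict, List


def find_compose_file_systems(
    file_systems: List[Dict[str, Any]],
    project: str,
    volume: str,
    project_tag: str = "com.docker.compose.project",  # assumed tag key
    volume_tag: str = "com.docker.compose.volume",    # assumed tag key
) -> List[str]:
    """Return FileSystemId values whose tags match the project and volume.

    file_systems is assumed to follow the EFS DescribeFileSystems
    'FileSystems' layout, where each entry carries a 'Tags' list of
    {'Key': ..., 'Value': ...} dictionaries.
    """
    matches = []
    for fs in file_systems:
        tags = {t["Key"]: t["Value"] for t in fs.get("Tags", [])}
        if tags.get(project_tag) == project and tags.get(volume_tag) == volume:
            matches.append(fs["FileSystemId"])
    return matches
```

For the minimal example above, the call would use project "ecs-minimal" and volume "data"; whether the integration actually tags file systems this way should be verified in the EFS console first.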
Thanks @Eternal21
I added 2 things to your docker-compose file, as follows (and corrected data:/var/lib/myslq to data:/var/lib/mysql):
services:
  mysql-db:
    image: mysql:8.0.25
    volumes:
      - data:/var/lib/mysql
    environment:
      MYSQL_ROOT_PASSWORD: my_root_pass
      MYSQL_DATABASE: my_db
      MYSQL_USER: my_user
      MYSQL_PASSWORD: my_pass
    x-ecs:
      EnableExecuteCommand: true
volumes:
  data:
    x-efs:
      Properties:
        LifecyclePolicies:
          TransitionToIA: AFTER_14_DAYS
      MacroParameters:
        EnforceIamAuth: False
then did ecs-compose-x up -d templates -n eternal21-efs -f eternal21-efs.yaml
this created
x-ecs:
  EnableExecuteCommand: true
That's enabled just to run commands remotely, as I deployed on Fargate.
aws ecs execute-command --cluster eternal21-efs --interactive --task 01b7e2ade0d34215876bab1b504e1ca9 --command "ls -l /var/lib/mysql/"
The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.
Starting session with SessionId: ecs-execute-command-033454ac104323321
total 198072
-rw-r----- 1 mysql mysql 196608 Jun 28 14:24 '#ib_16384_0.dblwr'
-rw-r----- 1 mysql mysql 8585216 Jun 28 14:24 '#ib_16384_1.dblwr'
drwxr-x--- 2 mysql mysql 6144 Jun 28 14:24 '#innodb_temp'
-rw-r----- 1 mysql mysql 56 Jun 28 14:24 auto.cnf
-rw-r----- 1 mysql mysql 3119469 Jun 28 14:24 binlog.000001
-rw-r----- 1 mysql mysql 156 Jun 28 14:24 binlog.000002
-rw-r----- 1 mysql mysql 32 Jun 28 14:24 binlog.index
-rw------- 1 mysql mysql 1676 Jun 28 14:24 ca-key.pem
-rw-r--r-- 1 mysql mysql 1112 Jun 28 14:24 ca.pem
-rw-r--r-- 1 mysql mysql 1112 Jun 28 14:24 client-cert.pem
-rw------- 1 mysql mysql 1676 Jun 28 14:24 client-key.pem
-rw-r----- 1 mysql mysql 5401 Jun 28 14:24 ib_buffer_pool
-rw-r----- 1 mysql mysql 50331648 Jun 28 14:24 ib_logfile0
-rw-r----- 1 mysql mysql 50331648 Jun 28 14:24 ib_logfile1
-rw-r----- 1 mysql mysql 12582912 Jun 28 14:24 ibdata1
-rw-r----- 1 mysql mysql 12582912 Jun 28 14:25 ibtmp1
drwxr-x--- 2 mysql mysql 6144 Jun 28 14:24 my_db
drwxr-x--- 2 mysql mysql 6144 Jun 28 14:24 mysql
-rw-r----- 1 mysql mysql 31457280 Jun 28 14:24 mysql.ibd
drwxr-x--- 2 mysql mysql 14336 Jun 28 14:24 performance_schema
-rw------- 1 mysql mysql 1680 Jun 28 14:24 private_key.pem
-rw-r--r-- 1 mysql mysql 452 Jun 28 14:24 public_key.pem
-rw-r--r-- 1 mysql mysql 1112 Jun 28 14:24 server-cert.pem
-rw------- 1 mysql mysql 1676 Jun 28 14:24 server-key.pem
drwxr-x--- 2 mysql mysql 6144 Jun 28 14:24 sys
-rw-r----- 1 mysql mysql 16777216 Jun 28 14:24 undo_001
-rw-r----- 1 mysql mysql 16777216 Jun 28 14:24 undo_002
Exiting session with sessionId: ecs-execute-command-033454ac104323321.
aws ecs execute-command --cluster eternal21-efs --interactive --task 01b7e2ade0d34215876bab1b504e1ca9 --command "mysql -umy_user -pmy_pass my_db"
The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.
Starting session with SessionId: ecs-execute-command-05144ac9523411f9e
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.25 MySQL Community Server - GPL
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show tables;
Empty set (0.00 sec)
mysql> create table tata(user_id int);
Query OK, 0 rows affected (0.15 sec)
mysql>
Exiting session with sessionId: ecs-execute-command-05144ac9523411f9e.
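For repeating checks like the two sessions above from a script, the command line can be assembled programmatically. This is a minimal Python sketch that builds the same `aws ecs execute-command` invocation as an argument list. The helper name is hypothetical; the flags are the ones used in the transcripts, plus the optional `--container` flag for tasks with more than one container.

```python
import shlex
from typing import List, Optional


def execute_command_argv(cluster: str, task: str, command: str,
                         container: Optional[str] = None) -> List[str]:
    """Build the argv for 'aws ecs execute-command' (nothing is executed)."""
    argv = [
        "aws", "ecs", "execute-command",
        "--cluster", cluster,
        "--interactive",
        "--task", task,
        "--command", command,
    ]
    if container is not None:
        argv += ["--container", container]
    return argv


if __name__ == "__main__":
    argv = execute_command_argv(
        "eternal21-efs",
        "01b7e2ade0d34215876bab1b504e1ca9",
        "ls -l /var/lib/mysql/",
    )
    # shlex.join quotes the command so it can be pasted into a shell.
    print(shlex.join(argv))
```

Passing the list to subprocess.run avoids shell quoting problems with commands that contain spaces; it still needs the Session Manager plugin installed, as the transcript shows.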
I uploaded the docker-compose file used and the rendered templates for CFN at:
s3://public.compose-x.io/github.com/eternal21/efs/eternal21-efs.yaml # docker-compose file
s3://public.compose-x.io/github.com/eternal21/efs/eternal21-efs.zip # CFN templates
EDIT1: I added x-efs on the volume because by default I consider the docker volume to be "a normal docker volume", not EFS, which IMHO, is not the right thing to do.
EDIT2: although IAM auth is turned off (only because no specific userid/groupid is set for user in the compose file), it still created an access point for the service and added the IAM permissions needed to initiate the secure IAM-based auth over TLS to the endpoint, etc.
@JohnPreston Thanks. I was a little confused about the x-ecs tag, because I didn't see it in the docker compose ECS documentation, but after googling around, I realize that you are talking about a completely separate tool: https://docs.compose-x.io/index.html#index--page-root
I don't have Python on my machine, so can't test it at the moment, but if I ever decide to use Docker with ECS in the future this seems like the way to go, since the official tooling simply does not work.
Sorry, yes, my bad on that. I had it linked in my previous answer, but yes, ECS Compose-X is a different tool altogether. If you don't have Python on your machine but have Docker, which I guess you do, you can run the command with
docker run --rm -it -v <path to your windows aws home>:/root/.aws -v <equivalent of $PWD>:/tmp public.ecr.aws/compose-x/compose-x:latest --help
See https://gallery.ecr.aws/compose-x/compose-x
I am trying to add / update content in the "labs" for various use-cases, these days mostly a lot of kafka, and this year focusing on adding a lot of monitoring / logging related content. So if there is ever a use-case, feel free to ping me.
Good luck in your journey with ECS
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Stay alive
still occurs
This issue has been automatically marked as not stale anymore due to the recent activity.
I am using the compose / ECS integration and with the compose example below I am randomly getting this CloudFormation exception:
I am not able to recreate the problem consistently. It appears to be some sort of race condition that only occurs randomly. I am opening this issue just so that other people searching for the same error string can find it and +1 it (should they see it) for further investigation.
This is an example of when the issue occurs (in the very next up with the same compose, it works):
This is the compose I am using (for reference, see this Stack Overflow question):