Closed kinghuang closed 5 years ago
Here are a couple more tests: c4.large and r4.large with 20 GB volumes. Both seem normal.
c4.large
~ $ df -h /
Filesystem Size Used Available Use% Mounted on
overlay 19.7G 262.0M 18.4G 1% /
r4.large
~ $ df -h /
Filesystem Size Used Available Use% Mounted on
overlay 19.7G 262.0M 18.4G 1% /
System log from an m5.large instance showing 3.7 GB for a 20 GB volume.
https://gist.github.com/kinghuang/9842ee461a6ebe9b66e47c0e1f3a6eb1
~ $ df -h /
Filesystem Size Used Available Use% Mounted on
overlay 3.7G 302.5M 3.5G 8% /
This seems to be the problematic bit that differs on m5/c5 instances.
* Configuring host block device ... * ERROR: automount failed to start
https://gist.github.com/kinghuang/9842ee461a6ebe9b66e47c0e1f3a6eb1#file-system-log-L392
Thanks for the clear steps to replicate this @kinghuang
Another good way to see this is by adding lsblk via apk --update add util-linux.
The output on a c5/m5 machine will be:
/ # lsblk
/ #
Any other instance will be:
/ # lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvdb 202:16 0 100G 0 disk
└─xvdb1 202:17 0 100G 0 part /var
/ #
Turns out Amazon exposes the disk under a completely different device name on these instances, which means that we fail to see it, and thus fail to mount it 😔 https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/nvme-ebs-volumes.html
I'll update our support for NVMe in order to get this resolved.
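As a rough illustration of the naming difference described above, here is a minimal sketch that lists candidate EBS disks under both naming schemes. The helper name find_ebs_disks and the glob patterns are assumptions made for this example; this is not the Moby Linux automount code.

```shell
#!/bin/sh
# Sketch only: list candidate EBS disk names under a /dev-like directory.
# Xen-based instances expose EBS volumes as xvd* devices, while c5/m5
# instances expose them as NVMe namespaces such as nvme0n1.
# find_ebs_disks is a hypothetical helper, not part of Moby Linux.
find_ebs_disks() {
    dev_dir="${1:-/dev}"
    for path in "$dev_dir"/xvd[a-z] "$dev_dir"/nvme[0-9]*n[0-9]*; do
        [ -e "$path" ] || continue      # skip glob patterns that matched nothing
        name="${path##*/}"              # strip the directory prefix
        case "$name" in
            *p[0-9]*) continue ;;       # skip NVMe partitions such as nvme0n1p1
        esac
        echo "$name"
    done
}
```

Calling find_ebs_disks with no argument inspects /dev. The directory argument is there only so the function can be tried against a scratch directory.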
@FrenchBen is this fixed?
Can we deploy a stack using m5/c5 instances without this issue now?
Sorry for double checking, but as the code for the docker-for-aws template is not present in this repository, we cannot find out when you make your changes.
I strongly suggest that you put the latest CloudFormation template file in this repository.
I briefly tried the new Moby Linux 18.03.0-ce-aws2 image (ami-3260064a in us-west-2) a few days ago and it seems to be fixed. Haven't tested it extensively yet, though.
I tried 18.03.1-ce-aws1 (outside of Docker for AWS) yesterday on c5.2xlarge instances, and it does not work. The EBS volume is created and attached to the EC2 instance, but the docker service never comes up because the volume fails to mount. Here are the relevant error messages from the docker service:
Aug 13 16:48:25 dockerd[9408]: level=error msg="time=\"2018-08-13T20:48:25Z\" level=error msg=\"could not fetch volume: Volume Not Found\" name=CloudStorTest-1-vol operation=get " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:25 dockerd[9408]: level=error msg="time=\"2018-08-13T20:48:25Z\" level=info msg=\"Volume does not exist. Create fresh EBS\" name=CloudStorTest-1-vol operation=createEBS options=map[backing:relocatable ebstype:gp2 size:25] " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg="time=\"2018-08-13T20:48:45Z\" level=info msg=\"Volume creation in new AZ succeeded: {" plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" AvailabilityZone: \"us-east-1b\"," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" CreateTime: 2018-08-13 20:48:25.366 +0000 UTC," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" Encrypted: false," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" Iops: 100," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" Size: 25," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" SnapshotId: \"\"," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" State: \"available\"," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" Tags: [{" plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" Key: \"StackID\"," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" Value: \"d41d8cd98f00b204e9800998ecf8427e\"" plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" },{" plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" Key: \"CloudstorVolumeName\"," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" Value: \"CloudStorTest-1-vol\"" plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" }]," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" VolumeId: \"vol-05abf79a8e6e6b4ea\"," plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg=" VolumeType: \"gp2\"" plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:48:45 dockerd[9408]: level=error msg="}\" name=CloudStorTest-1-vol operation=createNewEBS options=map[backing:relocatable ebstype:gp2 size:25] " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:50:50 dockerd[9408]: level=error msg="00a6b0526771de63fe2ee4fabe9141c11b465f1b3f62d0838a3e8ddc6e7a8c56 cleanup: failed to delete container from containerd: no such container"
Aug 13 16:52:51 dockerd[9408]: level=error msg="00a6b0526771de63fe2ee4fabe9141c11b465f1b3f62d0838a3e8ddc6e7a8c56 cleanup: failed to delete container from containerd: no such container"
Aug 13 16:54:52 dockerd[9408]: level=error msg="00a6b0526771de63fe2ee4fabe9141c11b465f1b3f62d0838a3e8ddc6e7a8c56 cleanup: failed to delete container from containerd: no such container"
Aug 13 16:56:53 dockerd[9408]: level=error msg="00a6b0526771de63fe2ee4fabe9141c11b465f1b3f62d0838a3e8ddc6e7a8c56 cleanup: failed to delete container from containerd: no such container"
Aug 13 16:58:51 dockerd[9408]: level=error msg="time=\"2018-08-13T20:58:51Z\" level=error msg=\"Failed to attach volume: Volume never attached to Instance\" name=CloudStorTest-1-vol operation=mountEBS " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:58:51 dockerd[9408]: level=error msg="time=\"2018-08-13T20:58:51Z\" level=error msg=\"error mounting volume: Volume never attached to Instance\" name=CloudStorTest-1-vol operation=mount " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:58:51 dockerd[9408]: level=error msg="time=\"2018-08-13T20:58:51Z\" level=error msg=\"failed to probe volume FS: failed to open device to probe ext4: open /dev/xvdf: no such file or directory\" name=CloudStorTest-1-vol operation=mountEBS " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:58:51 dockerd[9408]: level=error msg="time=\"2018-08-13T20:58:51Z\" level=error msg=\"error mounting volume: failed to open device to probe ext4: open /dev/xvdf: no such file or directory\" name=CloudStorTest-1-vol operation=mount " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:58:51 dockerd[9408]: level=error msg="time=\"2018-08-13T20:58:51Z\" level=error msg=\"failed to probe volume FS: failed to open device to probe ext4: open /dev/xvdf: no such file or directory\" name=CloudStorTest-1-vol operation=mountEBS " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:58:51 dockerd[9408]: level=error msg="time=\"2018-08-13T20:58:51Z\" level=error msg=\"error mounting volume: failed to open device to probe ext4: open /dev/xvdf: no such file or directory\" name=CloudStorTest-1-vol operation=mount " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:58:51 dockerd[9408]: level=error msg="time=\"2018-08-13T20:58:51Z\" level=error msg=\"failed to probe volume FS: failed to open device to probe ext4: open /dev/xvdf: no such file or directory\" name=CloudStorTest-1-vol operation=mountEBS " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
Aug 13 16:58:51 dockerd[9408]: level=error msg="time=\"2018-08-13T20:58:51Z\" level=error msg=\"error mounting volume: failed to open device to probe ext4: open /dev/xvdf: no such file or directory\" name=CloudStorTest-1-vol operation=mount " plugin=d281abf36c42252e979b4eccd5f3aa2c29175e89adf78ac930492d45f1a3efc9
The errors at the end may be side effects of me eventually pressing Ctrl+C to exit the docker service create command when it would fail.
@jwitko That's a different problem. This issue is about the root block device for the host itself, not EBS volumes created by Cloudstor. This specific issue is fixed. But, Cloudstor needs to be similarly updated to handle the different mount paths used by current generation instances.
I'm going to close this one. The Cloudstor EBS problem is covered by #157.
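For the Cloudstor case, the log above shows a lookup of /dev/xvdf, which does not exist on NVMe-based instances. One possible way to locate the attached volume, sketched here under the assumption that the NVMe device serial holds the EBS volume ID without its hyphen (as described in the AWS NVMe documentation linked earlier), is to scan the serial files under /sys/block. The helper name find_nvme_for_volume is hypothetical, and this is not how Cloudstor is implemented.

```shell
#!/bin/sh
# Sketch only: find the NVMe block device whose serial matches an EBS volume ID.
# Assumption: the serial reads like "vol05abf79a8e6e6b4ea" (volume ID, no hyphen),
# possibly padded with spaces. find_nvme_for_volume is a hypothetical helper.
find_nvme_for_volume() {
    volume_id="$1"                                    # e.g. vol-05abf79a8e6e6b4ea
    sys_block="${2:-/sys/block}"                      # overridable for dry runs
    wanted=$(printf '%s' "$volume_id" | tr -d '-')    # drop the hyphen
    for dir in "$sys_block"/nvme*; do
        [ -r "$dir/device/serial" ] || continue       # not an NVMe device entry
        serial=$(tr -d ' \n' < "$dir/device/serial")  # strip padding and newline
        if [ "$serial" = "$wanted" ]; then
            echo "/dev/${dir##*/}"
            return 0
        fi
    done
    return 1                                          # no matching device found
}
```

On a live instance this would be called as find_nvme_for_volume vol-05abf79a8e6e6b4ea, printing a path such as /dev/nvme1n1 if a match exists and returning a non-zero status otherwise.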
@kinghuang
Hi
Recently I changed my AWS instance types to r5.4xlarge and m5.2xlarge. After modifying the instance types, the / file system shows very high utilization, around 98% occupied. When I check the / file system, I don't see any files consuming that much space. The NVMe and ENA modules are installed and loaded. Can you please help me identify and fix the issue? Thanks
-Shyam
Expected behaviour
Docker for AWS uses Moby Linux AMIs. The Moby Linux 18.03 AMI (ami-ccf668b4) added ENA and NVMe support, making it usable with current generation EC2 instances (#128). However, there appears to be a problem working with the instance's backing EBS volume on current generation instances: the size of the root file system seen in the instance is dramatically smaller than the size of the underlying EBS volume.
As a baseline, here is what's shown for a t2.micro instance with a 20 GB EBS volume.
Actual behavior
A c5.large instance with a 48 GB EBS volume shows 1.8 GB.
An m5.large instance with a 20 GB EBS volume shows a size of 3.7 GB.
For some unknown reason, the full space of the root EBS volume isn't available. As a result, Docker nodes very quickly run out of disk space.
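The mismatch can also be checked mechanically. The following is a minimal sketch, assuming you know the provisioned volume size in GB; the helper name root_fs_truncated and the 80% threshold are arbitrary choices for illustration, leaving some room for filesystem overhead.

```shell
#!/bin/sh
# Sketch only: succeed (exit 0) when the filesystem is much smaller than
# the provisioned volume. root_fs_truncated is a hypothetical helper and
# the 80% cutoff is an assumption, not a documented limit.
root_fs_truncated() {
    size_kb="$1"        # filesystem size in 1K blocks, e.g. from df -Pk /
    expected_gb="$2"    # provisioned EBS volume size in GB
    # 80% of the expected size, expressed in 1K blocks.
    threshold_kb=$((expected_gb * 1024 * 1024 / 100 * 80))
    [ "$size_kb" -lt "$threshold_kb" ]
}

# Example use on a live instance with a 20 GB volume:
#   size_kb=$(df -Pk / | awk 'NR==2 {print $2}')
#   if root_fs_truncated "$size_kb" 20; then
#       echo "root filesystem is smaller than expected"
#   fi
```

With the figures reported in this thread, a 19.7G filesystem on a 20 GB volume passes the check, while the 3.7G filesystem seen on m5.large is flagged.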
Information
The current Docker for AWS template doesn't have entries for c5 and m5 instance types (#146). They can be added manually. For the purposes of this issue, it's more convenient to just directly launch EC2 instances with the Moby Linux 18.03 AMI and select c5/m5 instance types.
Steps to reproduce the behavior
From the EC2 Management Console:
shell-aws container).
Run df -h / and observe the size of the root filesystem.