Open ghost opened 4 years ago
+1
+1
ECS Fargate provides 3 times more memory (in the max configuration) than Lambda (30 GB vs 10 GB). This makes Fargate attractive for processing large datasets in-memory for performance and security reasons (especially if those datasets contain sensitive data).
tmpfs is a straightforward way to mount a memory-based filesystem at the OS level and expose it to userspace applications through their standard filesystem methods.
Please enable it for Fargate.
Amazon, seriously? Including the option of read-only root filesystems and then not implementing tmpfs is like selling cars without seatbelts. What on earth were you thinking?
This is maddening, because there's not even a workaround possible if you run any more than one instance of a container.
It's tempting to think an EFS share might be a suitable alternative mountable "temp" space, but the problem arises with the /tmp folder when you have a service that has say 4 tasks associated and added to a load balancer. Mapping a shared EFS /tmp folder across the 4 instances is a madman's choice given how often applications consider the space as their own private slop space.
Honestly, this is such a no-brainer requirement that "rootfs as read-only" without it is next to useless in all but trivial cases.
As @pspot2 said above, these containers aren't short on memory ... the smallest one you can get is 512MB. The overwhelming majority of containers we deploy use at most 128MB in practice. It's not like it should cost more to offer.
Follow up - there's not a doubt in my mind that the following is a god-awful hack that AWS should be embarrassed about putting forward when people ask for tmpfs, but here it is ... a workaround in case anyone else needs it.
These two basically have you bind-mounting a host volume into the container at the path where you want the tmpfs volume to be, and then declaring a second container, outside the scope of the main one, that runs busybox and changes the owner / perms on the volume before the main one launches so it's writable. The conditional dependsOn attribute is needed to make the containers launch in the right order so the perms are set on start.
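For anyone trying to picture the setup described above, here is a minimal sketch of what such a task definition could look like. It is not the commenter's original configuration. The names scratch, fix-perms and app, the example/app image, the UID 1000 and the cpu/memory sizes are all placeholders I made up for illustration. Note that, as far as I understand it, this yields writable scratch space backed by the task's ephemeral storage, not a memory-backed filesystem.

```shell
#!/bin/sh
# Sketch only: writes a hypothetical task definition for the
# "permissions sidecar" workaround. All names, images and IDs below
# are illustrative assumptions, not values from the thread.
write_taskdef() {
  cat > "$1" <<'EOF'
{
  "family": "readonly-rootfs-example",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "volumes": [{ "name": "scratch" }],
  "containerDefinitions": [
    {
      "name": "fix-perms",
      "image": "busybox",
      "essential": false,
      "command": ["sh", "-c", "chown 1000:1000 /scratch && chmod 0755 /scratch"],
      "mountPoints": [{ "sourceVolume": "scratch", "containerPath": "/scratch" }]
    },
    {
      "name": "app",
      "image": "example/app:latest",
      "essential": true,
      "readonlyRootFilesystem": true,
      "dependsOn": [{ "containerName": "fix-perms", "condition": "SUCCESS" }],
      "mountPoints": [{ "sourceVolume": "scratch", "containerPath": "/tmp" }]
    }
  ]
}
EOF
}

# Write the file to the current directory unless TASKDEF_OUT says otherwise.
write_taskdef "${TASKDEF_OUT:-taskdef.json}"
```

The resulting file could then be registered with aws ecs register-task-definition --cli-input-json file://taskdef.json. The sidecar is marked non-essential so that it can exit after fixing the permissions, and the SUCCESS condition keeps the main container from starting until it has done so.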
I feel dirty just writing this out, but hopefully it will help someone else who is waiting for AWS to offer tmpfs on fargate the way they certainly know they should.
@rickknowles-cognitant and everyone else who stumbles onto this issue: there is a better, different solution! It's possible to have this permission sidecar / container definitions business done within the Dockerfile rather than requiring an additional image.
By default, the volume permissions are set to 0755 and the owner as root. These permissions can be changed in the Dockerfile. In the following example, the owner of the /var/log/exported directory is set to node.
FROM public.ecr.aws/amazonlinux/amazonlinux:latest
RUN yum install -y shadow-utils && yum clean all
RUN useradd node
RUN mkdir -p /var/log/exported && chown node:node /var/log/exported
RUN touch /var/log/exported/examplefile
USER node
VOLUME ["/var/log/exported"]
Thanks to @jsclarridge, who found the snippet above in the AWS documentation: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/bind-mounts.html
We have a requirement to scan files for viruses with ClamAV. The files may contain sensitive data and cannot be written to disk. We would like to use Fargate, but since it still does not allow tmpfs, we currently see no easy solution. It would be great to have this in the near future.
Sharing an experimental workaround here; I am also consulting support on this. Our use case is a server-side render cache that we would like to be fast and to end with the container's lifecycle.
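Summary:
- There is a writable overlay at /tmp and shared memory at /dev/shm.
- We don't have a config that mounts these to a desired directory.
- A read-only filesystem also prevents us from running ln at runtime.
- Workaround: we can ln -s dead softlinks at build time into the container.
At docker build time (Dockerfile), create the links but do not actually write files. Build-time shm is not runtime shm, so data written to build-time shm is lost because it is not saved in the container layers.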
RUN ln -s /dev/shm/your/required/folder/structure <your-desired-tmp-dir>
At run time (docker-entrypoint.sh), copy files from the read-only filesystem into the desired /dev/shm subdirectory before running the app.
mkdir -p /dev/shm/your/required/folder/structure
cp -r "$HOME/your/file" /dev/shm/your/required/folder/structure/
Beware: this is not persistent storage. All these tmp files are gone when the container is stopped, and no recovery is possible.
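The run-time step above could be wrapped in a small entrypoint helper along these lines. This is only a sketch: SHM_ROOT and stage_into_shm are names I made up, and the paths in the example call are the same placeholders used above.

```shell
#!/bin/sh
# Sketch: stage files into memory-backed storage before starting the app.
# SHM_ROOT and stage_into_shm are illustrative names, not part of any AWS API.
set -eu

SHM_ROOT="${SHM_ROOT:-/dev/shm}"

# Copy a source file or directory into a subdirectory of SHM_ROOT.
# The subdirectory has to match the target of the build-time symlink.
stage_into_shm() {
  src="$1"
  subdir="$2"
  mkdir -p "$SHM_ROOT/$subdir"
  cp -r "$src" "$SHM_ROOT/$subdir/"
}

# Example call in an entrypoint (placeholder paths):
#   stage_into_shm "$HOME/your/file" "your/required/folder/structure"
#   exec "$@"
```

Making the root overridable lets you try the script locally against a scratch directory before pointing it at /dev/shm in the container.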
Dockerfile
FROM public.ecr.aws/docker/library/node:16-alpine
# Can use
# - /tmp: 30GB overlay
# - /dev/shm: all available memory to container
ENV TMPDIR=/tmp
WORKDIR /root
RUN ln -s ${TMPDIR} ./linked-tmp
RUN apk add --no-cache bash
COPY ./docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
docker-entrypoint.sh
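#!/bin/bash
set -euo pipefail
export FARGATE_TASK_ROOT=/root
export TMPDIR=/tmp
cd "$FARGATE_TASK_ROOT" || exit 1
# Show where we are and which filesystems are mounted
pwd
df -h
ls -la ./
# Confirm the build-time symlink is writable at run time
touch ./linked-tmp/asd
ls -l "./linked-tmp"
ls -l "$TMPDIR"
# Write 100 MB files until the target filesystem (or the memory limit) is exhausted
for i in {1..20000}; do
  dd if=/dev/urandom of="./linked-tmp/$i.dat" bs=10M count=10
  df -h "$TMPDIR"
done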
df -h shows two filesystems. The Size of /dev/shm can appear larger than the task/container specification; however, a cgroup limit possibly blocks writes beyond what is allocated to the container.
# sample output on 4vcpu 8GB config
Filesystem Size Used Available Use% Mounted on
overlay 29.4G 9.6G 18.3G 34% /
tmpfs 64.0M 0 64.0M 0% /dev
shm 14.9G 0 14.9G 0% /dev/shm
....
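One way to compare the size df reports with the limit actually enforced is to read the cgroup memory limit from inside the container. A minimal sketch, assuming the standard cgroup v2 or v1 file locations are visible in the container (I have not confirmed which of these Fargate exposes); read_mem_limit is a made-up helper name:

```shell
#!/bin/sh
# Sketch: print the first readable memory-limit file from a list of candidates.
# Whether these paths are exposed inside a Fargate container is an assumption.
read_mem_limit() {
  for f in "$@"; do
    if [ -r "$f" ]; then
      cat "$f"
      return 0
    fi
  done
  echo "unknown"
}

# cgroup v2 location first, then the cgroup v1 location.
read_mem_limit /sys/fs/cgroup/memory.max \
               /sys/fs/cgroup/memory/memory.limit_in_bytes

# Size reported for the shared-memory mount, for comparison.
df -h /dev/shm 2>/dev/null || true
```

If the printed limit is smaller than the Size column for /dev/shm, that would be consistent with writes being stopped before the filesystem looks full.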
Our target is to use the writable overlay at /tmp and the memdisk at /dev/shm, and both are indeed writable through the softlink created at build time.
# testing at /dev/shm: wrote 100MB files until 7.9GB and then got killed. This is expected, as I cannot write beyond what I paid for (8GB Memory in both task & service definition).
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
shm 14.9G 100.0M 14.8G 1% /dev/shm
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
shm 14.9G 200.0M 14.7G 1% /dev/shm
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
shm 14.9G 300.0M 14.6G 2% /dev/shm
...
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
shm 14.9G 7.7G 7.2G 52% /dev/shm
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
shm 14.9G 7.8G 7.1G 52% /dev/shm
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
shm 14.9G 7.9G 7.0G 53% /dev/shm
/usr/local/bin/docker-entrypoint.sh: line 16: 173 Killed dd if=/dev/urandom of=./linked-tmp/$i.dat bs=10M count=10
# testing at /tmp: wrote 100MB files all the way up to 30GB until the disk ran out of space
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
overlay 29.4G 9.6G 18.2G 34% /
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
overlay 29.4G 9.7G 18.2G 35% /
...
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
overlay 29.4G 29.2G 0 100% /
10+0 records in
10+0 records out
Filesystem Size Used Available Use% Mounted on
overlay 29.4G 29.3G 0 100% /
dd: error writing './linked-tmp/204.dat': No space left on device
Hi @rickknowles-cognitant, hi @kftsehk, I am also looking for tmpfs support. I found a part in the docs on supported ECS task definition parameters: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#container_definition_linuxparameters Here tmpfs is listed as a supported parameter. Isn't this what you are looking for? Or am I mistaken? Looking forward to your response.
@helloworld121 Please see the note in bold under the details of the tmpfs parameter:
If you're using tasks that use the Fargate launch type, the tmpfs parameter isn't supported.
It is not supported for Fargate launch type, which is exactly what is being requested in this issue.
TLDR summary:
- There is a writable overlay at /tmp and shared memory at /dev/shm.
- We don't have a config that mounts these to a desired directory.
- A read-only filesystem also prevents us from running ln at runtime.
- Workaround: we can ln -s dead softlinks at build time into the container.
Further analysis shows that what is not supported is mounting a tmpfs device to a custom path from the task definition. As tested, we can hardcode a softlink into the container that points to one of the default writable locations: /tmp (overlay) or /dev/shm (in-memory).
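The symlink behaviour this relies on can be tried outside Fargate. A small sketch using a scratch directory in place of the real paths (the names work, target and link are placeholders of mine): the link is created while its target does not exist yet, and it starts working as soon as the target directory is created.

```shell
#!/bin/sh
# Sketch: a dangling symlink becomes usable once its target exists.
set -eu

work="$(mktemp -d)"        # stands in for the image filesystem
target="$work/shm/cache"   # stands in for a subdirectory of /dev/shm
link="$work/app-cache"     # stands in for the path the app writes to

# "Build time": create the link while the target is still missing.
ln -s "$target" "$link"
[ -e "$link" ] || echo "link is dangling"

# "Run time": create the target, then write through the link.
mkdir -p "$target"
echo "rendered page" > "$link/page.html"
cat "$target/page.html"
```

On a read-only root filesystem only the second half can happen at run time, which is why the link itself has to be baked into the image.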
Tell us about your request
To write temporary data to a file in ReadonlyRootFilesystem mode, please support tmpfs on Fargate.
Which service(s) is this request for?
Fargate
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
When Apache enables the MPM module, the parent process ID (pid) is written to the pidfile. In read-only mode, it cannot write data to a file or create a new file, so we must use the tmpfs feature. However, since Fargate does not support this feature, Apache fails to start and outputs the following error message:
For security reasons, I do want to run the container in ReadonlyRootFilesystem mode. So, please support tmpfs.