Closed: jonasbardino closed this issue 4 months ago.
In one of our development systems we've set up a 256MB tmpfs scratch space, defined in `/etc/fstab` as:

```
tmpfs /storage-mem/mig_system_run tmpfs nosuid,nodev,noatime,noexec,uid=1000,gid=1000,mode=0770,size=256m 0 0
```
Then in the active `docker-compose.yml` each container just links that location into place for simple shared use, to significantly speed up operations on the caches:

```yaml
volumes:
  [...]
  # NOTE: mig_system_run is a shared volatile cache using host tmpfs
  - /storage-mem/mig_system_run:/home/mig/state/mig_system_run
```
The 256MB size was chosen a bit arbitrarily as an example. We see actual data sizes of maybe a few tens of megabytes even on production systems, so one can probably pick a size as small as `64m` if memory is very scarce, or leave out the `size` argument completely to let the system use the default percentage.
The `uid` and `gid` values match the default setup, and just need to be adjusted to fit if one uses different values in docker-migrid, so that all container services can read and write there.
We can add a note about the tmpfs setup and corresponding commented-out volume lines in `docker-compose_production.yml` to ease use, but all deployments will need to act to enable it correctly.
Some thoughts on that:

- The mounting of a tmpfs should be done in Ansible (or whatever admins use to deploy the environment of Migrid). Bjarke and I already talked about how that might be possible.
- The directory which holds the cache data on the host should be configurable and documented in docker-migrid. Something like `MIG_SYSTEM_RUN` maybe?
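Until something like that lands in Ansible, the manual equivalent on the host would be roughly the following sketch (root required; mount point and options taken from the fstab example above, so adjust to your setup):

```shell
# One-off tmpfs mount matching the fstab example above; an Ansible mount
# task could apply the same settings declaratively on deployment
sudo mkdir -p /storage-mem/mig_system_run
sudo mount -t tmpfs \
    -o nosuid,nodev,noatime,noexec,uid=1000,gid=1000,mode=0770,size=256m \
    tmpfs /storage-mem/mig_system_run
```

Adding the matching `/etc/fstab` line on top makes the mount survive reboots.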
So the volume mount might look like:

```yaml
volumes:
  [...]
  # NOTE: mig_system_run is a shared volatile cache using host tmpfs
  - ${MIG_SYSTEM_RUN}:/home/mig/state/mig_system_run
```
The default could also be something like `/tmp/mig_system_run` to enable the volatile behaviour by default.
Some distros even mount /tmp as a tmpfs iirc.
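Whether a given distro does mount `/tmp` in memory is easy to check; a quick sketch using plain coreutils (nothing migrid-specific assumed):

```shell
# Print the filesystem type backing /tmp; distros that keep /tmp in
# memory by default will report "tmpfs" here
fstype=$(stat -f -c %T /tmp)
echo "/tmp is backed by: $fstype"
```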
Could you explain a bit more what you mean by the status markers needing to be shared between services? We have the issue that we have multiple physical hosts running different Lustre services. Does the mig_system_run folder need to be shared between them?
Sorry, I never got back to you on this one - and thanks for your valuable input, btw :+1:
Sure, it would be better to have it exposed as a variable and then handled where most appropriate on each site. I only started looking into basically integrating it hard-coded in our docker-compose files before your comment, but got sidetracked by other more urgent matters and vacation so it's still only half done :-/
For a few things like account expiry and suspension to fully kick in, the `mig_system_run/` contents currently need to be shared or synchronized between all migrid containers for one site. Which in practice of course makes it tricky to use a local tmpfs if the individual migrid-X containers are distributed across multiple VMs/hosts. If you have the `mig_system_run` directory on Lustre you would still want to mount the same Lustre location into each container or run some frequent synchronization on top. Hope that answers your question.
Implemented as suggested, with a new `MIG_SYSTEM_RUN` variable to define which directory should be bound to the internal `state/mig_system_run`. It is documented in the variables doc, and instructions for setting up and using a tmpfs mount for it are included with the variable in the provided `.env` files.
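For reference, the compose-style defaulting of the new variable can be mimicked in plain shell. A sketch assuming the `/tmp/mig_system_run` default discussed above (the exact default in the shipped `.env` files may differ):

```shell
# Fall back to the volatile default when MIG_SYSTEM_RUN is unset,
# mirroring a ${MIG_SYSTEM_RUN:-/tmp/mig_system_run} compose expansion
MIG_SYSTEM_RUN="${MIG_SYSTEM_RUN:-/tmp/mig_system_run}"
mkdir -p "$MIG_SYSTEM_RUN"
echo "mig_system_run bound from: $MIG_SYSTEM_RUN"
```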
ATTENTION: integrators should make adjustments accordingly, at least on production sites.
Various internals rely on the `mig_system_run` folder for storing volatile information like caches, session tracking and status markers. On production systems we use a fast scratch space in tmpfs for this purpose to speed up operations on those files. The contents are automatically re-generated on use and do not require persistence across restarts, so in-memory storage fits well. A fast flash-based storage could be another option depending on memory and storage availability. For e.g. the status markers to work between services the storage needs to be shared, however. Otherwise things like account suspension and expiry will not transparently take effect in all containers.
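To illustrate why the sharing matters, here is a toy sketch of the marker pattern (file and user names are invented for illustration, not actual migrid internals):

```shell
RUN_DIR=$(mktemp -d)                 # stands in for the shared mig_system_run
touch "$RUN_DIR/user123.suspended"   # one service records a suspension
# another service, sharing the same mount, sees it on its next check
if [ -e "$RUN_DIR/user123.suspended" ]; then
    echo "user123 suspended everywhere"
fi
```

With a per-host tmpfs that is not shared, only the container that wrote the marker would see it.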
In docker-migrid a similar fast scratch space should be integrated to improve performance. It cannot be completely automated, because it requires the host to provide a suitable location and point the containers to use it.