hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

[feature] allow host_volumes to be mounted using overlayfs #6627

Open shantanugadgil opened 4 years ago

shantanugadgil commented 4 years ago

Nomad version

Nomad v0.10.1 (829f9af35c77d564b3dab74454eeba9bf25e2df8)

Operating system and Environment details

CentOS 7/8, Ubuntu 16.04/18.04

Issue

This is a feature request.

Reproduction steps

This is a feature request. Currently, bind mounts are supported for exec, etc. The request is for overlayfs support so that even raw_exec can use it.

Job file (if appropriate)

The following are the config parameters I have experimented with. I have made up some fields that could make sense in an OverlayFS context.

Ref: https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt for terminology 'lowerdir', 'upperdir', 'workdir', 'merged'.

In the agent config:

client {
  enabled = true

  host_volume "repo_host" {
    path      = "/repo"
    read_only = true
  }

...

In the job definition:


group "work" {
    volume "repo_group" {
      source    = "repo_host"
      type      = "host"
      read_only = true
    }

    task "builder1" {
      driver = "raw_exec"

      volume_mount {
        # the "volume" would essentially be the "lowerdir" of the OverlayFS mount
        volume      = "repo_group"
        type        = "overlayfs"  # proposed

        # the upper, work and merged directories could be created automatically
        # "inside" the 'destination' directory.
        # example: /my_dir/upper, /my_dir/work, /my_dir/merged
        destination = "/my_dir"
      }
...

With the above setup, the "raw_exec" task can use the "merged" directory (/my_dir/merged) as a COW location to do temporary writes.

Example: the lowerdir (/repo on the host) should eventually get mounted at /my_dir/merged on the host itself.
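
To make the proposed layout concrete, here is a rough sketch in Python of how the 'destination' could map onto the overlayfs directories and the resulting mount invocation. The function names are hypothetical and nothing here is an existing Nomad API; the mount syntax follows the kernel overlayfs documentation referenced above. Per that documentation, 'workdir' must be an empty directory on the same filesystem as 'upperdir'.

```python
from pathlib import PurePosixPath


def overlay_layout(lowerdir: str, destination: str) -> dict:
    """Derive the overlayfs directories from the proposed 'destination'.

    Hypothetical helper: the upper/work/merged names mirror the example
    in the job definition above, they are not defined by Nomad.
    """
    dest = PurePosixPath(destination)
    return {
        "lowerdir": str(PurePosixPath(lowerdir)),
        "upperdir": str(dest / "upper"),
        "workdir": str(dest / "work"),
        "merged": str(dest / "merged"),
    }


def overlay_mount_command(lowerdir: str, destination: str) -> list:
    """Build (but do not run) the mount command for the overlay."""
    layout = overlay_layout(lowerdir, destination)
    options = "lowerdir={lowerdir},upperdir={upperdir},workdir={workdir}".format(**layout)
    return ["mount", "-t", "overlay", "overlay", "-o", options, layout["merged"]]


if __name__ == "__main__":
    # prints: mount -t overlay overlay -o lowerdir=/repo,upperdir=/my_dir/upper,workdir=/my_dir/work /my_dir/merged
    print(" ".join(overlay_mount_command("/repo", "/my_dir")))
```

Actually running the command needs root (or an equivalent capability) on the host, which is why the sketch only builds the argument list.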

The use case for the above is as follows:

"/repo" is a large source tree (several gigabytes) which cannot easily be "cloned" into every build task, as that would take too much time and disk space.

Multiple tasks will be running, and each would like to do a git pull before it starts building from the source code.

Hence, the directory is mounted via overlayfs, each build task does its own "COW" 'git pull' to get the latest code and builds from there, and then the overlayfs mount can be removed.

FWIW, some other job/entity will ensure that "/repo" itself is updated periodically (say, once a week) to keep each task's pull small.
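
The per-task lifecycle described above (mount, git pull, build, unmount) could be sketched as follows. This is only an illustration under assumptions: `overlay_workspace` and `run_build` are hypothetical names, the build command is a placeholder, and the `run` parameter exists so the flow can be exercised without root by passing a stand-in for `subprocess.check_call`.

```python
import subprocess
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def overlay_workspace(lowerdir, destination, run=subprocess.check_call):
    """Mount an overlay of `lowerdir` under `destination`, unmount on exit."""
    dest = Path(destination)
    upper, work, merged = dest / "upper", dest / "work", dest / "merged"
    for d in (upper, work, merged):
        d.mkdir(parents=True, exist_ok=True)
    options = f"lowerdir={lowerdir},upperdir={upper},workdir={work}"
    run(["mount", "-t", "overlay", "overlay", "-o", options, str(merged)])
    try:
        yield merged
    finally:
        # Always remove the mount, even if the pull or the build failed.
        run(["umount", str(merged)])


def run_build(lowerdir, destination, build_cmd, run=subprocess.check_call):
    """Pull and build inside the copy-on-write view of the repository."""
    with overlay_workspace(lowerdir, destination, run=run) as merged:
        # Writes from the pull and the build land in 'upper';
        # the shared lowerdir (/repo) is left untouched.
        run(["git", "-C", str(merged), "pull"])
        run(build_cmd, cwd=str(merged))
```

For example, `run_build("/repo", "/my_dir", ["make"])` would build from /my_dir/merged. Whether 'upper' and 'work' are cleaned up afterwards is left open here.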

Thoughts?

Regards, Shantanu Gadgil

possibly related: https://github.com/hashicorp/nomad/issues/1546 https://github.com/hashicorp/nomad/issues/2355

shantanugadgil commented 4 years ago

After the Nomad Community Office Hours: @eveld @shoenig @drewbailey (the live chat ate my question around this, I think)

My use case is a simple build pipeline using a COW directory per dockerized build task.

Currently I make do with shell commands to set up the mounts.

shantanugadgil commented 4 years ago

Cross-referencing https://github.com/hashicorp/nomad/issues/8262