hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Enable / document process for accessing insecure (non-https) Docker registries from Nomad #23616

Closed jedd closed 4 months ago

jedd commented 4 months ago

Proposal

Enable non-TLS (plain http) access to local / private docker registries, or, if the capability already exists, document the process better.

Use-cases

Within local & dev environments, it's convenient to access docker registries without setting up SSL infrastructure around those endpoints.

I believe this used to be possible in the 0.5.x era, prior to the removal of the 'ssl: false' capability in the docker driver stanza.

Attempted Solutions

Various approaches, including specifying no URI scheme at all, and attempting to force an http: scheme, on the image = parameter.

This was combined with a docker daemon configuration that includes my non-TLS endpoints, e.g.:

  "insecure-registries" : ["my.internal.host:5000"]

and with Nomad configuration for the docker driver replicating / confirming this:

client {
  enabled = true

  options {
    "docker.volumes.enabled" = "true"
    "docker.insecure_registries" = "my.internal.host:5000"
    "docker.tls.disabled"  = "true"

   ...

I note the current documentation:
https://developer.hashicorp.com/nomad/docs/drivers/docker#SSL says this flag has been deprecated since 0.5.3, but gives no indication of a workaround for that deprecation.

Tracking back to the bug / PR described, https://github.com/hashicorp/nomad/pull/1336 does make it look like the functionality was removed, rather than (as described) 'default was changed'.

tgross commented 4 months ago

Hi @jedd! My understanding is that the insecure-registries option in your dockerd configuration should be sufficient and there's no Nomad configuration needed at this point. Nomad sends the request to the Docker API and it's up to dockerd from there. The task driver even trims the https:// prefix off the image name if that's provided in the jobspec. What are the client logs you're seeing around this error?

In any case, the docs very much need updating here.
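
To make the prefix trimming mentioned above concrete, here is a minimal Python sketch. It is not Nomad's actual code (the task driver is written in Go), and the function name `strip_scheme` is hypothetical. It only handles the `https://` prefix, since that is the only case described in this thread.

```python
def strip_scheme(image: str) -> str:
    """Return the image reference without a leading https:// prefix.

    Illustration only: this mirrors the behaviour described in the thread,
    not the task driver's real implementation.
    """
    prefix = "https://"
    if image.startswith(prefix):
        return image[len(prefix):]
    return image


# A jobspec value with a scheme ends up as a plain image reference.
print(strip_scheme("https://my.internal.host:5000/busybox:1"))
# prints: my.internal.host:5000/busybox:1
```

The practical consequence is that the scheme in the image name carries no meaning: whether the pull uses http or https is decided by dockerd, not by the jobspec.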

jedd commented 4 months ago

Hey @tgross - thank you for a speedy response.

My humble apologies, as the problem seems to have been self-inflicted, through a misconfiguration of my docker configuration (json). For context, and in case it helps with documentation updates, I'll describe it.

I am using 1.8.1, and have this as my image variable in my Nomad job: "dg-pan-01.int.jeddi.org:5001/pdc-agent:0.0.30" - this is the 'registry-ui' service, which fronts my cncf 'distribution' registry (running on port 5000).

I'd been chopping and changing between the two endpoints as part of this build out.

I was getting the error:

Jul 19 11:31:38 dg-hac-01 nomad[281967]:     2024-07-19T11:31:38.528+1000 [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=0ce13148-a857-8915-8a63-494053ba3470 task=grafana-pdc error="Failed to pull `dg-pan-01.int.jeddi.org:5001/pdc-agent:0.0.30`: API error (500): Get \"https://dg-pan-01.int.jeddi.org:5001/v2/\": http: server gave HTTP response to HTTPS client"

But after your assurance that this should still work, and notably that it should exclusively be a docker configuration aspect, I revisited the docker configuration on my clients and noted I had just one endpoint specified:

  "insecure-registries" : ["dg-pan-01.int.jeddi.org:5000"]

I've modified that to include both the Distribution registry AND the registry web UI:

  "insecure-registries" : ["dg-pan-01.int.jeddi.org:5000", "dg-pan-01.int.jeddi.org:5001"]

... and now my jobs are able to pull images locally over http.
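
The mistake above (port 5001 missing from the list) can be checked mechanically. The following is a rough Python sketch under stated assumptions: the helper names `registry_of` and `allowed_insecure` are hypothetical, and the check is an exact host:port string match only. dockerd also accepts CIDR entries in insecure-registries, which this sketch does not handle.

```python
import json


def registry_of(image: str) -> str:
    """Return the registry host[:port] part of an image reference.

    Uses Docker's usual convention: the first path component is a registry
    only if it contains '.' or ':' or equals 'localhost'; otherwise the
    image is assumed to live on Docker Hub.
    """
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def allowed_insecure(image: str, daemon_json: str) -> bool:
    """True if the image's registry is listed verbatim in insecure-registries."""
    config = json.loads(daemon_json)
    return registry_of(image) in config.get("insecure-registries", [])


image = "dg-pan-01.int.jeddi.org:5001/pdc-agent:0.0.30"
before = '{"insecure-registries": ["dg-pan-01.int.jeddi.org:5000"]}'
after = ('{"insecure-registries": ["dg-pan-01.int.jeddi.org:5000", '
         '"dg-pan-01.int.jeddi.org:5001"]}')

print(allowed_insecure(image, before))  # False: only port 5000 is listed
print(allowed_insecure(image, after))   # True: port 5001 is now listed
```

Each host:port pair counts as a separate registry, so a web UI or proxy on a different port needs its own entry even when it fronts the same backing registry.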

I did have both in my Nomad client.hcl configuration - but clearly mucked up on replicating that additional port out to the docker config.

If these are not needed (Nomad should defer exclusively to docker daemon's insecure-registries configuration, you're saying?) then I'll look to strip them out of Nomad config.

client {
  enabled = true

  options {
    "docker.volumes.enabled" = "true"
    "docker.insecure_registries" = "dg-pan-01.int.jeddi.org:5000,dg-pan-01.int.jeddi.org:5001"
    "docker.tls.disabled"  = "true"
  }
}

tgross commented 4 months ago

Great to hear. I'll follow up with an update to the documentation.

tgross commented 4 months ago

I've confirmed the behavior is as expected and will update the docs next. On my host nomad0.local, I ran a registry container and pushed an image to it:

$ docker run -d -p 5000:5000 --name registry registry:2
$ docker image tag busybox:1 nomad0.local:5000/busybox:1
$ docker push nomad0.local:5000/busybox:1

My /etc/docker/daemon.json is:

{
  "insecure-registries": ["nomad0.local:5000"]
}

And I was able to run the following job with no further Nomad client configuration:

job "example" {

  group "group" {

    task "task" {

      driver = "docker"

      config {
        image   = "nomad0.local:5000/busybox:1"
        command = "httpd"
        args    = ["-vv", "-f", "-p", "8001", "-h", "/local"]
      }

      resources {
        cpu    = 50
        memory = 50
      }

    }
  }
}