Closed jedd closed 4 months ago
Hi @jedd! My understanding is that the insecure-registries option in your dockerd configuration should be sufficient and there's no Nomad configuration needed at this point. Nomad sends the request to the Docker API and it's up to dockerd from there. The task driver even trims the https:// prefix off the image name if that's provided in the jobspec. What are the client logs you're seeing around this error?
In any case, the docs very much need updating here.
Hey @tgross - thank you for a speedy response.
My humble apologies, as the problem seems to have been self-inflicted, through a misconfiguration of my Docker configuration (JSON). For context, and in case it helps with documentation updates, I'll describe what happened.
I am using 1.8.1, and have this as my image variable in my Nomad job: "dg-pan-01.int.jeddi.org:5001/pdc-agent:0.0.30"
Port 5001 is the 'registry-ui' service, which fronts my CNCF 'distribution' registry (running on port 5000).
I'd been chopping and changing between the two endpoints as part of this build-out.
I was getting the error:
Jul 19 11:31:38 dg-hac-01 nomad[281967]: 2024-07-19T11:31:38.528+1000 [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=0ce13148-a857-8915-8a63-494053ba3470 task=grafana-pdc error="Failed to pull `dg-pan-01.int.jeddi.org:5001/pdc-agent:0.0.30`: API error (500): Get \"https://dg-pan-01.int.jeddi.org:5001/v2/\": http: server gave HTTP response to HTTPS client"
But after your assurance that this should still work, and notably that it should exclusively be a docker configuration aspect, I revisited the docker configuration on my clients and noted I had just one endpoint specified:
"insecure-registries" : ["dg-pan-01.int.jeddi.org:5000"]
I've modified that to include both Distribution registry AND the registry web ui:
"insecure-registries" : ["dg-pan-01.int.jeddi.org:5000", "dg-pan-01.int.jeddi.org:5001"]
... and now my jobs are able to pull images locally over http.
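The fix above comes down to each registry host:port used in an image name having its own entry in insecure-registries. A minimal sketch of a check for that, assuming exact host:port matching only (Docker also accepts CIDR entries, which this ignores) and assuming every private registry is served over plain HTTP. The function names registry_of and missing_insecure_registries are hypothetical helpers, not part of Nomad or Docker:

```python
import json


def registry_of(image):
    """Return the registry host[:port] portion of an image reference."""
    # Drop a URL scheme if one was written into the jobspec image name.
    for scheme in ("https://", "http://"):
        if image.startswith(scheme):
            image = image[len(scheme):]
    first, sep, _rest = image.partition("/")
    # The first path component is a registry only if it looks like a host
    # (contains a dot or a port, or is "localhost"); otherwise Docker Hub.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def missing_insecure_registries(images, daemon_json_text):
    """List registries used by images that daemon.json does not allow over HTTP."""
    config = json.loads(daemon_json_text)
    allowed = set(config.get("insecure-registries", []))
    wanted = {registry_of(image) for image in images} - {"docker.io"}
    return sorted(wanted - allowed)
```

Run against the original single-entry daemon.json and the image from the failing job, this reports dg-pan-01.int.jeddi.org:5001 as missing, which matches the endpoint that was left out.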
I did have both in my Nomad's client.hcl configuration - but clearly mucked up on replicating that additional port out to the docker config.
If these are not needed (Nomad should defer exclusively to docker daemon's insecure-registries configuration, you're saying?) then I'll look to strip them out of Nomad config.
client {
  enabled = true
  options {
    "docker.volumes.enabled"     = "true"
    "docker.insecure_registries" = "dg-pan-01.int.jeddi.org:5000,dg-pan-01.int.jeddi.org:5001"
    "docker.tls.disabled"        = "true"
  }
}
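Until those options are stripped out, the two lists can drift apart, which is what happened here. A small sketch that compares them, assuming the Nomad option is a comma-separated string as shown above and the Docker side is the insecure-registries array in daemon.json. The function name registries_missing_from_docker is hypothetical:

```python
import json


def registries_missing_from_docker(nomad_option, daemon_json_text):
    """Return registries listed in the Nomad option but absent from daemon.json."""
    # Split the comma-separated Nomad value, ignoring blanks and stray spaces.
    nomad_side = {entry.strip() for entry in nomad_option.split(",") if entry.strip()}
    docker_side = set(json.loads(daemon_json_text).get("insecure-registries", []))
    return sorted(nomad_side - docker_side)
```

Since dockerd is what actually performs the pull, any entry this returns is one that would still fail with the HTTP/HTTPS error regardless of what the Nomad client configuration says.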
Great to hear. I'll follow-up with an update to the documentation.
I've confirmed the behavior is as expected and will update the docs next. On my host nomad0.local, I ran a registry container and pushed an image to it:
$ docker run -d -p 5000:5000 --name registry registry:2
$ docker image tag busybox:1 nomad0.local:5000/busybox:1
$ docker push nomad0.local:5000/busybox:1
My /etc/docker/daemon.json is:
{
  "insecure-registries": ["nomad0.local:5000"]
}
And I was able to run the following job with no further Nomad client configuration:
job "example" {
  group "group" {
    task "task" {
      driver = "docker"
      config {
        image   = "nomad0.local:5000/busybox:1"
        command = "httpd"
        args    = ["-vv", "-f", "-p", "8001", "-h", "/local"]
      }
      resources {
        cpu    = 50
        memory = 50
      }
    }
  }
}
Proposal
Enable non-TLS (plain HTTP) access to local / private Docker registries, or better document the process if the capability already exists.
Use-cases
Within local & dev environments, it's convenient to access docker registries without setting up SSL infrastructure around those endpoints.
This used to be possible, I believe, in the 0.5.x era, prior to the removal of the 'ssl: false' capability in the docker driver stanza.
Attempted Solutions
Varying approaches, including specifying no URI scheme, and attempting to force the http: scheme, on the image = parameter. This was combined with Docker configuration including my non-TLS endpoints, and with Nomad configuration for the docker driver replicating / confirming this.
I note the current documentation at https://developer.hashicorp.com/nomad/docs/drivers/docker#SSL says this flag is deprecated since 0.5.3, but gives no indication of a workaround for that deprecation.
Tracking back to the bug / PR described there, https://github.com/hashicorp/nomad/pull/1336 does make it look like the functionality was removed, rather than (as described) the 'default was changed'.