openfaas / faas-swarm

OpenFaaS provider for Docker Swarm
https://github.com/openfaas/faas
MIT License

Unexpected status: 400, message: Invalid registry auth #43

Closed JohnOllhorn closed 5 years ago

JohnOllhorn commented 5 years ago

Hello, folks,

I'm sorry, but I have to reopen the post.

I use the current version of OpenFaaS together with Docker Swarm (multi-node) and a private Docker registry.

I've read all contributions to this bug, but I still get this message:

> faas-cli deploy -f stack.yml --gateway https://DOMAIN --network proxy --readonly --send-registry-auth
Deploying: SERVICENAME.

Unexpected status: 400, message: Invalid registry auth

Function 'SERVICENAME' failed to deploy with status code: 400

I am successfully logged in to both my registry and OpenFaaS.

Originally posted by @JohnOllhorn in https://github.com/openfaas/faas-swarm/issues/19#issuecomment-452078919

alexellis commented 5 years ago

Hi John,

Sorry to hear you are having problems with your Swarm authentication.

Can you give us the full instructions in this new issue on exactly how to reproduce your issue without skipping any details or steps?

Alex

cc @burtonr

Here's the issue template we ask for:

Expected Behaviour

Current Behaviour

Possible Solution

Steps to Reproduce (for bugs)

1.
2.
3.
4.

Context

Your Environment

alexellis commented 5 years ago

@burtonr I saw that you were involved in this issue originally. Please can you help out John when we get all the steps to repro?

burtonr commented 5 years ago

Sure thing! @JohnOllhorn I'll need some more information to help though. Could you fill out the template that Alex posted? I'll see if I'm able to reproduce this in my environment

JohnOllhorn commented 5 years ago

Expected Behaviour

Deploy a Node10 Express function, generated from a template, to a private Docker registry with Basic Auth.

Pull this image into OpenFaaS, which runs behind a Traefik reverse proxy.

Current Behaviour

$ faas-cli deploy -f stack.yml --gateway https://DOMAIN --network proxy --readonly --send-registry-auth
Deploying: SERVICENAME.

Unexpected status: 400, message: Invalid registry auth

Function 'SERVICENAME' failed to deploy with status code: 400

Possible Solution

I have no idea

Steps to Reproduce (for bugs)

Docker Login

docker login <PRIVATE-REGISTRY>:5000
Username: <USERNAME>
Password:
WARNING! Your password will be stored unencrypted in /home/XYZ/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

OpenFaas Login

faas-cli login --gateway https://<OPENFAAS-URL> -u <USERNAME> -p <PASSWORD>
WARNING! Using --password is insecure, consider using: cat ~/faas_pass.txt | faas-cli login -u user --password-stdin
Calling the OpenFaaS server to validate the credentials...
credentials saved for OpenFaasAdmin https://<OPENFAAS-URL>

Build

faas-cli build -f stack.yml --no-cache
Successfully built 17a40eae2043
Successfully tagged <PRIVATE-REGISTRY>:5000/stack:latest
Image: <PRIVATE-REGISTRY>:5000/stack:latest built.
[0] < Building stopwords done.
[0] worker done.

Push

faas-cli push -f stack.yml --tag latest
[0] > Pushing stopwords [<PRIVATE-REGISTRY>:5000/stack:latest].
The push refers to repository [<PRIVATE-REGISTRY>:5000/stack]
b797d662d8d1: Layer already exists
276cc36fd274: Layer already exists
559dd4ede191: Layer already exists
a3e65a263a7c: Layer already exists
16a58715fbd5: Layer already exists
1dfdd2956332: Layer already exists
ec3990d97ad4: Layer already exists
5d21e3cd8548: Layer already exists
b80b5348f47c: Layer already exists
a6a1da62c825: Layer already exists
89099b8fbac6: Layer already exists
cb18274b2dd1: Layer already exists
df64d3292fd6: Layer already exists
latest: digest: sha256:83a52709c9c63bf2e1a1e975cbe7429c01fede5ee2f30ff8d9885fade8034ff2 size: 3034
[0] < Pushing stopwords [<PRIVATE-REGISTRY>:5000/stack:latest] done.
[0] worker done.

Deploy (FAIL!)

faas-cli deploy -f stack.yml --gateway https://<OPENFAAS-URL> --network proxy --readonly --send-registry-auth
Deploying: stack.

Unexpected status: 400, message: Invalid registry auth

Function 'stack' failed to deploy with status code: 400

Context

Your Environment

* FaaS-CLI version ( Full output from: `faas-cli version` ):

  ___                   _____           ____
 / _ \ _ __   ___ _ __ |  ___|_ _  __ _/ ___|
| | | | '_ \ / _ \ '_ \| |_ / _` |/ _` \___ \
| |_| | |_) |  __/ | | |  _| (_| | (_| |___) |
 \___/| .__/ \___|_| |_|_|  \__,_|\__,_|____/
      |_|

CLI: commit: 5b36c50280d961d3aa248bd17f901f4ef2774447 version: 0.8.1


* Docker version `docker version` (e.g. Docker 17.0.05 ):
```bash
Client:
 Version:           18.09.0
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        4d60db4
 Built:             Wed Nov  7 00:49:01 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.0
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.4
  Git commit:       4d60db4
  Built:            Wed Nov  7 00:16:44 2018
  OS/Arch:          linux/amd64
  Experimental:     true
```

* Are you using Docker Swarm or Kubernetes (FaaS-netes)?

Docker Swarm

* Operating System and version (e.g. Linux, Windows, MacOS):

Linux Ubuntu 18.04

* Link to your project or a code example to reproduce issue:

alexellis commented 5 years ago

Please can you provide your config for Traefik and the registry?

alexellis commented 5 years ago

I think you may be missing the username on your image

<PRIVATE-REGISTRY>:5000/stack:latest

Should be:

<PRIVATE-REGISTRY>:5000/owner/stack:latest

Can you try as above and let us know?

Alex

JohnOllhorn commented 5 years ago

> I think you may be missing the username on your image
>
> <PRIVATE-REGISTRY>:5000/stack:latest
>
> Should be:
>
> <PRIVATE-REGISTRY>:5000/owner/stack:latest
>
> Can you try as above and let us know?
>
> Alex

Which username should I put there? The admin user of the private registry? Pushing the image to the registry works, but the pull by OpenFaaS does not (because of the auth, I think).

> Please can you provide your config for Traefik and the registry?

version: "3.7"
services:
  # The reverse proxy service (Traefik)
  traefik:
    image: <PRIVATE-REGISTRY>:5000/traefik:latest
    command: -c /dev/null --logLevel="INFO" --api --ping /
      --defaultentrypoints=http,https /
      --docker=true /
      --docker.watch=true /
      --docker.swarmmode=true /
      --docker.domain=<DOMAIN> /
      --acme=true /
      --acme.email=info@<DOMAIN> /
      --acme.domains='<DOMAIN>, api.<DOMAIN>, faas.<DOMAIN>' /
      --acme.storage=/etc/traefik/acme/acme.json /
      --acme.httpChallenge.entryPoint=http /
      --acme.entryPoint=https /
      --acme.onhostrule=true /
      --entryPoints='Name:http Address::80' /
      --entryPoints='Name:https Address::443 TLS'
    configs:
      - source: traefik.toml
        target: /etc/traefik/traefik.toml
    networks: ['proxy']
    ports:
      - 80:80
      - 443:443
      - 8080:8080
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - traefik_acme:/etc/traefik/acme
    deploy:
      placement:
        constraints: ['node.role == manager']

#####################################################################################

  gateway:
    image: <PRIVATE-REGISTRY>:5000/faas-gateway:latest
    environment:
      functions_provider_url:   "http://faas-swarm:8080/"
      read_timeout:             "300s"  # Maximum time to read HTTP request
      write_timeout:            "300s"  # Maximum time to write HTTP response
      upstream_timeout:         "300s"  # Maximum duration of upstream function call - should be more than read_timeout and write_timeout
      dnsrr:                    "true"  # Temporarily use dnsrr in place of VIP while issue persists on PWD
      faas_nats_address:        "nats"
      faas_nats_port:           4222
      direct_functions:         "true"  # Functions are invoked directly over the overlay network
      direct_functions_suffix:  ""
      basic_auth:               "true"
      secret_mount_path:        "/run/secrets/"
      scale_from_zero:          "true"
    networks: ['proxy']
    deploy:
      labels:
        traefik.enable:         "true"
        traefik.backend:        "faas"
        traefik.port:           8080
        traefik.frontend.rule:  "Host:faas.<DOMAIN>"
        traefik.docker.network: 'proxy'
      resources:
        # limits: # Enable if you want to limit memory usage
          # memory: 200M
        reservations:
          memory: 100M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 20
        window: 380s
      placement:
        constraints: ['node.platform.os == linux']
    secrets: ['basic-auth-user', 'basic-auth-password']

    # Docker Swarm provider
  faas-swarm:
    image: <PRIVATE-REGISTRY>:5000/faas-swarm:latest
    volumes: ['/var/run/docker.sock:/var/run/docker.sock']
    networks: ['proxy']
    environment:
      read_timeout:       "300s"  # set both here, and on your functions
      write_timeout:      "300s"  # set both here, and on your functions
      DOCKER_API_VERSION: "1.39"
      basic_auth:         "${BASIC_AUTH:-true}"
      secret_mount_path:  "/run/secrets/"
    deploy:
      labels: ['traefik.enable=false']
      placement:
        constraints: ['node.role == manager', 'node.platform.os == linux']
      resources:
        # limits: # Enable if you want to limit memory usage
          # memory: 100M
        reservations:
          memory: 100M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 20
        window: 380s
    secrets: ['basic-auth-user', 'basic-auth-password']

  nats:
    image: <PRIVATE-REGISTRY>:5000/nats-streaming:latest
    # Uncomment the following port mappings if you wish to expose the NATS client and/or management ports you must also add `-m 8222` to the command
    # ports:
      # - 4222:4222
      # - 8222:8222
    command: "--store memory --cluster_id faas-cluster"
    networks: ['proxy']
    deploy:
      labels: ['traefik.enable=false']
      resources:
        limits:
          memory: 125M
        reservations:
          memory: 50M
      placement:
        constraints: ['node.platform.os == linux']

  queue-worker:
    image: <PRIVATE-REGISTRY>:5000/faas-queue-worker:latest
    networks: ['proxy']
    environment:
      max_inflight:       "1"
      ack_wait:           "300s"  # Max duration of any async task / request
      basic_auth:         "${BASIC_AUTH:-true}"
      secret_mount_path:  "/run/secrets/"
    deploy:
      labels: ['traefik.enable=false']
      resources:
        limits:
          memory: 50M
        reservations:
          memory: 20M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 20
        window: 380s
      placement:
        constraints: ['node.platform.os == linux']
    secrets: ['basic-auth-user', 'basic-auth-password']

#####################################################################################

  prometheus:
    image: <PRIVATE-REGISTRY>:5000/prometheus:latest
    environment:
      no_proxy: 'gateway'
    configs:
      - source: prometheus.yml
        target: /etc/prometheus/prometheus.yml
      - source: alert.rules.yml
        target: /etc/prometheus/alert.rules.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      # - '--storage.tsdb.path /data'
    volumes: ['prometheus_data:/data']
    ports: ['9090:9090']
    networks: ['proxy']
    deploy:
      labels:
        traefik.enable:         'false'
        traefik.backend:        'prometheus'
        traefik.port:           9090
        traefik.frontend.rule:  'Host:prometheus.<DOMAIN>'
        traefik.docker.network: 'proxy'
      placement:
        constraints: ['node.role == manager', 'node.platform.os == linux']
      resources:
        limits:
          memory: 500M
        reservations:
          memory: 200M

  alertmanager:
    image: <PRIVATE-REGISTRY>:5000/alertmanager:latest
    environment:
      no_proxy: "gateway"
    command:
      - '--config.file=/alertmanager.yml'
      - '--storage.path=/alertmanager'
    networks: ['proxy']
    # Uncomment the following port mapping if you wish to expose the Prometheus Alertmanager UI.
    # ports:
      # - 9093:9093
    deploy:
      labels: ['traefik.enable=false']
      resources:
        limits:
          memory: 50M
        reservations:
          memory: 20M
      placement:
        constraints: ['node.role == manager', 'node.platform.os == linux']
    configs:
      - source: alertmanager.yml
        target: /alertmanager.yml

#####################################################################################

volumes:
  traefik_acme:
  prometheus_data:

configs:
  traefik.toml:
    file: ./traefik/traefik.toml
  prometheus.yml:
    file: ./prometheus/prometheus.yml
  alert.rules.yml:
    file: ./prometheus/alert.rules.yml
  alertmanager.yml:
    file: ./prometheus/alertmanager.yml

networks:
  proxy:
    name: proxy
    driver: overlay
    attachable: true
    labels: ['openfaas=true']

secrets:
  basic-auth-user:
    external: true
  basic-auth-password:
    external: true

docker run -d --name docker-registry --restart=always -p 5000:5000 \
  -v `pwd`/auth:/auth \
  -e "REGISTRY_AUTH=htpasswd" \
  -e "REGISTRY_AUTH_HTPASSWD_REALM=Registry Realm" \
  -e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
  -e REGISTRY_HTTP_ADDR=0.0.0.0:5000 \
  -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/fullchain.pem \
  -e REGISTRY_HTTP_TLS_KEY=/certs/privkey.pem \
  -v /certs:/certs \
  -v /var/lib/docker/registry:/var/lib/registry \
  registry:2

Docker Service

docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE                                              PORTS
sze6fhi4s7po        stack           replicated          0/1                 <PRIVATE-REGISTRY>:5000/stack:latest           

docker service update sze6fhi4s7po
sze6fhi4s7po
overall progress: 0 out of 1 tasks
1/1: No such image: <PRIVATE-REGISTRY>:5000/stack:latest

burtonr commented 5 years ago

@JohnOllhorn apologies for the delayed response. With that additional information, I can see what the trouble is.

If you're using a custom repository, you must supply a user/repository name. This is because official images (those without a user or repository name, e.g. ubuntu:latest) are hosted on Docker Hub. OpenFaaS does its best to figure out where you're trying to pull and push images from, but this is not one of the cases that is supported at the moment.

Here is the test file that we use for the registry auth to give you an idea of all the supported registry options:

    // custom repository with valid data
    testValidEncodedAuthConfig(t, "user", "password", "my.repository.com/user/imagename", "my.repository.com")
    testValidEncodedAuthConfig(t, "user", "password", "my.repository.com/user/imagename:v0.1", "my.repository.com")
    testValidEncodedAuthConfig(t, "user", "password", "my.repository.com/user/imagename:latest", "my.repository.com")
    testValidEncodedAuthConfig(t, "user", "weird:password:", "my.repository.com/user/imagename", "my.repository.com")
    testValidEncodedAuthConfig(t, "userWithNoPassword", "", "my.repository.com/user/imagename", "my.repository.com")
    testValidEncodedAuthConfig(t, "", "", "my.repository.com/user/imagename", "my.repository.com")

    // docker hub default repository
    testValidEncodedAuthConfig(t, "user", "password", "user/imagename", "docker.io")
    testValidEncodedAuthConfig(t, "user", "password", "user/imagename:v0.1", "docker.io")
    testValidEncodedAuthConfig(t, "user", "password", "user/imagename:latest", "docker.io")
    testValidEncodedAuthConfig(t, "user", "password", "docker.io/user/imagename", "docker.io")
    testValidEncodedAuthConfig(t, "user", "password", "docker.io/user/imagename:v0.1", "docker.io")
    testValidEncodedAuthConfig(t, "user", "password", "docker.io/user/imagename:latest", "docker.io")
    testValidEncodedAuthConfig(t, "", "", "docker.io/user/imagename", "docker.io")

func testValidEncodedAuthConfig(t *testing.T, user, password, imageName, expectedRegistryHost string) {
...

If I update the test config to your scenario, the test fails as it's looking to pull the image from docker.io.
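
To make the failure mode concrete, here is a minimal Go sketch of how a registry host could be inferred from an image name. It assumes a simple segment-counting heuristic that is consistent with the test cases listed above; it is not the faas-swarm source, and `inferRegistryHost` and `defaultRegistry` are names invented for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultRegistry is the host assumed when the image name carries no registry.
const defaultRegistry = "docker.io"

// inferRegistryHost is a hypothetical helper, not the faas-swarm source.
// It treats the first path segment as the registry host only when the
// reference has at least three segments (host/owner/name).
func inferRegistryHost(image string) string {
	parts := strings.Split(image, "/")
	if len(parts) >= 3 {
		return parts[0]
	}
	return defaultRegistry
}

func main() {
	images := []string{
		"my.repository.com/user/imagename:v0.1",
		"user/imagename:latest",
		"my.repository.com:5000/stack:latest",
		"my.repository.com:5000/owner/stack:latest",
	}
	for _, image := range images {
		fmt.Printf("%-45s -> %s\n", image, inferRegistryHost(image))
	}
}
```

Under this assumption, a two-segment name such as `my.repository.com:5000/stack:latest` is indistinguishable from `user/imagename` and falls back to docker.io, whereas adding an owner segment lets the private host be picked up.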

Additionally, here is the original PR comment where figuring out the path is more clearly shown: https://github.com/openfaas/faas-swarm/pull/21#pullrequestreview-118506096

I also use a private repository, and I always add my username to the image name so that it follows the standard form {repository URL}/{username}/{image name}:{version}.

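Your private repo should support this syntax. You can re-tag and re-push your image with the following commands (john is a placeholder username), and then update the image name in stack.yml to match:

docker tag <PRIVATE-REGISTRY>:5000/stack:latest <PRIVATE-REGISTRY>:5000/john/stack:latest
docker push <PRIVATE-REGISTRY>:5000/john/stack:latest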

I'd be willing to put in some time to add this additional use case if you find this solution inconvenient or incorrect, and the project members agree that it should be added.

JohnOllhorn commented 5 years ago

Deploying: stack.

Deployed. 202 Accepted.
URL: https://<DOMAIN>/function/stack

Thank you many, many times! And no, you don't need to change anything about your excellent work because of my stupidity!

burtonr commented 5 years ago

Awesome news! Glad it's working for you.

It's not stupidity, but rather a lack of documentation or features. I'll add something to the documentation (once I think of where it should go) so that the next person doesn't have the same issue.

burtonr commented 5 years ago

Derek close: resolved

alexellis commented 5 years ago

Glad this is resolved! Was my suggestion correct here about using the owner/username prefix?

Cheers,

Alex

burtonr commented 5 years ago

@alexellis That's right. With private repositories, a username is required for the image name.
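
As a sketch of that rule, the Go snippet below inserts an owner segment into an image name that has only a registry host and an image name. It assumes Docker's usual convention that a first segment containing `.` or `:` (or equal to `localhost`) is a registry host; `withOwner`, `looksLikeRegistryHost` and the host `registry.example.com:5000` are hypothetical names used only for illustration and are not part of OpenFaaS.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeRegistryHost applies the common Docker convention: a first
// segment with a dot or a port, or the literal "localhost", names a registry.
func looksLikeRegistryHost(segment string) bool {
	return strings.ContainsAny(segment, ".:") || segment == "localhost"
}

// withOwner is a hypothetical helper. It rewrites "host/name:tag" to
// "host/owner/name:tag" and leaves every other form unchanged.
func withOwner(image, owner string) string {
	parts := strings.Split(image, "/")
	if len(parts) == 2 && looksLikeRegistryHost(parts[0]) {
		return parts[0] + "/" + owner + "/" + parts[1]
	}
	return image
}

func main() {
	fmt.Println(withOwner("registry.example.com:5000/stack:latest", "john"))
	fmt.Println(withOwner("user/imagename:latest", "john"))
}
```

Names that already contain an owner, and Docker Hub names such as `user/imagename`, pass through untouched, so a check like this could run before every deploy without changing working image names.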

alexellis commented 5 years ago

Do we have anything we can update in the docs or troubleshooting guide?