SUSE / Portus

Authorization service and frontend for Docker registry (v2)
http://port.us.org/
Apache License 2.0

"Insufficient scope" when trying to upload image #2239

Closed: Kulturserver closed this issue 4 years ago

Kulturserver commented 5 years ago

We are unable to upload an image in Version 2.5.0-dev@a1b9f2ebfeb84680a9dcd5629195e4c52815735c and get the message "error authorizing context: insufficient scope" - changing the registry according to the issues https://github.com/SUSE/Portus/issues/1736 or https://github.com/SUSE/Portus/issues/2216 did unfortunately not solve the problem. If anybody has another idea, help would be highly appreciated. Thanks!

172.19.0.6 - - [09/Oct/2019:08:24:25 +0000] "POST /v2/sa/php7.3-nginx/blobs/uploads/ HTTP/1.0" 401 232 "" "docker/19.03.0-beta2 go/go1.12.4 git-commit/c601560 kernel/4.19.0-5-amd64 os/linux arch/amd64 UpstreamClient(Docker-Client/19.03.0-beta2 (linux))"
time="2019-10-09T08:24:25Z" level=warning msg="error authorizing context: insufficient scope" go.version=go1.7.6 http.request.host=hub.culturebase.org http.request.id=55a0f7a2-5a7c-4fd9-88d4-6ad3242dbd68 http.request.method=POST http.request.remoteaddr=87.123.130.120 http.request.uri="/v2/sa/php7.3-nginx/blobs/uploads/ " http.request.useragent="docker/19.03.0-beta2 go/go1.12.4 git-commit/c601560 kernel/4.19.0-5-amd64 os/linux arch/amd64 UpstreamClient(Docker-Client/19.03.0-beta2 (linux))" instance.id=128b2192-fd36-417eacb9-0cbb3d5ed356 vars.name="sa/php7.3-nginx" version=v2.6.2-14-ga66a4c3
172.19.0.6 - - [09/Oct/2019:08:24:25 +0000] "POST /v2/sa/php7.3-nginx/blobs/uploads/ HTTP/1.0" 401 232 "" "docker/19.03.0-beta2 go/go1.12.4 git-commit/c601560 kernel/4.19.0-5-amd64 os/linux arch/amd64 UpstreamClient(Docker-Client/19.03.0-beta2 (linux))"

Kulturserver commented 5 years ago

Unfortunately, we haven't managed to solve this problem yet. It seems that the registry isn't able to contact the Portus instance. It would be great if anybody could suggest how we could trace this behaviour in detail. Thanks!

Kulturserver commented 5 years ago

Sorry to keep insisting, but we still aren't able to get beyond this error. Any help would be appreciated. Thanks!

mssola commented 5 years ago

It must be a configuration error somewhere. Are you using any of the provided examples, by any chance? Could you paste here the configuration that you are using (e.g. docker-compose.yml, etc.)? Be extra careful about getting the FQDNs right and generating the certificates properly.

Kulturserver commented 5 years ago

Thanks for your reply! As for the configuration, here is our docker-compose.yml:

version: "2"

services:
  portus:
    image: opensuse/portus:head
    environment:
      - PORTUS_MACHINE_FQDN_VALUE=${MACHINE_FQDN}

      # DB. The password for the database should definitely not be here. You are
      # probably better off with Docker Swarm secrets.
      - PORTUS_DB_HOST=db
      - PORTUS_DB_DATABASE=portus_production
      - PORTUS_DB_PASSWORD=${DATABASE_PASSWORD}
      - PORTUS_DB_POOL=5

      # Secrets. It can possibly be handled better with Swarm's secrets.
      - PORTUS_SECRET_KEY_BASE=${SECRET_KEY_BASE}
      - PORTUS_KEY_PATH=/certificates/portus.key
      - PORTUS_PASSWORD=${PORTUS_PASSWORD}

      # SSL
      - PORTUS_PUMA_TLS_KEY=/certificates/portus.key
      - PORTUS_PUMA_TLS_CERT=/certificates/portus.crt

      # NGinx is serving the assets instead of Puma. If you want to change this,
      # uncomment this line.
      #- RAILS_SERVE_STATIC_FILES='true'
    ports:
      - 3000:3000
    links:
      - db
    volumes:
      - ./secrets:/certificates:ro
      - static:/srv/Portus/public

  background:
    image: opensuse/portus:head
    depends_on:
      - portus
      - db
    environment:
      # Theoretically not needed, but cconfig's been buggy on this...
      - CCONFIG_PREFIX=PORTUS
      - PORTUS_MACHINE_FQDN_VALUE=${MACHINE_FQDN}

      # DB. The password for the database should definitely not be here. You are
      # probably better off with Docker Swarm secrets.
      - PORTUS_DB_HOST=db
      - PORTUS_DB_DATABASE=portus_production
      - PORTUS_DB_PASSWORD=${DATABASE_PASSWORD}
      - PORTUS_DB_POOL=5

      # Secrets. It can possibly be handled better with Swarm's secrets.
      - PORTUS_SECRET_KEY_BASE=${SECRET_KEY_BASE}
      - PORTUS_KEY_PATH=/certificates/portus.key
      - PORTUS_PASSWORD=${PORTUS_PASSWORD}

      - PORTUS_BACKGROUND=true
    links:
      - db
    volumes:
      - ./secrets:/certificates:ro

  db:
    image: library/mariadb:10.0.23
    command: mysqld --character-set-server=utf8 --collation-server=utf8_unicode_ci --init-connect='SET NAMES UTF8;' --innodb-flush-log-at-trx-commit=0
    environment:
      - MYSQL_DATABASE=portus_production

      # Again, the password shouldn't be handled like this.
      - MYSQL_ROOT_PASSWORD=${DATABASE_PASSWORD}
    volumes:
      - /var/opt/docker/data/mariadb:/var/lib/mysql

  registry:
    image: library/registry:2.6
    command: ["/bin/sh", "/etc/docker/registry/init"]
    environment:
      # Authentication
      REGISTRY_AUTH_TOKEN_REALM: https://${MACHINE_FQDN}/v2/token
      REGISTRY_AUTH_TOKEN_SERVICE: ${MACHINE_FQDN}
      REGISTRY_AUTH_TOKEN_ISSUER: ${MACHINE_FQDN}
      REGISTRY_AUTH_TOKEN_ROOTCERTBUNDLE: /secrets/portus.crt

      # SSL
      REGISTRY_HTTP_TLS_CERTIFICATE: /secrets/portus.crt
      REGISTRY_HTTP_TLS_KEY: /secrets/portus.key

      # Portus endpoint
      REGISTRY_NOTIFICATIONS_ENDPOINTS: >
        - name: portus
          url: https://${MACHINE_FQDN}/v2/webhooks/events
          timeout: 2000ms
          threshold: 5
          backoff: 1s
    volumes:
      - /var/opt/docker/data/registry:/var/lib/registry
      - ./secrets:/secrets:ro
      - ./registry/config.yml:/etc/docker/registry/config.yml:ro
      - ./registry/init:/etc/docker/registry/init:ro
    ports:
      - 5000:5000
      - 5001:5001 # required to access debug service
    links:
      - portus:portus

  nginx:
    image: library/nginx:alpine
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./secrets:/secrets:ro
      - static:/srv/Portus/public:ro
    ports:
      - 81:80
      - 444:443
    links:
      - registry:registry
      - portus:portus

volumes:
  static:
    driver: local

In the same directory, there is a .env file with the following content:

MACHINE_FQDN=images.culturebase.org
SECRET_KEY_BASE=jköjjkljoiuoiuzuizuizuizpuzpuoijoiuzzitzz
PORTUS_PASSWORD=password-for-portus
DATABASE_PASSWORD=dbpassword

and those are further subdirectories:

mssola commented 4 years ago

In principle this looks fine (unless I'm missing something). Usually these kinds of errors occur because of wrong certificates (e.g. the FQDN of the certificate not matching the one configured in the registry's or Portus' config, badly generated certificates, etc.). So, one tip I can give you is to generate different kinds of certificates with different FQDN values (or better yet, try out non-self-signed certificates on a staging environment). Otherwise, and just to rule out silly mistakes, can you post the log of docker-compose when running with this configuration?

Jean-Baptiste-Lasselle commented 4 years ago

culturebase.org

Hi @Kulturserver, have you made any progress with using Portus? Maybe this will help you: https://github.com/SUSE/Portus/issues/2264#issuecomment-565684494

Jean-Baptiste-Lasselle commented 4 years ago

      REGISTRY_NOTIFICATIONS_ENDPOINTS: >
        - name: portus
          url: https://${MACHINE_FQDN}/v2/webhooks/events
          timeout: 2000ms
          threshold: 5
          backoff: 1s

@Kulturserver I think I found the mistake in your configuration. I made the exact same one, in a trickier form (you will see why further down). The endpoint should be:

      REGISTRY_NOTIFICATIONS_ENDPOINTS: >
        - name: portus
          url: https://${MACHINE_FQDN}:3000/v2/webhooks/events
          timeout: 2000ms
          threshold: 5
          backoff: 1s

instead of :

      REGISTRY_NOTIFICATIONS_ENDPOINTS: >
        - name: portus
          url: https://${MACHINE_FQDN}/v2/webhooks/events
          timeout: 2000ms
          threshold: 5
          backoff: 1s

see details in https://github.com/SUSE/Portus/issues/2264#issuecomment-565684494

Hope it helps, and that the team has not gone silent since Nov. 15 ....

@mssola What do you think?

P.S.: Ouch, the trickier thing I fell into is that, in my automated provisioning recipe for Portus, in addition to the:

      REGISTRY_NOTIFICATIONS_ENDPOINTS: >
        - name: portus
          url: https://${MACHINE_FQDN}/v2/webhooks/events
          timeout: 2000ms
          threshold: 5
          backoff: 1s

in the docker-compose.yml file, I had also changed the content of the registry's config file, config.yml, and there I had not forgotten to add the port number to the URL. It took me a night of sleep to suddenly remember that I had duplicate configurations and that they did not match, not to mention that one probably overrides the other. That one was tough, but it taught me a lot.

Kulturserver commented 4 years ago

Hi Jean-Baptiste, thanks a lot for your reply! Unfortunately, changing the registry notifications endpoints like you suggested didn't solve our problem yet, but we won't give up on this...

Kulturserver commented 4 years ago

@mssola : Here are the logs you requested, maybe they can shed a light on the problem...it would also be very helpful for us if there is a way to check and verify certificates? registry.log portus.log

Jean-Baptiste-Lasselle commented 4 years ago

Hi @Kulturserver, it's a pleasure, no problem. I'm just sharing the understanding and the thrill I got from the last 7 days I spent discovering Portus. I'm happy and eager to share experience on operating production Portus services, so I'll help you further, and I'm curious to know the underlying issue you are experiencing:

I'll look out for your upcoming files and information. I'm really curious to find out whether I got everything right, and to track down your problem.

One thing to mention: in my tests, I explicitly use the opensuse/portus:2.5 image. OK @Kulturserver, I've finished editing my comment here, you can read it now.

Jean-Baptiste-Lasselle commented 4 years ago

@mssola : Here are the logs you requested, maybe they can shed a light on the problem...it would also be very helpful for us if there is a way to check and verify certificates? registry.log portus.log

To check your certificates, it's very simple:

curl  -vvv -X GET https://images.culturebase.org:3000
# The port number is really important, unless you have a reverse proxy.
# Run this curl command from the same machine where you run [docker push].
# curl will then tell you if there is any problem with the certificate and, if
# it succeeds, will report that the request completed with code `200` and print
# some HTML to stdout.

your portus certificate

johnbl@poste-devops-jbl-16gbram:~$ curl  -X GET https://images.culturebase.org:3000
<!DOCTYPE html><html><head><title>Portus</title><link rel="stylesheet" media="all" href="/assets/application-5ec524d1b8387e235cddaf6865636a81244d85dc42a549d2beee334ad18d5c45.css" /><meta name="csrf-param" content="authenticity_token" />
<meta name="csrf-token" content="+3iv64HvlT44jYI+w+DFPQu5nSk1TUxJjSTavAvZ/Nlw4PQvxqZk3b6PkGwpr4rIsMYV0b5ns1I+0xbtxE7wdA==" /><meta content="width=device-width, initial-scale=1" name="viewport" /><link href="/favicon/apple-touch-icon-57x57.png" rel="apple-touch-icon" sizes="57x57" /><link href="/favicon/apple-touch-icon-60x60.png" rel="apple-touch-icon" sizes="60x60" /><link href="/favicon/apple-touch-icon-72x72.png" rel="apple-touch-icon" sizes="72x72" /><link href="/favicon/apple-touch-icon-76x76.png" rel="apple-touch-icon" sizes="76x76" /><link href="/favicon/apple-touch-icon-114x114.png" rel="apple-touch-icon" sizes="114x114" /><link href="/favicon/apple-touch-icon-120x120.png" rel="apple-touch-icon" sizes="120x120" /><link href="/favicon/apple-touch-icon-144x144.png" rel="apple-touch-icon" sizes="144x144" /><link href="/favicon/apple-touch-icon-152x152.png" rel="apple-touch-icon" sizes="152x152" /><link href="/favicon/apple-touch-icon-180x180.png" rel="apple-touch-icon" sizes="180x180" /><link href="/favicon/favicon-32x32.png" rel="icon" sizes="32x32" type="image/png" /><link href="/favicon/android-chrome-192x192.png" rel="icon" sizes="192x192" type="image/png" /><link href="/favicon/favicon-96x96.png" rel="icon" sizes="96x96" type="image/png" /><link href="/favicon/favicon-16x16.png" rel="icon" sizes="16x16" type="image/png" /><link href="/favicon/manifest.json" rel="manifest" /><link color="#205683" href="/favicon/safari-pinned-tab.svg" rel="mask-icon" /><link href="/favicon/favicon.ico" rel="shortcut icon" /><meta content="Portus" name="apple-mobile-web-app-title" /><meta content="Portus" name="application-name" /><meta content="#da532c" name="msapplication-TileColor" /><meta content="/favicon/mstile-144x144.png" name="msapplication-TileImage" /><meta content="/favicon/browserconfig.xml" name="msapplication-config" /><meta content="#205683" name="theme-color" /><script src="/assets/webpack/vendors~application~unauthenticated-a7e8ea1bd318df554ba9.chunk.js" 
defer="defer"></script><script src="/assets/webpack/unauthenticated-239514f83440caff5036.js" defer="defer"></script></head><body class="login" data-controller="auth/sessions" data-route="auth/sessions/new"><div class="container-fluid vue-root"><section class="row-0"><div class="center-panel"><div class="col-md-4 col-sm-2 col-xs-1"></div><div class="col-md-4 col-sm-8 col-xs-10 text-center"><div class="collapse alert-wrapper" id="float-alert"><div class="alert alert-dismissible fade in text-left alert-info float-alert"><button class="close alert-hide" type="button"><span aria-hidden="true">&times;</span><span class="sr-only">Close</span></button><div class="alert-message"><div class="alert-icon pull-left"><i class="fa fa-3x fa-info-circle"></i></div><p></p></div></div></div><img class="login-picture" src="/assets/layout/portus-logo-login-page-c7ac1df0bc28985c89e8bb9f7b555013dd72f2da762c088af26717a170ff46f3.png" /><form class="new_user" id="new_user" action="/users/sign_in" accept-charset="UTF-8" method="post"><input name="utf8" type="hidden" value="&#x2713;" /><input type="hidden" name="authenticity_token" value="veI3YN0y+Lyw34DBp6EhmmWG+WOuXOsQVHdYtXX/ETysuRnar+RnLmy4bNblIyIqERDLVygCJDwEd/7fkGa1fQ==" /><input class="input form-control input-lg first" placeholder="Username" autofocus="autofocus" required="required" type="text" value="" name="user[username]" id="user_username" /><input class="input form-control input-lg last" placeholder="Password" autocomplete="off" required="required" type="password" name="user[password]" id="user_password" /><button name="button" type="submit" id="login-btn" class="classbutton btn btn-primary btn-block btn-lg"><i class="fa fa-check"></i>Login </button><div class="row"><div class="col-sm-4 create-new-account"><a class="btn btn-link" href="/users/sign_up">Create a new account</a></div><div class="col-sm-4 explore"><a id="explore" class="btn btn-link" title="Explore existing images from this registry" 
href="/explore">Explore</a></div><div class="col-sm-4 forgot-password"><a class="btn btn-link" href="/users/password/new">I forgot my password</a></div></div></form></div></div></se
johnbl@poste-devops-jbl-16gbram:~$ 

now checking your registry certificate

And it's fine. I can even tell (thank you, curl :cupid:) that you issued a wildcard certificate, because you share it between the registry and the Portus web app. So it's fine; just read curl's log plainly, I could not put it in better words:

jibl@poste-devops-jbl-16gbram:~$ curl -vvv -X GET https://images.culturebase.org:5000/
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying 95.216.40.15...
* TCP_NODELAY set
* Connected to images.culturebase.org (95.216.40.15) port 5000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=*.culturebase.org
*  start date: Jun 12 00:00:00 2019 GMT
*  expire date: Aug 10 12:00:00 2020 GMT
*  subjectAltName: host "images.culturebase.org" matched cert's "*.culturebase.org"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=Thawte TLS RSA CA G1
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x555673c90ea0)
> GET / HTTP/1.1
> Host: images.culturebase.org:5000
> User-Agent: curl/7.52.1
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 200 
< cache-control: no-cache
< content-type: text/plain; charset=utf-8
< content-length: 0
< date: Mon, 16 Dec 2019 19:27:32 GMT
< 
* Curl_http_done: called premature == 0
* Connection #0 to host images.culturebase.org left intact
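
The `subjectAltName` line in the curl log above is the check that matters for the FQDN question: a wildcard name such as `*.culturebase.org` covers exactly one extra label in front of the domain. As a rough sketch of that rule only (the function name `name_matches` is made up for illustration, and real TLS libraries apply more rules than this):

```shell
#!/bin/sh
# Hypothetical helper, not part of curl, Portus or the registry.
# Succeeds when hostname $1 is covered by certificate name $2.
# A leading "*." in $2 stands for exactly one DNS label.
name_matches() {
  host=$1
  pattern=$2
  case "$pattern" in
    '*.'*)
      suffix=${pattern#\*}   # e.g. ".culturebase.org"
      label=${host%%.*}      # e.g. "images"
      [ -n "$label" ] && [ "$host" = "$label$suffix" ]
      ;;
    *)
      [ "$host" = "$pattern" ]
      ;;
  esac
}
```

With this sketch, `name_matches images.culturebase.org '*.culturebase.org'` succeeds, while a deeper name such as `a.images.culturebase.org` does not, which is why every FQDN used in the registry and Portus settings has to sit directly under the wildcard domain.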

Jean-Baptiste-Lasselle commented 4 years ago

@Kulturserver I found another problem in your configuration: uncomment this line in your docker-compose.yml (for the portus container):

    - RAILS_SERVE_STATIC_FILES='true'

Error compiling CSS Assets
"SassC::SyntaxError: Error: File to import not found or unreadable: raleway."         on line 3 of 'app/assets/stylesheets/application.scss' >> @import "raleway";

"/srv/Portus/app/assets/stylesheets/application.scss:3"

I think your instance on port 3000 is a different (lab) instance from your production environment on port 80, isn't it?

Kulturserver commented 4 years ago

Thanks a lot for your effort, @Jean-Baptiste-Lasselle ! We've checked the issues you reported, but are still running into trouble unfortunately.

server {
    listen 443 ssl;
    listen [::]:443 ssl;
    ssl_certificate /etc/nginx/ssl/STAR_culturebase_org.crt;
    ssl_certificate_key /etc/nginx/ssl/STAR_culturebase_org.key;
    server_name images.culturebase.org;
    index index.php;
    include include/php.conf;
    access_log /var/log/nginx/images.culturebase.org-access.log;
    error_log /var/log/nginx/images.culturebase.org-error.log;

    location / {
        proxy_pass https://localhost:444;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

As you can see, the docker_background container is missing here. I'm attaching another log, as you wished, regarding stderr/stdout.

Thanks again! portus2.log

Jean-Baptiste-Lasselle commented 4 years ago

Hi @Kulturserver, thank you so much for the details you provided. I'll have a look at them tomorrow and will come back with further analysis ASAP.

Here I'll just state a few facts we have gathered; message me if any of them is wrong:

last work on nginx :

@mssola will perhaps say I'm wrong (I haven't discussed this with him, and he has much more knowledge of the project history), but I'll state this: the nginx in the original docker-compose example distributed with Portus is just meant to be an HTTP server serving static content, not a reverse proxy. Maybe we'll see that better together later, but it's this MACHINE_FQDN variable that is so confusing about the network setup, and which made you replace it with images.culturebase.org, which is totally natural. About that, consider first that the recipe does not work out of the box, so yes, there is a bit of work to do on that one, I believe.

One thing I noticed about REGISTRY_AUTH_TOKEN_REALM (and good candidate for final success)

Oh, for the same reason as for the registry notifications, please change this in your registry configuration:

-      REGISTRY_AUTH_TOKEN_REALM: https://${MACHINE_FQDN}/v2/token
+      REGISTRY_AUTH_TOKEN_REALM: https://${MACHINE_FQDN}:3000/v2/token
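
The only difference between the two realm values above is the port the docker client will call to get its token. A small sketch to make that visible (the helper name `url_port` is hypothetical, and it assumes a plain http(s) URL without user info or an IPv6 literal):

```shell
#!/bin/sh
# Hypothetical helper: print the port a client would connect to for URL $1.
url_port() {
  hostport=${1#*://}         # drop the scheme
  hostport=${hostport%%/*}   # drop the path
  case "$hostport" in
    *:*) printf '%s\n' "${hostport##*:}" ;;
    *)
      case "$1" in
        https://*) printf '443\n' ;;
        http://*)  printf '80\n' ;;
        *) return 1 ;;
      esac
      ;;
  esac
}
```

For `https://images.culturebase.org/v2/token` this prints `443`, and for the `:3000` variant it prints `3000`. Which of the two is right depends on what actually listens on each port in your setup (a reverse proxy on 443, or Portus itself on 3000).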

One thing I noticed in your docker ps stdout

Where is your background container? If it's not just a copy-paste omission, you don't have a background container up and running, and that is a problem.
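
Note that plain `docker ps` only lists running containers, so a container that crashed at startup simply does not appear there; `docker ps -a` lists stopped ones too. A minimal sketch, assuming the container name contains `background` (both helper names are made up for illustration):

```shell
#!/bin/sh
# Hypothetical helpers for reading container state.

# Print "name: status" for every container whose name contains $1,
# including stopped ones (plain `docker ps` hides those).
container_status() {
  docker ps -a --filter "name=$1" --format '{{.Names}}: {{.Status}}'
}

# Succeeds when a status string as printed by `docker ps` means "running".
status_is_up() {
  case "$1" in
    Up|Up\ *) return 0 ;;
    *) return 1 ;;
  esac
}
```

Calling `container_status background` on the docker host should show whether the background container exists at all and whether its status starts with `Up` or with something like `Exited (1)`.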

Kulturserver commented 4 years ago

Thanks again!

OK, the portus_Background_container process is not running. I found the following error-messages by running docker logs docker_background_1:

/usr/bin/bundle:23:in `load': cannot load such file -- /usr/lib64/ruby/gems/2.6.0/gems/bundler-1.16.4/exe/bundle (LoadError)
    from /usr/bin/bundle:23:in `<main>'
Database ready
[schema] Selected the schema for mysql
[Mailer config] Host: images.culturebase.org
[Mailer config] Protocol: https://
Creating scope :owner. Overwriting existing method TeamUser.owner.
Creating scope :contributor. Overwriting existing method TeamUser.contributor.
Creating scope :viewer. Overwriting existing method TeamUser.viewer.
/usr/bin/bundle:23:in `load': cannot load such file -- /usr/lib64/ruby/gems/2.6.0/gems/bundler-1.16.4/exe/bundle (LoadError)
    from /usr/bin/bundle:23:in `<main>'
/srv/Portus/lib/portus/registry_client.rb:197:in `get_page': Something went wrong while fetching the catalog Response: [401]
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"registry","Class":"","Name":"catalog","Action":"*"}]}]} (Portus::RegistryClient::RegistryError)
    from /srv/Portus/lib/portus/registry_client.rb:174:in `paged_response'
    from /srv/Portus/lib/portus/registry_client.rb:121:in `catalog'
    from /srv/Portus/lib/portus/background/sync.rb:63:in `block in execute!'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/relation/batches.rb:70:in `block (2 levels) in find_each'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/relation/batches.rb:70:in `each'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/relation/batches.rb:70:in `block in find_each'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/relation/batches.rb:136:in `block in find_in_batches'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/relation/batches.rb:238:in `block in in_batches'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/relation/batches.rb:222:in `loop'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/relation/batches.rb:222:in `in_batches'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/relation/batches.rb:135:in `find_in_batches'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/relation/batches.rb:69:in `find_each'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/querying.rb:11:in `find_each'
    from /srv/Portus/lib/portus/background/sync.rb:62:in `execute!'
    from /srv/Portus/bin/background.rb:61:in `block (2 levels) in <top (required)>'
    from /srv/Portus/bin/background.rb:58:in `each'
    from /srv/Portus/bin/background.rb:58:in `each_with_index'
    from /srv/Portus/bin/background.rb:58:in `block in <top (required)>'
    from /srv/Portus/bin/background.rb:57:in `loop'
    from /srv/Portus/bin/background.rb:57:in `<top (required)>'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/railties-5.2.3/lib/rails/commands/runner/runner_command.rb:38:in `load'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/railties-5.2.3/lib/rails/commands/runner/runner_command.rb:38:in `perform'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/railties-5.2.3/lib/rails/command/base.rb:65:in `perform'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/railties-5.2.3/lib/rails/command.rb:46:in `invoke'
    from /srv/Portus/vendor/bundle/ruby/2.6.0/gems/railties-5.2.3/lib/rails/commands.rb:18:in `<top (required)>'
    from bin/rails:12:in `require'
    from bin/rails:12:in `<main>'
[Initialization] Running: 'Registry events', 'Registry synchronization'

exit status 1

Jean-Baptiste-Lasselle commented 4 years ago

Hi @Kulturserver (happy new year), so, you see this in your background logs:

Something went wrong while fetching the catalog Response: [401]

{"errors":[{"code":"UNAUTHORIZED"
,"message":"authentication required","detail":[{"Type":"registry","Class":"","Name":"catalog","Action":"*"}]}]}
(Portus::RegistryClient::RegistryError)

That means your background service is requesting access to the Docker registry API, more precisely the catalog endpoint, this one: https://docs.docker.com/registry/spec/api/#listing-repositories.

To use the catalog endpoint of the registry API, your background service has to authenticate to the registry, using both:

And all this in accordance with the authentication flow specified by Docker themselves here: https://docs.docker.com/registry/spec/auth/token/
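
In that flow, the registry first answers `401` with a `WWW-Authenticate: Bearer realm="...",service="..."` header, the client then asks the realm for a token covering a given scope, and finally retries with that token. Here is a rough sketch of the middle step, assuming the catalog scope is written `registry:catalog:*` as in the token specification (the helper names and the `PORTUS_USER` variable are made up for illustration):

```shell
#!/bin/sh
# Sketch of the client side of the token flow; helper names are hypothetical.

# Print the value of parameter $2 (realm, service or scope) found in the
# WWW-Authenticate header value $1, e.g. 'Bearer realm="...",service="..."'.
challenge_param() {
  printf '%s\n' "$1" | sed -n "s/.*$2=\"\([^\"]*\)\".*/\1/p"
}

# Print the URL a client would call to ask for a token covering scope $2.
# The scope is appended as-is; a real client would URL-encode it.
token_url() {
  realm=$(challenge_param "$1" realm)
  service=$(challenge_param "$1" service)
  printf '%s?service=%s&scope=%s\n' "$realm" "$service" "$2"
}

# Example (needs network access and valid credentials, so left commented out):
#   header=$(curl -sI https://images.culturebase.org:5000/v2/ | grep -i '^www-authenticate:')
#   curl -s -u "$PORTUS_USER" "$(token_url "$header" 'registry:catalog:*')"
```

If the realm printed by `challenge_param` is not a URL that actually reaches Portus from where the client runs, no token can be issued and the registry keeps answering `401`.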

To consolidate our diagnosis, I also point out which component is raising the error, according to the background logs:

(Portus::RegistryClient::RegistryError)

Okay, so the software component that reports the authentication / authorization error is Portus::RegistryClient. Do a Google search on Portus::RegistryClient, and you almost immediately find the ./lib/portus/registry_client.rb file in the Portus GitHub repository. According to the top comment in that source file:

  # RegistryClient is a a layer between Portus and the Registry. Given a set of
  # credentials, it's able to call to any endpoint in the registry API.

Now, I hope you are convinced that your background service fails authenticating to your registry service.

We can even tell that your background container was asking for the whole catalog, because of "Action":"*". That pretty much makes sense, because your background service was booting up, so it was kind of asking the registry: "All right, I'm starting work here, so to begin with, what's up there in the registry?"

So all in all, your background service fails its startup process, and therefore stops.

What I propose you do next

First test (reproducibility)

Run this and confirm you don't have any change in the background service logs (exact same error) :

docker-compose up -d --force-recreate background && docker-compose logs -f

Second test (network configuration)

I also see one mistake today, I think, in your docker-compose.yml. So I propose that you proceed with the following test steps:

* Finally, run this and come back here with the logs:
```bash
# A./ reload the updated configuration for the registry.
# Don't do this one if you don't have a backup of your
# docker image storage and of your Portus database.

docker-compose up -d --force-recreate registry
# Note: [docker-compose restart registry] is not enough here, since a restart
# does not apply changes made to the environment in docker-compose.yml.

# B./ recreate and launch the background service
docker-compose up -d --force-recreate background && docker-compose logs -f background
```

Sorry for the delay in answering. Anyway, we now know you did have a problem with your background container, so this has to be solved on the way to total success.

Now waiting for your additional logs :)

Kulturserver commented 4 years ago

Hi,

I execute the following command and get an error-message:

docker-compose up -d --force-recreate registry
ERROR: Service "registry" uses an undefined network "pipeline_portus"

our docker-compose.yml is actually this:

version: "2"
services:
  portus:
    image: opensuse/portus:head
    environment:

volumes:
  static:
    driver: local

Jean-Baptiste-Lasselle commented 4 years ago

Hi @Kulturserver, OK, I will start working on this new data tomorrow.

One thing immediately, about :

Hi,

I execute the following command and get an error-message:

docker-compose up -d --force-recreate registry
ERROR: Service "registry" uses an undefined network "pipeline_portus"

networks:
  pipeline_portus:
    aliases:
     - ${OCI_REGISTRY_SERVICE_FQDN}

networks:
  portus_net:
    aliases:
     - ${OCI_REGISTRY_SERVICE_FQDN}

docker-compose up -d --force-recreate registry

Jean-Baptiste-Lasselle commented 4 years ago

@Kulturserver Also, there are steps you did not proceed with :

Kulturserver commented 4 years ago

@Jean-Baptiste-Lasselle : thanks again for your reply! We've changed the entries according to your instructions, (I'm attaching the current yml-file) but are now encountering other problems:

Would you be interested in looking into this yourself? We could grant you access to the machine if you send us your SSH key - maybe this would speed things up. Thanks!

docker-compose.zip

Jean-Baptiste-Lasselle commented 4 years ago

  • docker_nginx doesn't find a registry now (error message: host registry not found)

  • docker_background still isn't started

  • now we can't even do a docker login externally to images.culturebase.org

Hi @Kulturserver, thank you for your feedback. I'll now dive into your docker-compose.zip and may give you more instructions, but I can immediately tell you:

Now do this

Ok, so now, for your nginx to find the registry container again, it must know about the new network name I made you define for the registry container, that is to say the value of OCI_REGISTRY_SERVICE_FQDN inside the .env file.

For nginx to "know about your registry network name", you have to write it inside the ./nginx/nginx.conf configuration file.

The old network name you used was registry (same as your container name). The new network name is oci-registry.culturebase.devops.

So, now, all you have to do is (UPDATE : wait your [nginx] config might be tricky, so I need to have a look at how it was, before any change) :

# in the current folder, you have the [docker-compose.yml] file
export OLD_NETNAME=registry
export NEW_NETNAME=oci-registry.culturebase.devops

echo "Update : wait your [nginx] config might be tricky, so I need to have a look at how it was, before any change "
exit 0
sed -i "s#$OLD_NETNAME#$NEW_NETNAME#g"  ./nginx/nginx.conf
docker-compose restart nginx
docker login images.culturebase.org
docker-compose logs -f nginx
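
Since the `sed -i` above rewrites nginx.conf in place, it can help to preview the change first. A minimal sketch, assuming neither name contains `#` or characters that are special to sed (the helper name `preview_rename` is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: show what replacing $2 by $3 would change in file $1,
# without touching the file itself.
preview_rename() {
  # diff exits with 1 when it finds differences, which is the expected case
  # here, so only a status above 1 is treated as a failure.
  sed "s#$2#$3#g" "$1" | diff "$1" - || [ "$?" -eq 1 ]
}
```

Running `preview_rename ./nginx/nginx.conf "$OLD_NETNAME" "$NEW_NETNAME"` prints the lines that would change. This matters because a plain `registry` pattern also matches inside longer words and paths, so the preview may reveal replacements you did not intend.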

What i'll do in the next days

I'll do a few tests at home with the docker-compose.zip you sent, and I'll tell you if I have anything else to suggest.

Hence the Update : nginx.conf file

It would be useful if I could check the content of your nginx.conf file.

Jean-Baptiste-Lasselle commented 4 years ago

Would you be interested in looking into this yourself? We could grant you access to the machine if you send us your SSH key - maybe this would speed things up. Thanks!

@Kulturserver Okay, I'll send you a public key tomorrow (Feb. 6), and I'll have a look at your servers. I'll run a few tests first, and I have a lot of work today. What do you think about setting up a ChatOps room for your organization (culturebase.org) on https://gitter.im ? That would be a handy way to keep track of, and stay accountable for, what we are doing on your organization's infrastructure.

@Kulturserver you can find me on https://gitter.im/ :

Then, on https://gitter.im :

So just a suggestion, your call, and :

Jean-Baptiste-Lasselle commented 4 years ago

@Kulturserver I just had a look at your docker-compose.yml, and well, yes, it would speed things up if I had direct access to the machine:

So, all in all, you will have to define 2 public domain names for your organization:

Why I advise that there should be 2 different network names

jbl@poste-devops-jbl-16gbram:~/culturebase.org/ops/1$ docker login hub.docker.com
Username: pegasus.devops
Password: 
Error response from daemon: login attempt to https://hub.docker.com/v2/ failed with status: 404 Not Found
jbl@poste-devops-jbl-16gbram:~/culturebase.org/ops/1$ docker login docker.io
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username: pegasus.devops
Password: 
Error response from daemon: Get https://registry-1.docker.io/v2/: unauthorized: incorrect username or password
jbl@poste-devops-jbl-16gbram:~/culturebase.org/ops/1$ 
jbl@poste-devops-jbl-16gbram:~/culturebase.org/ops/1$ docker login registry-1.docker.io
Username: pegasus.devops
Password: 
Error response from daemon: Get https://registry-1.docker.io/v2/: unauthorized: incorrect username or password
jbl@poste-devops-jbl-16gbram:~/culturebase.org/ops/1$ 

My reasons are those of the docker team :

are, at least technically, two completely different things.

Actually, docker designers themselves agree, since they did draw two different boxes, for the registry and the (token-based) authorization service :

[Image : notbigfatdaemon]

So different that, for every standard concern (performance, security, etc.), we need completely different approaches and solutions.

So let's just split them, to be able to manage them completely differently.

And if we split them, we get closer to scalability and resiliency as well.

Jean-Baptiste-Lasselle commented 4 years ago

Errata : Docker Hub

In earlier messages, I quickly analyzed the workflow in place for hub.docker.com. In my analysis, I deduced that docker.io is the authentication/authorization service in place.

Today I checked that, in fact, the token issuer, and so the authentication/authorization service, is at https://auth.docker.io/token, which is something you can easily check, cf. https://docs.docker.com/registry/spec/auth/token/#how-to-authenticate.
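
One way to check this is to look at the WWW-Authenticate challenge a registry sends back on an unauthenticated request, which names the token issuer in its realm parameter. The sketch below only parses such a header value; `challenge_param` is a hypothetical helper name, and the sample header is the one given in the Docker token authentication documentation linked above.

```shell
#!/bin/sh
# Extract one parameter (realm, service or scope) from the value of a
# "WWW-Authenticate: Bearer ..." challenge. Hypothetical helper.
#   $1 = parameter name, $2 = header value
challenge_param() {
  # Capture what sits between the double quotes after name=
  printf '%s\n' "$2" | sed -n "s/.*$1=\"\([^\"]*\)\".*/\1/p"
}

# Sample challenge taken from the token authentication documentation
header='Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="repository:samalba/my-app:pull,push"'

challenge_param realm "$header"
challenge_param scope "$header"
```

Against a live registry, the header itself could be obtained with something like `curl -sI https://registry-1.docker.io/v2/` (not run here). For a Portus setup, the realm would be expected to point at the Portus token endpoint rather than at the registry.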

Errata : Auth Flow for Docker Auth Specs v2, and communication between registry and auth/authorization services

I also want to mention that the whole auth/authorization flow, as specified, does not imply communication between the registry and the auth/authorization service : the registry trusts a token as long as it is signed and the registry recognizes the auth/authorization service's signature.
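
Because the registry decides from the token alone, one way to investigate an "insufficient scope" error is to look at what the token actually grants: per the token authentication spec, the JWT carries an access claim listing type, name and actions. The following is a minimal sketch, assuming a POSIX shell and a `base64` that accepts `-d` (as in GNU coreutils); `jwt_payload` is a hypothetical helper name, and it does not verify the signature.

```shell
#!/bin/sh
# Print the JSON claims of a JWT, WITHOUT verifying its signature.
# jwt_payload is a hypothetical helper for inspection only.
jwt_payload() {
  # The payload is the second dot-separated field, base64url-encoded;
  # map the URL-safe alphabet back to the standard one.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the '=' padding, so restore it before decoding.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}

# Usage (not executed here), with a token obtained from the token issuer:
#   jwt_payload "$TOKEN"
```

If the access entry for the repository being pushed lacks the push action, or names a different repository, the registry would be expected to reject the upload even though the token is validly signed.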

So :

Nevertheless, I will add more notes here to report the results of my tests with @Kulturserver :

Jean-Baptiste-Lasselle commented 4 years ago

@Kulturserver I today confirm you have setup in my private infra a running portus with all above characteristics.

So, now two possibilities :

Important : I'll bring online (on GitHub Pages) a whole guide about Portus, with an automated recipe, in which I'll publish and maintain ops/performance/security recommendations. There, you will find updated recommendations about the setup we did for culturebase.org.

Last but not least : if culturebase.org would agree to add, on the Portus login page, a thank-you to XXX with a link to my personal website, I'd really appreciate it :)

p.s.: Culturebase.org can also wait until I publish the guide I mention above, which will take something like 2 to 3 weeks. And then you will still have to deal with those internal SSL certificate management issues. Your call.

Kulturserver commented 4 years ago

@Jean-Baptiste-Lasselle Thanks again! Your public key is now on the server. We have a wildcard certificate for *.culturebase.org and a self-signed one for the docker instances. There is some further information that we prefer not to publish in this thread, so could you get in contact with us via the email address github@culturebase.org ? Thanks!

Jean-Baptiste-Lasselle commented 4 years ago

@Kulturserver hi, got all of this, sending you an email tomorrow, tty then

stale[bot] commented 4 years ago

Thanks for all your contributions! This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.