n1b0r / docker-flow-proxy-letsencrypt

Combined cert not found #25

Open Vad1mo opened 6 years ago

Vad1mo commented 6 years ago

My DFPL gets this error from DFPLE and stops working because of the exception.

The cert for example.container-stuff.com is on disk under /etc/letsencrypt; however, there is no secret, as it was cleaned up. I expected it to recover once the cert was needed again.

2018-02-28 08:47:02,356;ERROR;Certbot return code: 1. Skipping
2018-02-28 08:47:02,357;ERROR;Error while generating certs for [u'.container-stuff.com']
2018-02-28 08:47:02,368;ERROR;Combined certificate not found. Check logs for errors.

The exception is actually an HTML page; I just pasted the contents here in text format.

Exception
Exception: Combined cert not found

Traceback (most recent call last)
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1997, in __call__
                error = None
            ctx.auto_pop(error)

    def __call__(self, environ, start_response):
        """Shortcut for :attr:`wsgi_app`."""
        return self.wsgi_app(environ, start_response)

    def __repr__(self):
        return '<%s %r>' % (
            self.__class__.__name__,
            self.name,
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1985, in wsgi_app
        try:
            try:
                response = self.full_dispatch_request()
            except Exception as e:
                error = e
                response = self.handle_exception(e)
            except:
                error = sys.exc_info()[1]
                raise
            return response(environ, start_response)
        finally:
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1540, in handle_exception
            # if we want to repropagate the exception, we can attempt to
            # raise it with the whole traceback in case we can do that
            # (the function was actually called from the except part)
            # otherwise, we just raise the error again
            if exc_value is e:
                reraise(exc_type, exc_value, tb)
            else:
                raise e

        self.log_exception((exc_type, exc_value, tb))
        if handler is None:
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
        ctx = self.request_context(environ)
        ctx.push()
        error = None
        try:
            try:
                response = self.full_dispatch_request()
            except Exception as e:
                error = e
                response = self.handle_exception(e)
            except:
                error = sys.exc_info()[1]
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
            request_started.send(self)
            rv = self.preprocess_request()
            if rv is None:
                rv = self.dispatch_request()
        except Exception as e:
            rv = self.handle_user_exception(e)
        return self.finalize_request(rv)

    def finalize_request(self, rv, from_error_handler=False):
        """Given the return value from a view function this finalizes
        the request by converting it into a response and invoking the
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
            return self.handle_http_exception(e)

        handler = self._find_error_handler(e)

        if handler is None:
            reraise(exc_type, exc_value, tb)
        return handler(e)

    def handle_exception(self, e):
        """Default exception handling that kicks in when an exception
        occurs that is not caught.  In debug mode the exception will
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
        self.try_trigger_before_first_request_functions()
        try:
            request_started.send(self)
            rv = self.preprocess_request()
            if rv is None:
                rv = self.dispatch_request()
        except Exception as e:
            rv = self.handle_user_exception(e)
        return self.finalize_request(rv)

    def finalize_request(self, rv, from_error_handler=False):
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
        # request came with the OPTIONS method, reply automatically
        if getattr(rule, 'provide_automatic_options', False) \
           and req.method == 'OPTIONS':
            return self.make_default_options_response()
        # otherwise dispatch to the handler for that endpoint
        return self.view_functions[rule.endpoint](**req.view_args)

    def full_dispatch_request(self):
        """Dispatches the request and on top of that performs request
        pre and postprocessing as well as HTTP exception catching and
        error handling.
File "/app/app.py", line 81, in reconfigure
            if 'letsencrypt.testing' in args:
                testing = args['letsencrypt.testing']
                if isinstance(testing, basestring):
                    testing = True if testing.lower() == 'true' else False

            client.process(args['letsencrypt.host'].split(','), args['letsencrypt.email'], testing=testing)

    # proxy requests to docker-flow-proxy
    # sometimes we can get an error back from DFP, this can happen when DFP is not fully loaded.
    # resend the request until response status code is 200 (${RETRY} times waiting ${RETRY_INTERVAL} seconds between retries)
    t = 0
File "/app/client_dfple.py", line 184, in process

            combined = [x for x in certs if '.pem' in x]
            if len(combined) == 0:
                logger.error('Combined certificate not found. Check logs for errors.')
                # raise Exception to make a 500 response to dpf, and make it retry the request later.
                raise Exception('Combined cert not found')
            combined = combined[0]

            if self.docker_client == None:
                if created:
                    # no docker client provided, use docker-flow-proxy PUT request to update certificate
Exception: Combined cert not found
This is the copy/paste friendly version of the traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1997, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1985, in wsgi_app
    response = self.handle_exception(e)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1540, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/app/app.py", line 81, in reconfigure
    client.process(args['letsencrypt.host'].split(','), args['letsencrypt.email'], testing=testing)
  File "/app/client_dfple.py", line 184, in process
    raise Exception('Combined cert not found')
Exception: Combined cert not found

Service Definition:

proxy-le:
    image: nib0r/docker-flow-proxy-letsencrypt
    networks:
      - net
    environment:
      - DF_PROXY_SERVICE_NAME=proxy_proxy
      # - LOG=debug
      # - CERTBOT_OPTIONS=--staging
    volumes:
      # link docker socket to activate secrets support.
      - /var/run/docker.sock:/var/run/docker.sock
      # create a dedicated volume for letsencrypt folder.
      # MANDATORY to keep persistent certificates on DFPLE.
      # Without this volume, certificates will be regenerated every time DFPLE is recreated.
      # OPTIONALLY you will be able to link this volume to another service that also needs certificates (gitlab/gitlab-ce for example)
      - le-certs:/etc/letsencrypt
    deploy:
      replicas: 1
      placement:
        constraints: [node.role == manager]      
      labels:
        - com.df.notify=true
        - com.df.distribute=true
        - com.df.servicePath=/.well-known/acme-challenge
        - com.df.port=8080

n1b0r commented 6 years ago

Are you trying to use a "real" certificate for your proxied service?

If you already have a certificate for your proxied service, you should not use the letsencrypt service; use the docker-flow-proxy configuration options to make it work instead.

Ping me if I misunderstood.

Vad1mo commented 6 years ago

DFPLE created the cert in the first place. While the service wasn't deployed, the cert secrets got removed. However, the cert was still in the volume /etc/letsencrypt. When the service was deployed again, I started to see this error.
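
A minimal sketch of the recovery described above, assuming certbot's usual layout of `live/<domain>/fullchain.pem` and `privkey.pem` under `/etc/letsencrypt`. The function name `load_combined_from_disk` is hypothetical and not part of DFPLE; it only illustrates how a certificate that survives in the volume could be re-read and re-published as a secret instead of requesting a new one.

```python
import os


def load_combined_from_disk(le_root, domain):
    """Return fullchain + private key bytes if both files exist on disk, else None.

    Hypothetical helper: assumes certbot's live/<domain>/ directory layout.
    """
    live_dir = os.path.join(le_root, "live", domain)
    fullchain = os.path.join(live_dir, "fullchain.pem")
    privkey = os.path.join(live_dir, "privkey.pem")

    # Nothing to recover unless both halves of the combined cert are present.
    if not (os.path.isfile(fullchain) and os.path.isfile(privkey)):
        return None

    with open(fullchain, "rb") as f:
        chain_bytes = f.read()
    with open(privkey, "rb") as f:
        key_bytes = f.read()

    # A combined PEM is the certificate chain followed by the private key.
    return chain_bytes + key_bytes
```

If the helper returns bytes, a caller could recreate the secret from them and skip certbot entirely; if it returns `None`, issuance would proceed as before.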

alexanderkjeldaas commented 6 years ago

I'm getting the same issue. What is the solution?

alexanderkjeldaas commented 6 years ago

2018-06-27 15:27:07,751:DEBUG:urllib3.connectionpool:https://acme-v01.api.letsencrypt.org:443 "POST /acme/new-authz HTTP/1.1" 429 189
2018-06-27 15:27:07,753:DEBUG:acme.client:Received response:
HTTP 429
Server: nginx
Content-Type: application/problem+json
Content-Length: 189
Boulder-Requester: 37309940
Replay-Nonce: _CbgUiTR9OKABryvqum0Ua0_jBU8vSBZjLdw8smwt74
Expires: Wed, 27 Jun 2018 15:27:07 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Wed, 27 Jun 2018 15:27:07 GMT
Connection: close

{
  "type": "urn:acme:error:rateLimited",
  "detail": "Error creating new authz :: too many failed authorizations recently: see https://letsencrypt.org/docs/rate-limits/",
  "status": 429
}
2018-06-27 15:27:07,753:DEBUG:acme.client:Storing nonce: _CbgUiTR9OKABryvqum0Ua0_jBU8vSBZjLdw8smwt74
2018-06-27 15:27:07,754:ERROR:certbot.log:Exiting abnormally:
Traceback (most recent call last):
  File "/usr/local/bin/certbot", line 11, in <module>
    load_entry_point('certbot', 'console_scripts', 'certbot')()
  File "/opt/certbot/src/certbot/main.py", line 861, in main
    return config.func(config, plugins)
  File "/opt/certbot/src/certbot/main.py", line 786, in certonly
    lineage = _get_and_save_cert(le_client, config, domains, certname, lineage)
  File "/opt/certbot/src/certbot/main.py", line 85, in _get_and_save_cert
    lineage = le_client.obtain_and_enroll_certificate(domains, certname)
  File "/opt/certbot/src/certbot/client.py", line 357, in obtain_and_enroll_certificate
    certr, chain, key, _ = self.obtain_certificate(domains)
  File "/opt/certbot/src/certbot/client.py", line 318, in obtain_certificate
    self.config.allow_subset_of_names)
  File "/opt/certbot/src/certbot/auth_handler.py", line 66, in get_authorizations
    self.authzr[domain] = self.acme.request_domain_challenges(domain)
  File "/opt/certbot/src/acme/acme/client.py", line 213, in request_domain_challenges
    typ=messages.IDENTIFIER_FQDN, value=domain), new_authzr_uri)
  File "/opt/certbot/src/acme/acme/client.py", line 192, in request_challenges
    response = self.net.post(self.directory.new_authz, new_authz)
  File "/opt/certbot/src/acme/acme/client.py", line 709, in post
    return self._post_once(*args, **kwargs)
  File "/opt/certbot/src/acme/acme/client.py", line 722, in _post_once
    return self._check_response(response, content_type=content_type)
  File "/opt/certbot/src/acme/acme/client.py", line 583, in _check_response
    raise messages.Error.from_json(jobj)
Error: urn:acme:error:rateLimited :: There were too many requests of a given type :: Error creating new authz :: too many failed authorizations recently: see https://letsencrypt.org/docs/rate-limits/

alexanderkjeldaas commented 6 years ago

Related #24

n1b0r commented 6 years ago

It seems that you are hitting the LE rate limits. Did you test your setup against the staging servers first?

alexanderkjeldaas commented 6 years ago

No, I'm switching from one production server to another, so my setup was tested against the staging servers on the other server.

There seem to be two issues here:

  1. DFPLE doesn't check that an HTTP challenge works before contacting LE.
  2. DFPLE doesn't react to rate-limiting signals.
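
Both points can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DFPLE code: `challenge_path_reachable`, `is_rate_limited` and `retry_delay` are hypothetical names, the probe token is assumed to have been placed under the challenge path by the caller beforehand, and the back-off values are arbitrary.

```python
import json
import urllib.request


def http_fetch(url, timeout=5):
    """Default fetcher: return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", "replace")


def challenge_path_reachable(domain, token, expected_body, fetch=http_fetch):
    """Pre-flight check: is a known probe file served on the HTTP-01 path?"""
    url = "http://%s/.well-known/acme-challenge/%s" % (domain, token)
    try:
        return fetch(url).strip() == expected_body
    except Exception:
        # Any network or HTTP error means the CA could not reach it either.
        return False


def is_rate_limited(status_code, body):
    """Detect a rate-limit signal from an HTTP status or an ACME problem document."""
    if status_code == 429:
        return True
    try:
        problem = json.loads(body)
    except (TypeError, ValueError):
        return False
    # Problem types such as "urn:acme:error:rateLimited" end in ":rateLimited".
    return isinstance(problem, dict) and str(problem.get("type", "")).endswith(":rateLimited")


def retry_delay(failures, base=60, cap=3600):
    """Exponential back-off in seconds, capped (values are illustrative)."""
    return min(cap, base * (2 ** failures))
```

A caller could skip the certbot run when `challenge_path_reachable` is false, and wait `retry_delay(n)` seconds before the next attempt once `is_rate_limited` is true, rather than retrying immediately and adding to the failed-authorization count.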

djbingham commented 6 years ago

@n1b0r Any update on this? I've seen the "Combined cert not found" error several times, most recently last night, when a certificate expired for the first time and DFPLE failed to renew it and then hit the Let's Encrypt rate limit.

This is a really urgent issue for me as I now have clients complaining of security errors and I don't know how to get my certificate renewed.

vboufleur commented 6 years ago

I'm having a similar error. DFPLE fails every time with the "Error while generating certs for [DOMAIN]" error. Logs:

2018-07-11 18:23:07,892;ERROR;Certbot return code: 1. Skipping
2018-07-11 18:23:07,892;ERROR;Error while generating certs for [DOMAIN]
2018-07-11 18:23:07,892;ERROR;Combined certificate not found. Check logs for errors.

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1997, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1985, in wsgi_app
    response = self.handle_exception(e)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1540, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/app/app.py", line 81, in reconfigure
    client.process(args['letsencrypt.host'].split(','), args['letsencrypt.email'], testing=testing)
  File "/app/client_dfple.py", line 184, in process
    raise Exception('Combined cert not found')

Exception: Combined cert not found

If I try to access the DOMAIN URL, DFP fails with a 503 error: "No server is available to handle this request." When I restart the DFP service, "Error while generating certs for" keeps happening in DFPLE, but the service loads successfully over HTTPS.

I'm available to help debug this, just ping me if you want a hand.

vboufleur commented 6 years ago

OK, new info. After cleaning my Docker host of containers, images and volumes, accessing the URL of a new service worked correctly with letsencrypt. The service domain begins with vboufleur.*.

I tried to create a new service with a URL beginning with vboufleur_2.* and it failed with the error I described in the comment above. I think it failed because it has a similar domain name to the first already created service.

I tried to create a new service with a domain starting with test.* (completely different from the first one) and it worked too.
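
One thing that can be checked locally before requesting a certificate: hostname labels are limited to letters, digits and hyphens, so a label containing an underscore is not a valid hostname, and a request for such a name may be rejected. Whether that is what happened here is not confirmed in this thread. A minimal sketch, with the hypothetical helper `invalid_labels` and `example.com` standing in for the real domain:

```python
import re

# One DNS hostname label: 1-63 letters, digits or hyphens,
# neither starting nor ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def invalid_labels(hostname):
    """Return the labels of hostname that are not valid hostname labels."""
    labels = hostname.rstrip(".").split(".")
    return [label for label in labels if not _LABEL.fullmatch(label)]


# The underscore label is flagged; the other two names pass.
print(invalid_labels("vboufleur_2.example.com"))
print(invalid_labels("vboufleur.example.com"))
print(invalid_labels("test.example.com"))
```

Running such a check on every entry of `letsencrypt.host` before calling certbot would keep an invalid name from consuming a failed authorization.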

christianmscott commented 5 years ago

Bump on this. This is still a problem. When a service and its keys are removed, and the service is brought back again later, the cert request process fails.

Olvikolvi commented 5 years ago

This is a huge problem. If I need to re-deploy DFPLE, it always runs into the LE error "There were too many requests of a given type", and I have fewer than 10 domains.