PalmStoneGames / kube-cert-manager

Manage Lets Encrypt certificates for a Kubernetes cluster.
Apache License 2.0

support for multiple domains per certificate #37

Closed jonnolen closed 7 years ago

jonnolen commented 7 years ago

On GCE we're creating an ingress of the form:

kind: Ingress
metadata:
  name: foo-bar-lb  
  annotations:
    stable.k8s.psg.io/kcm.enabled: 'true'
    stable.k8s.psg.io/kcm.provider: googlecloud
    stable.k8s.psg.io/kcm.email: admin@foo-bar.com
apiVersion: extensions/v1beta1
spec:
  tls:
  - hosts:
    - www.foo-bar.com
    - api.foo-bar.com
    secretName: "*.foo-bar.com"
  rules:
  - host: www.foo-bar.com
    http:
      paths:
      - backend:
          serviceName: frontend
          servicePort: 80
  - host: api.foo-bar.com
    http:
      paths:
      - backend:
          serviceName: backend
          servicePort: 80

This produces the following event:

18m        18m         1         foo-bar-lb               Ingress   spec.tls[0]   Warning   ACMEMultipleHosts   {kube-cert-manager }   Couldn't create ACME certificate for secret *.foo-bar.com: don't support multiple hosts per secret

Multiple hosts per cert is supported by LetsEncrypt... any opposition to a change to allow this?

Bonus: could these failure events be logged to the console, so that it's clear that an ingress event was processed and ignored? It took a long time to track this down through the k8s events.

jonnolen commented 7 years ago

Just thinking out loud... why not add a data.tls.hosts entry to the secret to store the list of hosts? kcm doesn't appear to search for the secret by labels.domain. The secret is checked for s.Labels['domain'], but parsing s.Data['tls.hosts'] could work just as well to get to the point where the data is pulled from bolt.

luna-duclos commented 7 years ago

/cc @dominikh I recall there were some issues with this when we discussed it, do you remember what they were?

dominikh commented 7 years ago

@luna-duclos No inherent issues, just added complexities and not wanting to deal with them at the time because it wasn't a particularly important feature.

Some parts of the code, garbage collection in particular, assume one domain per certificate.

luna-duclos commented 7 years ago

In that case, no opposition from my end to adding support for this to kcm.

whereisaaron commented 7 years ago

@jonnolen you can support multiple domains in an Ingress just by having a secret (and certificate) for each one, e.g. change your example as below and it will work fine.

...
  tls:
  - hosts:
    - www.foo-bar.com
    secretName: "www.foo-bar.com"
  - hosts:
    - api.foo-bar.com
    secretName: "api.foo-bar.com"
...

When you think about it, having multiple domain names per certificate is out of date. You needed that back when you didn't have SNI ('Host Header' for TLS), or when you had to manually manage certificates and didn't want lots of them.

Is it appropriate that a certificate with 'api' in it is used for your 'www' website and vice versa? Isn't it cleaner/better/cooler to have a certificate for the appropriate domain for each end point? Well, not always: you might like aliases for the same service, like 'example.com' and 'www.example.com', to be in the same certificate, so I'm wrong there 😏

I thought briefly about adding this feature, but it is more complex than the 1:1 relationship kube-cert-manager has right now. I am sure a clean implementation approach would be welcomed. In the meantime it is very easy to work around as above.
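As a rough illustration of the workaround above (one TLS entry and one secret per host), here is a minimal Python sketch that rewrites a multi-host `tls` list into per-host entries. The helper name `split_tls_per_host` is hypothetical and not part of kube-cert-manager; naming each secret after its host simply mirrors the example in this comment and is an assumption, not a requirement.

```python
def split_tls_per_host(tls_entries):
    """Turn each multi-host TLS entry into one entry per host.

    Each new entry gets a secret named after its host, mirroring the
    workaround in this thread (a convention assumed for illustration).
    """
    result = []
    for entry in tls_entries:
        for host in entry.get("hosts", []):
            result.append({"hosts": [host], "secretName": host})
    return result


# The multi-host entry from the original report.
original = [
    {
        "hosts": ["www.foo-bar.com", "api.foo-bar.com"],
        "secretName": "*.foo-bar.com",
    }
]

for entry in split_tls_per_host(original):
    print(entry)
```

The output is plain dictionaries, so it can be dumped back to YAML with whatever tooling already generates the Ingress manifests.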

jonnolen commented 7 years ago

@whereisaaron we went down that path initially. We are on GCE, and what happens is that only the first cert gets attached to the generated HTTP Load Balancer. I couldn't find a way to attach the second cert to the load balancer; if it's possible I'd love to know how, because that would definitely solve our problem.

whereisaaron commented 7 years ago

Hmm @jonnolen ... sounds like a limitation of the GCE Ingress Controller.

Could you not just make two Ingresses, one per domain/cert/backend?

Or, for complete control, you could run your own Ingress Controller (like the nginx one or Traefik) and use a GCE TCP load balancer?

jonnolen commented 7 years ago

We are currently doing 1 per host, I'll investigate rolling our own.

ean commented 7 years ago

I have added support for multiple domains. You can find the code on this branch: https://github.com/ean/kube-cert-manager/tree/support-multiple-certificates

The branch also contains the default provider and email code from @whereisaaron, so I will create a PR for it when his branch is merged.

whereisaaron commented 7 years ago

Looks great @ean - excellent feature. It is particularly nice that with this we can cut down on the number of Certificate/Secret resources required and can use either way to express Ingresses.

My PR is just waiting on me to update the documentation. That'll honestly be sometime next week at this stage sorry.

tamalsaha commented 7 years ago

Hi, I came across this issue while reading https://blog.nelhage.com/post/kubernetes/ . We have built our own Kubernetes ingress controller that can also issue certs from LE. You can find our implementation here: https://github.com/appscode/voyager/blob/master/docs/user-guide/README.md#certificate

rishka commented 7 years ago

@ean, using your branch with many domains will hit the rate limit and never get the certificate

jonnolen commented 7 years ago

what is "many domains" 5? 10? 100?

rishka commented 7 years ago

looks like ~5-10 when using http verification

whereisaaron commented 7 years ago

However you use Let's Encrypt, the rate limit is 20 new certificates per registered domain per week (calculated daily). Certificates with multiple SAN names (up to 100) count as one certificate. So @ean's branch allows more sub-domains to be registered per week than before.

The main limit is Certificates per Registered Domain (20 per week). A registered domain is, generally speaking, the part of the domain you purchased from your domain name registrar. For instance, in the name www.example.com, the registered domain is example.com. In new.blog.example.co.uk, the registered domain is example.co.uk. We use the Public Suffix List to calculate the registered domain.

If you have a lot of subdomains, you may want to combine them into a single certificate, up to a limit of 100 Names per Certificate. Combined with the above limit, that means you can issue certificates containing up to 2,000 unique subdomains per week. A certificate with multiple names is often called a SAN certificate, or sometimes a UCC certificate.

We also have a Duplicate Certificate limit of 5 certificates per week. A certificate is considered a duplicate of an earlier certificate if they contain the exact same set of hostnames, ignoring capitalization and ordering of hostnames. For instance, if you requested a certificate for the names [www.example.com, example.com], you could request four more certificates for [www.example.com, example.com] during the week. If you changed the set of names by adding [blog.example.com], you would be able to request additional certificates.
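To make the arithmetic and the duplicate rule above concrete, here is a small Python sketch. The constants are the limits as quoted in this thread (early 2017) and may differ from Let's Encrypt's current limits; the function names are hypothetical and only illustrate the rules as described, not how Let's Encrypt implements them.

```python
# Limits as quoted in this thread (2017); treat them as assumptions.
CERTS_PER_REGISTERED_DOMAIN_PER_WEEK = 20
NAMES_PER_CERTIFICATE = 100


def duplicate_key(hostnames):
    """Key under which two certificates count as duplicates:
    the same set of names, ignoring capitalization and ordering."""
    return frozenset(name.lower() for name in hostnames)


def max_subdomains_per_week():
    """Upper bound on unique names when every certificate is full."""
    return CERTS_PER_REGISTERED_DOMAIN_PER_WEEK * NAMES_PER_CERTIFICATE


first = ["www.example.com", "example.com"]
reordered = ["Example.com", "WWW.example.com"]
extended = first + ["blog.example.com"]

# Same names in a different order and case: counts as a duplicate.
print(duplicate_key(first) == duplicate_key(reordered))
# Adding a name changes the set: no longer a duplicate.
print(duplicate_key(first) == duplicate_key(extended))
print(max_subdomains_per_week())
```

Under these quoted numbers, one SAN certificate with many names uses a single slot of the 20-per-week limit, which is why combining names raises the number of subdomains that can be covered.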

rishka commented 7 years ago

Sorry, should have been more clear. I am requesting a single certificate with ~15 SAN. But when verifying via http, I get a 429 error after it verifies the 4th or 5th SAN.

whereisaaron commented 7 years ago

Are your validations actually working? If they are failing you'll get blocked after 5 failures. E.g. if the Let's Encrypt HTTP requests are not being routed to kcm, the validation will fail and you will get rate-limit blocked. I suggest you test on Let's Encrypt staging.

We will soon (February 2017) introduce a Failed Validation limit of 5 failures per account, per hostname, per hour. This limit will be higher on staging so you can use staging to debug connectivity problems.

rishka commented 7 years ago

All the validations pass if the certificate request has 4 or 5 SAN. I made 4 certificate k8s objects that had all the domains between them and the process worked. However one certificate with ~15-20 SAN does not work. I'll sanitize some logs when I am back at my computer and post them
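The workaround described above (several smaller certificates instead of one large one) could be sketched as follows in Python. The group size of 5 is an assumption based on the "4 or 5 SAN" observation, the helper names are hypothetical, and the `domain` / `altNames` keys are borrowed from the Certificate YAML posted later in this thread; this is not how kcm itself batches names.

```python
def chunk_names(names, size):
    """Split a list of names into consecutive groups of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [names[i:i + size] for i in range(0, len(names), size)]


def to_certificate_specs(names, size, base_name="example-com"):
    """Build one spec per group: the first name becomes `domain`,
    the remaining names become `altNames`."""
    specs = []
    for index, group in enumerate(chunk_names(names, size), start=1):
        specs.append({
            "name": f"{base_name}-{index}",
            "domain": group[0],
            "altNames": group[1:],
        })
    return specs


all_names = [
    "example.com", "pg.example.com", "staging.example.com", "a.example.com",
    "ex1.com", "ex2.com", "b.example.com", "ex3.com", "ex4.com", "ex5.com",
    "ex6.com", "ex7.com", "ex8.com", "ex9.com", "c.example.com",
    "d.example.com", "e.example.com",
]

for spec in to_certificate_specs(all_names, size=5):
    print(spec["name"], spec["domain"], spec["altNames"])
```

Note that splitting this way spends more of the per-registered-domain certificate limit than a single SAN certificate would, so it is only a stopgap while the 429 error is investigated.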

rishka commented 7 years ago

Certificate YAML:

apiVersion: "stable.k8s.psg.io/v1"
kind: "Certificate"
metadata:
  name: "example-com"
spec:
  domain: "example.com"
  email: "certs@example.com"
  provider: "http"
  secretName: example-com
  altNames:
    - "pg.example.com"
    - "staging.example.com"
    - "a.example.com"
    - "ex1.com"
    - "ex2.com"
    - "b.example.com"
    - "ex3.com"
    - "ex4.com"
    - "ex5.com"
    - "ex6.com"
    - "ex7.com"
    - "ex8.com"
    - "ex9.com"
    - "c.example.com"
    - "d.example.com"
    - "e.example.com"

cert-manager output:

2017/02/24 20:51:21 [INFO] acme: Registering account for certs@example.com
2017/02/24 20:51:23 [INFO][example.com, ex1.com, d.example.com, ex2.com, b.example.com, ex3.com, ex6.com, ex7.com, e.example.com, c.example.com, ex5.com, ex4.com, pg.example.com, a.example.com, staging.example.com, ex8.com, ex9.com] acme: Obtaining bundled SAN certificate
2017/02/24 20:51:24 [INFO][example.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/sF4QY2GrjkL3peLPhNKz00F73k9YwQcmF-XF1nkti38
2017/02/24 20:51:24 [INFO][ex1.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/oJEfmNglpp_x6zeKcx6uxC4XdKJ1EEntIJYXKCWWHd
2017/02/24 20:51:24 [INFO][b.example.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/PCHQIcfHqi_bqinPqcfPU28A2PzsGz-eCDnX8MSSfY
2017/02/24 20:51:24 [INFO][ex3.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/7nKV5BG8fMLTHgBX7ECpcWbxPqmUpyPQ4fzoPdnYzF
2017/02/24 20:51:24 [INFO][ex6.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/zXdounzpnaSXHZOqEvLh-KLA8vED3frmVwBi9cFbwy
2017/02/24 20:51:24 [INFO][ex7.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/C5lhuEIDopjdhobXTtbxm332A1al5WE66Bk9Wx78jT
2017/02/24 20:51:24 [INFO][e.example.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/t3F1y_7qsnR0O9y6jP4kPzzn7KwgiekFWv1zsamWC9
2017/02/24 20:51:24 [INFO][ex5.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/LeJ8mjHJiwmfJfE_D5kDPOA7vrzkHuw3jik8AecRlr
2017/02/24 20:51:24 [INFO][ex4.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/1lKZcZJ3UMb3avNVUaONGH0f3Sg5vuJeRBvg4yNDkd
2017/02/24 20:51:24 [INFO][a.example.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/N23Nf1UtJcJO8YWLqdEExbrnTMubB8kZum2Uy4-8hp
2017/02/24 20:51:24 [INFO][ex8.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/3DVIKc4579rtYqu2Fp9AZ6EwT6q-Pk24RnaJZX8AYl
2017/02/24 20:51:24 Error while processing certificate event: Error while obtaining certificate for new domain d.example.com: acme: Error 429 -  -

ean commented 7 years ago

@Rishka could you try to use the lego command line client (https://github.com/xenolf/lego) to do the same? As far as I can see we use the SAN support in the exact same way as the CLI, so if there is a bug it might be in the lego library.

rishka commented 7 years ago

You are correct, it looks like a bug there. I get the same output from the CLI. Opened a ticket with them (https://github.com/xenolf/lego/issues/356).

luna-duclos commented 7 years ago

Solved by #46