coredns / coredns

CoreDNS is a DNS server that chains plugins
https://coredns.io
Apache License 2.0
12.01k stars 2.08k forks

https: automatic tls certificate (or in plugin/tls) #3460

Open zzxap opened 4 years ago

zzxap commented 4 years ago

I want to start an HTTPS server:

https://www.mydomain.com {
    bind myip
    hosts {
        10.6.6.2 sms.service
        10.6.6.3 search.service
    }
}

How do I set the HTTPS certificate?

miekg commented 4 years ago

This has to be done manually. I also think Let's Encrypt, or the CA industry generally, doesn't hand out certs for IP addresses, so you need some kind of domain as well.

I'm eyeing Caddy 2's way of setting up and getting certs, as this has already been automated there (cc @mholt).

mholt commented 4 years ago

Hi - anything I can help with?

miekg commented 4 years ago

Wondering if Caddy's ACME code can be transferred to, or used from, CoreDNS to perform a similar function, but now for a DNS server.

mholt commented 4 years ago

Yep, totally.

In Caddy 2, there is the tls app which can manage the certs for you.

Or if you just need cert management features in any other Go program, you can use CertMagic: https://github.com/mholt/certmagic -- this is the same library that Caddy uses. (Caddy 2 uses it also, but Caddy 2 has the tls app which is a more centralized way to do it.)

miekg commented 4 years ago

thanks! Would be good to see what's easiest to integrate (at some point), but that's homework we need to do.

mholt commented 4 years ago

No problem, let me know if you want any more specific guidance or have any questions!

polarathene commented 3 years ago

It would be nice to see CoreDNS able to handle an ACME DNS challenge.

Not sure how much work that is to implement, and I assume a CoreDNS plugin would be required in addition to a separate one for software like Caddy to use (caddy-dns plugins, which are apparently simple to implement using Go libdns; there is a guide for using them with Caddy). go-acme/lego has quite a list of supported providers, but most are not providers that you can self-host (I recognize PowerDNS as one of the only ones there?).

miekg commented 3 years ago

I agree tls should do ACME:

tls {
    acme <domainname>
}

or something like that as config, and then it does the Let's Encrypt dance. Setting up those keys/certs manually is tedious. We do need a place to write the keys/certs, though...

mholt commented 3 years ago

@polarathene

It would be nice to see CoreDNS able to handle an ACME DNS challenge.

What do you mean by this, exactly? Technically "handling" the DNS challenge is just a matter of setting a TXT record, which CoreDNS is more than capable of already.
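
For illustration, the TXT record in question is fully determined by the challenge's key authorization: RFC 8555 (section 8.4) places the base64url-encoded SHA-256 digest of the key authorization at `_acme-challenge.<domain>`. The following is a minimal Go sketch of that computation only. The helper name `dns01Record` and the token/thumbprint strings are made up for the example; nothing here is CoreDNS or CertMagic API.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"strings"
)

// dns01Record returns the owner name and TXT value an ACME server looks up
// for a dns-01 challenge. keyAuthorization is "<token>.<account key thumbprint>".
// dns01Record is a hypothetical helper, not part of any library.
func dns01Record(domain, keyAuthorization string) (name, value string) {
	// A wildcard order ("*.example.com") is validated at the base domain.
	base := strings.TrimSuffix(strings.TrimPrefix(domain, "*."), ".")
	digest := sha256.Sum256([]byte(keyAuthorization))
	// The value is the unpadded base64url encoding of the digest.
	return "_acme-challenge." + base + ".", base64.RawURLEncoding.EncodeToString(digest[:])
}

func main() {
	// Placeholder token and thumbprint, for illustration only.
	name, value := dns01Record("example.com", "example-token.example-thumbprint")
	fmt.Printf("%s 60 IN TXT %q\n", name, value)
}
```

Whatever serves the zone then only has to answer a TXT query for that name with that value until validation completes.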

miekg commented 3 years ago

Ah, OK. It's because with DoH, which runs over HTTP/2, we need a full-blown "get cert from Let's Encrypt" flow (in a way that bypasses DNS, if at all possible).

polarathene commented 3 years ago

I'm not sure about Let's Encrypt, but I have used acmebot (a Python ACME client) with bindtool, which it calls to update my DNS records on CoreDNS via a zone file template. One part of the template is an {{acme}} section, for which it reads JSON from acmebot to get the domain name and TXT record to generate a record for.

The only issue with that setup is that the file plugin can't update on a file watch event; it has to wait until the reload timer has elapsed (which checks for an SOA record update to know if it should update records).

acmebot also supports another way to update DNS records that I imagine most other DNS services are also using with their DNS plugins for various ACME clients: "RFC 2136 dynamic DNS updates using nsupdate". I know @miekg has spoken of wanting to implement that before; perhaps a plugin for temporary records, like the ACME DNS challenge TXT record, is a good use case?

polarathene commented 3 years ago

BTW, if it would be helpful, I presently have a docker-compose setup for my current approach. It uses

I just need to bring the containers up and it automates the whole provisioning/setup. I used it for testing a TLS security update for a PR I'm contributing to said mail server project.

polarathene commented 3 years ago

ah ok. It's because with DoH - which is HTTP/2 - we need a full blown "get cert from let's encrypt" (in a way that by-passes DNS, if it all possible)

Oh.. was this about TLS certs for CoreDNS to use for its own needs? My mistake 😅

Still, if you go forward with it, ideally it's not tied to Let's Encrypt like some ACME clients are. Smallstep is a nice private CA / ACME provisioner server; it'd be great if that could also be used for getting certs over ACME. acmebot handles this really nicely config-wise, btw. It could be used to provision a TLS cert for any domain that CoreDNS needs; you'd just need to configure CoreDNS to use the certificate.

miekg commented 3 years ago

Yep, initially this would be for Let's Encrypt, but others can be added. Getting a special TXT record served by a plugin is probably the easiest thing to implement.

balboah commented 3 years ago

Hey guys. Now I'm having an issue with Let's Encrypt as well. In my setup, I'm using Kubernetes and cert-manager from Jetstack. This works great; it can generate the certificate via the DNS APIs of my provider. The certificate is mounted for CoreDNS to use, and I just provide tls /etc/coredns-tls/tls.crt /etc/coredns-tls/tls.key.

However, when the certificate gets cycled every 90 days, the file changes but CoreDNS doesn't notice. I was thinking a reload option in the tls plugin would be enough. As many parts may use the cert, I believe an instance restart, similar to how the reload plugin works, would be the most straightforward way to do it?

miekg commented 3 years ago

I think it's better to keep the reloading contained in the reload plugin. What we could do is add an option there to reload every X hours + jitter, so any changes that live outside of the main config file are picked up.

balboah commented 3 years ago

Having a reload every X hours would work, but it would have to restart a lot more often than the TLS cert file changes, since it can't know the difference between the certificate expiry and the start time of CoreDNS. The reload plugin already refers to auto to be able to reload zonefiles; that's why I thought it would be natural for tls to handle reloading of cert files. And then only one reload would be necessary.

miekg commented 3 years ago

Having a reload every X hours would work, but it would have to restart a lot more often than the TLS cert file changes since it can't know the difference

so?

between the certificate expiry and the start time of coredns. The reload plugin already refers to auto to be able to reload zonefiles, that's why I thought it would be natural that tls should handle reloading of cert files.

yeah, but I'm not a fan of every plugin doing this by itself

balboah commented 3 years ago

Doesn't this create unnecessary interruptions to client requests? You would prefer to have one global option that restarts for example every hour instead of only when Corefile, tls, or zonefiles change? Or that one reload loop should run last in line and handle the checking of all the external resources of other plugins? I saw you also need something similar with #4406

miekg commented 3 years ago

shouldn't be an issue. Think it is better to focus on having builtin acme support (or whatever the protocol is)

balboah commented 3 years ago

Sure that would be convenient, but more complicated. And it wouldn't work when you have more than one instance. I would somehow need to share persistent storage between my coredns instances or move this stuff to another proxy a.k.a. ingress in front of coredns. This can already be solved with cert-manager. Just need to reload :)

balboah commented 3 years ago

But I guess this can be done from an external plugin as well.

miekg commented 3 years ago

Fair point, but I still think reloading via reload is simplest to add and neatly contains this functionality. (We could have done this for the hosts and (maybe) auto plugins as well.)

yongtang commented 3 years ago

I think builtin acme support would be a nice feature.

Depending on the environment, shared persistent storage could be easily available. For example, NFS is available in many IT environments (on clouds such as AWS you have EFS). I think this is less of a concern and not necessarily tied to the ACME support itself.

miekg commented 3 years ago

Caddy did some work to support clusters retrieving certs. If all of that is too complex, we could still fall back to just reloading the certs (I mean, that would be simplest to implement, with the only drawback being more and more plugins doing so, but we could provide some infra for that).

mholt commented 3 years ago

@miekg If you use CertMagic, all the clustering logic is hidden away from you, except for your choice of storage backend (and there are several to choose from).

miekg commented 3 years ago

Thanks!

I'm really torn if this would be a good feature to add, or feature creep. I'm slightly leaning towards 'creep', but can't articulate why.

balboah commented 3 years ago

I'm also a bit hesitant if it fits in this project.

I'm running multiple coredns on the same kubernetes cluster with a DNS challenge served by cert-manager. Then coredns just gets the issued tls cert files passed to it. The same cert is also shared with a different service on the same domain but different port.

With DNS challenge, I don't need to listen for incoming requests from letsencrypt. I don't need a shared storage routing mechanism, cert-manager talks with the DNS provider and stores result in kubernetes configmap/secrets (which ofc is shared storage but on one level above coredns).

For my case, it wouldn't make sense to put the DNS configuration into coredns and then try to share that cert with a different service. For this, I lean more towards scope creep.

But to be able to use the HTTP challenge instead, you either need it embedded or have a proxy upstream. For a user who wants a simple one instance with only coredns and the HTTP challenge kind of automatically working, it's very convenient. I guess it depends on what the target user of coredns is

johnbelamaric commented 3 years ago

Could plugins implement a ReloadRequestor interface that contains a single reloadNeeded function? Then, in addition to the Corefile checks it does now, the reload plugin would check with all implementers of that interface that are installed in the plugin chain. We do similar things in several other places.
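
A rough Go sketch of what such an interface could look like. Everything in it is hypothetical: `ReloadRequestor`, `ReloadNeeded`, the simplified `Handler` interface, and the two toy plugins only illustrate the proposal and are not existing CoreDNS code.

```go
package main

import "fmt"

// ReloadRequestor is the interface proposed in this thread (hypothetical):
// a plugin reports whether something it depends on has changed on disk.
type ReloadRequestor interface {
	ReloadNeeded() bool
}

// Handler is a simplified stand-in for a plugin in the chain; only the
// name is modelled here.
type Handler interface {
	Name() string
}

// tlsPlugin is a toy plugin that can ask for a reload. certChanged is an
// injected check, e.g. comparing the cert file's modification time.
type tlsPlugin struct{ certChanged func() bool }

func (t tlsPlugin) Name() string       { return "tls" }
func (t tlsPlugin) ReloadNeeded() bool { return t.certChanged() }

// whoamiPlugin is a toy plugin with no external resources to watch.
type whoamiPlugin struct{}

func (whoamiPlugin) Name() string { return "whoami" }

// pluginsNeedingReload is what a single reload loop could run on each tick:
// it asks every plugin that implements ReloadRequestor and collects the
// names of those requesting a reload.
func pluginsNeedingReload(chain []Handler) []string {
	var names []string
	for _, h := range chain {
		if r, ok := h.(ReloadRequestor); ok && r.ReloadNeeded() {
			names = append(names, h.Name())
		}
	}
	return names
}

func main() {
	chain := []Handler{
		whoamiPlugin{},
		tlsPlugin{certChanged: func() bool { return true }},
	}
	fmt.Println(pluginsNeedingReload(chain))
}
```

The type assertion keeps the interface optional: plugins that don't implement it are skipped, so only one loop does the checking and a restart happens only when some plugin asks for it.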

balboah commented 3 years ago

That’s a great idea. Only one routine is needed, and possibly fewer restarts.

As for HTTP challenges, I suppose if we refactor the DoH “is valid request” a little bit further, you could use that for http middleware. Then letsencrypt can be an external plugin

balboah commented 3 years ago

Oh wait, I think it also requires cleartext HTTP for the challenge to work. I never understood why.
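
For reference, http-01 validation in RFC 8555 (section 8.3) is an HTTP GET on port 80 for `/.well-known/acme-challenge/<token>`, and the response body must be the key authorization. Below is a minimal Go sketch of a handler that answers such requests. The names `challengeHandler` and `probe` and the token values are assumptions for illustration, not CoreDNS or Caddy code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const challengePrefix = "/.well-known/acme-challenge/"

// challengeHandler answers http-01 validation requests. keyAuths maps a
// challenge token to its key authorization ("<token>.<thumbprint>").
func challengeHandler(keyAuths map[string]string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.HasPrefix(r.URL.Path, challengePrefix) {
			http.NotFound(w, r)
			return
		}
		keyAuth, ok := keyAuths[strings.TrimPrefix(r.URL.Path, challengePrefix)]
		if !ok {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "application/octet-stream")
		fmt.Fprint(w, keyAuth)
	})
}

// probe sends a GET for path to h in-process and returns status and body.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	// Placeholder token and key authorization, for illustration only.
	h := challengeHandler(map[string]string{"tok": "tok.thumbprint"})
	code, body := probe(h, challengePrefix+"tok")
	fmt.Println(code, body)
}
```

In a real deployment the handler would be served with something like `http.ListenAndServe(":80", h)`, which is why a listener on the plain HTTP port is needed even when the goal is a certificate for HTTPS or DoH.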

miekg commented 3 years ago

That's a good idea. Then Let's Encrypt as a separate plugin is also nice!

yongtang commented 3 years ago

We have added this issue as a candidate project for Google-Summer-of-Code 2021 (https://github.com/cncf/mentoring/pull/347). The issue will be reserved for GSoC students contributions for the next several months.

jimil749 commented 3 years ago

Hi @yongtang! I'm interested in this project (for GSoC'21). Any pointers to get started? :)

yongtang commented 3 years ago

@jimil749 You will need to submit the proposal to GSoC under CNCF organization.

Before that, you can walk through the discussion in this issue thread to get a rough idea and think about what needs to be done. You can ask questions here. As you might notice, there is a lot of interest in having this feature in CoreDNS, so many people will be happy to help.

jimil749 commented 3 years ago

@yongtang I went over the discussion above, and it seems like the ultimate goal for the project is to integrate the ACME protocol to automate cert management (essentially for DoH), so that users don’t have to worry about certificates, right? Since there are plenty of ACME clients out there, like CertMagic, certbot, lego, etc., why not leverage those with CoreDNS to achieve this functionality? Also, is the client tied only to the Let’s Encrypt CA?

PS: It’d be great if you could provide a brief summary of what the community is aiming for with this project. 😅

yongtang commented 3 years ago

@jimil749 ACME needs a DNS challenge and CoreDNS itself is a DNS server (and likely will be the one serving the owned domains for many). That is why it makes sense to automate the process because you shouldn't need to go through any other systems or vendors for certificate management.

miekg commented 3 years ago

One of the first things to check is whether Let's Encrypt allows you to use a SAN that you want but can't prove you own, e.g. Quad9 uses dns.quad9.net. Or maybe they allow plain IP addresses in the SAN; I don't know.

jimil749 commented 3 years ago

One of the first things to check is whether Let's Encrypt allows you to use a SAN that you want but can't prove you own, e.g. Quad9 uses dns.quad9.net. Or maybe they allow plain IP addresses in the SAN; I don't know.

Can you explain the "Let's Encrypt allows you to use a SAN that you want but can't prove you own" part? I'm not sure I understood that thoroughly.

yongtang commented 3 years ago

https://letsencrypt.org/docs/faq/#can-i-get-a-certificate-for-multiple-domain-names-san-certificates-or-ucc-certificates

Let's Encrypt explicitly requires an ACME challenge for wildcard certificates, though it doesn't mention whether an ACME challenge is needed for a SAN certificate. @jimil749 you will need to do some investigation to find out.

jimil749 commented 3 years ago

https://blog.ipswitch.com/install-free-lets-encrypt-ssl-san-certificate-for-exchange-2019

Came across this blog post which sets up SSL-SAN certificates via ACME for exchange servers. From the steps shown, it seems like the ACME challenge is required for the SAN certificate.

yongtang commented 3 years ago

@jimil749 That is a good start, and it shows the need for ACME support in CoreDNS for effortless deployment and renewal of certificates.

jimil749 commented 3 years ago

So, I did a bit of research on DoH and ACME in general and have a few questions, mainly regarding DoT and DoH in CoreDNS.

To put that into perspective: I have a domain for my dns server (let's say dns.example.com) which can serve dns over tls for multiple domains. (let's say example.com and example-2.com). Next, I used lego acme client to get certificates for the domain dns.example.com i.e my dns server, and I have my Corefile with following configuration:

tls://example.com {
    tls dns.example.com.crt dns.example.com.ch.key
    log
    whoami
}

tls://example-2.com {
    tls dns.example.com.crt dns.example.com.ch.key
    log
    whoami
}

After all of that, running kdig +tls +tls-host=dns.example.com example.com and kdig +tls +tls-host=dns.example.com example-2.com gives a valid output. There wasn't a need for a SAN certificate.
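
One way to see why this works: the client presumably validates the certificate against the name given with +tls-host (the server's name), not against the zone being queried. The sketch below runs Go's standard x509 hostname check on a bare certificate struct that carries only SAN DNS names. `certCovers` is a hypothetical helper; a real client would also verify the chain and validity period.

```go
package main

import (
	"crypto/x509"
	"fmt"
)

// certCovers reports whether a certificate whose subjectAltName DNS entries
// are sans would pass hostname verification for host. Only name matching is
// checked here; chain and expiry validation are deliberately left out.
func certCovers(sans []string, host string) bool {
	cert := &x509.Certificate{DNSNames: sans}
	return cert.VerifyHostname(host) == nil
}

func main() {
	sans := []string{"dns.example.com"}
	// The name the client connects to and verifies.
	fmt.Println("dns.example.com:", certCovers(sans, "dns.example.com"))
	// The zones being served are not what the certificate is checked against.
	fmt.Println("example.com:", certCovers(sans, "example.com"))
	fmt.Println("example-2.com:", certCovers(sans, "example-2.com"))
}
```

A certificate with a single name is therefore enough as long as every client addresses the server by that one name; additional SAN entries only matter when the same server must be reachable, and verifiable, under several names.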

We can certainly automate the part that gets the certificate using the lego ACME client and add that to CoreDNS, but I don't quite understand the use case for SAN here. 😅

Sorry if that question was a little too vague, but there is a lack of documentation around certificate management and DoT/DoH, so this was a bit overwhelming to me initially!

jimil749 commented 3 years ago

cc @miekg @yongtang

yongtang commented 3 years ago

@jimil749 While certificate management may sound like a trivial issue initially, in a true production environment it is more complicated:

  1. In sizable environments, DNS servers are normally operated by multiple parties and for different purposes. For example, IT has DNS servers for internal non-engineering usage, while SRE may have a list of DNS servers for customer-facing cloud services, testing, etc. You cannot expect a company to have only one DNS server and be done with it once it is up and running.
  2. In certain compliance environments (e.g., FedRAMP), access to different servers is limited. That means the person who knows how to acquire and renew certificates may not even be able to touch the deployment servers (be it DNS or HTTPS).
  3. One of the biggest reasons for certificate management automation is renewal. Production services break all the time when certificates expire. And when things go wrong, people scramble to find the complete list of servers that deploy the certificate. Certificate renewal happens yearly. That means in many mid-sized companies the person who handled the last renewal may have already left the company. Breaks happen way more frequently than people imagine. So the point of automation is not to "run a list of commands", but to avoid running any command, and to achieve scale with a sizable number of servers.

jimil749 commented 3 years ago

Thanks for the detailed response @yongtang! This does clear up the use-case of Certificate Management and how important it is in the production environment. Also, since there are multiple DNS Servers running in multiple environments, usage of SAN also seems to make sense.

Talking about implementing acme plugin, (regarding the acme implementation and the acme challenges):

Does that make sense? Is this how the challenges should be handled by an acme plugin? PS: Currently I'm going through RFC 8555 to get deeper insight into the protocol.

jimil749 commented 3 years ago

Another question, @yongtang. Do I need to create an RFC for the project proposal?

miekg commented 3 years ago

Note that caddy already has some code that allows TLS cert sharing, which might be worth a peek.

miekg commented 3 years ago

To put that into perspective: I have a domain for my dns server (let's say dns.example.com) which can serve dns over tls for multiple domains. (let's say example.com and example-2.com). Next, I used lego acme client to get certificates for the domain dns.example.com i.e my dns server, and I have my Corefile with following configuration:

tls://example.com {
    tls dns.example.com.crt dns.example.com.ch.key
    log
    whoami
}

tls://example-2.com {
    tls dns.example.com.crt dns.example.com.ch.key
    log
    whoami
}

Note this is authoritative only, not a recursive/forwarding server. The DoH spec is only specified for client <-> recursor, so while the above will work, it's not the use case that has an RFC.

For a recursor to work you definitely need a SAN to be able to check the cert, e.g. dns.coredns.io, but then without actually owning that domain. Or maybe you do, which will then allow you to get that cert and use it for forwarding?

jimil749 commented 3 years ago

Note this is authoritative only, not a recursive/forwarding server. The DoH spec is only specified for client <-> recursor, so while the above will work, it's not the use case that has an RFC.

Ahh okay! Gotcha.

For a recursor to work you definitely need a SAN to be able to check the cert, e.g. dns.coredns.io, but then without actually owning that domain. Or maybe you do, which will then allow you to get that cert and use it for forwarding?

But I don't think that Let's Encrypt issues certificates if we do not own the domain. ACME challenges do have that requirement. So, if we are using dns.coredns.io as a SAN in our certificate, we need to prove that we own the domain to get the cert.