joohoi / acme-dns

Limited DNS server with RESTful HTTP API to handle ACME DNS challenges easily and securely.
MIT License

HA Configuration #262

Open jwomackgsa opened 3 years ago

jwomackgsa commented 3 years ago

Has anyone run acme-dns in a highly available config using the postgres DB? Before I go testing myself, I was just wondering if anyone had multiple instances of acme-dns running against the same PG db without issues?

laingsc commented 3 years ago

Yup, just did this and seems to work just fine.

JonathanATyler commented 3 years ago

Same here. I'm running two instances against a PostgreSQL cluster backend, with a reverse proxy in front for HTTP load balancing. Clients authenticate and update through one domain (acme.example.com), and DNS records are served from acme-dns.example.com. The NS records point to each server. Works great!

records = [
    # specify that each server will resolve any *.acme-dns.example.com records
    "acme-dns.example.com. NS acme-1.example.com.",
    "acme-dns.example.com. NS acme-2.example.com.",
]

Note that I removed the A record from the example config, as I'm using a separate name pointing at the WebProxy for that. The WebProxy requires authentication for registration requests. I had to tweak the Python script for certbot a little, but it wasn't too bad.

ZPascal commented 2 years ago

@JonathanATyler Unfortunately, in my HA setup each instance acts on its own: the initial acme record is set in every instance individually, and an error occurs when reading the data. Could you please share your configuration?

p3l1 commented 2 years ago

@ZPascal I can confirm your observations. The acme-dns service seems to load all TXT records from the database when it first starts, but it does not pick up new ones that another instance adds while the first instance is still running.

After the first instance is restarted, both instances are serving the same records again.
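One way to check for this symptom is to ask each instance directly for the same TXT name and compare the answers. The following is a minimal sketch, assuming `dig` is installed; the function names `query_txt` and `missing_per_server` are made up for illustration and are not part of acme-dns.

```python
import subprocess


def query_txt(name, server, timeout=5):
    """Ask one nameserver directly for the TXT values of `name` via dig."""
    out = subprocess.run(
        ["dig", "+short", "TXT", name, "@" + server],
        capture_output=True, text=True, timeout=timeout, check=True,
    ).stdout
    # `dig +short` prints one quoted TXT value per line.
    return {line.strip().strip('"') for line in out.splitlines() if line.strip()}


def missing_per_server(answers):
    """Given {server: set of TXT values}, return {server: values it lacks}.

    A value counts as missing when at least one other server returned it.
    Servers that returned everything are left out of the result.
    """
    everything = set().union(*answers.values())
    return {
        server: everything - values
        for server, values in answers.items()
        if everything - values
    }
```

For example, collecting `{s: query_txt(fulldomain, s) for s in ("acme-1.example.com", "acme-2.example.com")}` and passing it to `missing_per_server` returns an empty dict when both instances serve the same records; `fulldomain` here stands for the challenge name returned at registration.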

@JonathanATyler any chance of sharing your configuration with us?

JonathanATyler commented 2 years ago

Hi @ZPascal, @p3l1

Sorry for the delay. Below is my config. I have not experienced that issue myself so far, but I haven't thoroughly tested it either, as I haven't had any trouble getting my certs. I'm not sure exactly which version I'm using, probably whatever was available in Sept.

As a side note, I'm actually thinking of setting up a few more instances, just to do the http/api side behind my DMZ (where it's safer), with DNS only on DMZ side. So that might give me some more insight with regards to the issues you're seeing, given that the DNS side won't actually be updating records directly. I will also have a look at postgresql logs to see if queries are actually going through, when I have time.

[general]
listen = "0.0.0.0:53"
protocol = "both"
domain = "acme-dns.example.com"
nsname = "acme-dns.example.com"
nsadmin = "hostmaster@example.com"
records = [
    "acme-dns.example.com. NS acme1.example.com.",
    "acme-dns.example.com. NS acme2.example.com.",
]
debug = false

[database]
engine = "postgres"
connection = "postgres://acme-dns:p@$$word@<postgres-server>/acme-dns"

[api]
ip = "0.0.0.0"
disable_registration = false
port = "8080"
tls = "none"
acme_cache_dir = "api-certs"
corsorigins = [
    "*"
]
use_header = true
header_name = "X-Forwarded-For"

[logconfig]
loglevel = "debug"
logtype = "stdout"
logformat = "text"

p3l1 commented 2 years ago

@JonathanATyler Are you using TLS in production? With two different acme-dns servers, automatic certificate creation does not work correctly in my current setup, because the challenge may be answered by the wrong server.

Any ideas on how to solve this issue?

I am using the following DNS Configuration:

domain = "acme.customer.example.org"
nsname = "acme.customer.example.org"

records = [
    "acme01.customer.example.org. A 1.1.1.1",
    "acme02.customer.example.org. A 2.2.2.2",
    "acme.customer.example.org. A 1.1.1.1",
    "acme.customer.example.org. A 2.2.2.2",
    "acme.customer.example.org. NS acme01.customer.example.org",
    "acme.customer.example.org. NS acme02.customer.example.org",
]

JonathanATyler commented 2 years ago

@p3l1 I too had trouble getting auto-cert to work in that regard. This is all in a HomeLab at the moment, so I don't really worry about https internally. I use a reverse proxy to handle TLS for the web traffic and forward http to the ACME-DNS servers on port 8080 (no TLS). Theoretically you can try to request a cert through the proxy for acme, acme01, and acme02 and push it to the ACME-DNS servers, to the path set in the config (below). I may revisit this when I have time to see if I can do all this without a proxy, as that would remove the additional auth tweaking needed when requesting certs, but that won't be for a while.

tls = "cert"
tls_cert_privkey = "/etc/acme-dns/privkey.pem"
tls_cert_fullchain = "/etc/acme-dns/fullchain.pem"

All that said, I'm not really actively using this anymore anyway as all my domains are hosted at Linode and I use their ACME-DNS API for most of my requests now so I don't have to manually update DNS records on parent domains.

p3l1 commented 2 years ago

@JonathanATyler Alright, I am going to ditch the second instance for now. Since the system does not affect the acme-dns clients directly, there shouldn't be a problem if the service is offline for a few hours, as long as the database is stored safely and can be recovered quickly.

I will pick up on the idea to get a certificate by using the reverse proxy though.

Thanks for your support :)

ZPascal commented 2 years ago

Hi @p3l1, I've successfully built an HA setup of the ACME DNS server. I created a graphic to describe it.

Basic setup:

[Image: ACME-DNS-Server]

ACME Configuration:

[general]
# DNS interface. Note that systemd-resolved may reserve port 53 on 127.0.0.53
# In this case acme-dns will error out and you will need to define the listening interface
# for example: listen = "127.0.0.1:53"
listen = "0.0.0.0:53"
# protocol, "both", "both4", "both6", "udp", "udp4", "udp6" or "tcp", "tcp4", "tcp6"
protocol = "both4"
# domain name to serve the requests off of
domain = "test.com"
# zone name server
nsname = "dns1.test.com,dns2.test.com,dns3.test.com"
# admin email address, where @ is substituted with .
nsadmin = "webmaster.test.com"
# predefined records served in addition to the TXT
records = [
    "test.com. A X.X.X.X",
    "acme.test.com. A X.X.X.X",

    "test.com. NS dns1.test.com.",
    "test.com. NS dns2.test.com.",
    "test.com. NS dns3.test.com.",
]
# debug messages from CORS etc
debug = false

[database]
# Database engine to use, sqlite3 or postgres
engine = "postgres"
connection = "postgres://test:test@X.X.X.X:5432/acme"

[api]
# listen ip eg. 127.0.0.1
ip = "144.91.86.56"
# disable registration endpoint
disable_registration = false
# listen port, eg. 443 for default HTTPS
port = "8443"
# possible values: "letsencrypt", "letsencryptstaging", "cert", "none"
tls = "cert"
tls_cert_privkey = "/home/acme-dns/certs/server.key"
tls_cert_fullchain = "/home/acme-dns/certs/server.crt"
# only used if tls = "letsencrypt"
#acme_cache_dir = "api-certs"
# optional e-mail address to which Let's Encrypt will send expiration notices for the API's cert
notification_email = "webmaster@test.com"
# CORS AllowOrigins, wildcards can be used
corsorigins = []
# use HTTP header to get the client ip
use_header = true
# header name to pull the ip address / list of ip addresses from
header_name = "X-Forwarded-For"

[logconfig]
loglevel = "error"
logtype = "stdout"
logformat = "text"

Apache load balancer Configuration:

<VirtualHost *:80>
        ServerName acme.test.com
        ServerAdmin root@localhost
        DocumentRoot /var/www/html

        <Proxy balancer://cluster>
                BalancerMember https://X.X.X.X:8443
                BalancerMember https://Y.Y.Y.Y:8443
                BalancerMember https://Z.Z.Z.Z:8443
                ProxySet lbmethod=byrequests
        </Proxy>

        SSLProxyEngine on
        SSLProxyCACertificateFile /home/acme-dns/certs/ca.crt
        SSLProxyCheckPeerCN off

        <Location "/">
                deny from all
                allow from X.X.X.X
                allow from Y.Y.Y.Y
                allow from Z.Z.Z.Z

                AuthType Basic
                AuthName "ACME protection"
                AuthUserFile /usr/test/acme/.htpasswd
                require valid-user
        </Location>

        ProxyPass / balancer://cluster/
        ProxyPassReverse / balancer://cluster/

        ErrorLog /var/log/apache2/acmetest.log
        LogLevel warn
        CustomLog /var/log/apache2/acmetest.log combined
        ServerSignature Off
</VirtualHost>

I hope that helps and solves your problem. Feel free to contact me, if you need further details.
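With a balancer like this in front, a client has to send two sets of credentials on every call: HTTP Basic credentials for Apache, plus the `X-Api-User` and `X-Api-Key` headers that the acme-dns `/update` endpoint expects. Below is a minimal Python sketch of building such a request. The function name `build_update_request` and its parameters are hypothetical, and this is not the modified client mentioned elsewhere in this thread.

```python
import base64
import json
import urllib.request


def build_update_request(base_url, proxy_user, proxy_password,
                         api_user, api_key, subdomain, txt):
    """Build (but do not send) a POST to /update that passes the proxy's
    Basic auth and carries the acme-dns account credentials."""
    credentials = f"{proxy_user}:{proxy_password}".encode()
    headers = {
        # Checked by the reverse proxy (AuthType Basic).
        "Authorization": "Basic " + base64.b64encode(credentials).decode(),
        # Checked by acme-dns itself.
        "X-Api-User": api_user,
        "X-Api-Key": api_key,
        "Content-Type": "application/json",
    }
    body = json.dumps({"subdomain": subdomain, "txt": txt}).encode()
    return urllib.request.Request(
        base_url.rstrip("/") + "/update",
        data=body, headers=headers, method="POST",
    )
```

Sending it is then a matter of `urllib.request.urlopen(request)`. The `api_user`, `api_key`, and `subdomain` values come from the JSON that `/register` returned for the account, and `txt` is the 43-character validation token supplied by the ACME client.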

JonathanATyler commented 2 years ago

@ZPascal nice setup, glad you were able to sort it out, and thanks for sharing it :) When you say "Shared glusterfs storage folder between the ACME DNS instances to share the certs", is that just for the self-signed cert, or for acme-dns data? If it's for acme-dns data (so they are all in sync when requesting certs), what path are you "glustering"?

Cheers!

ZPascal commented 2 years ago

Hi @JonathanATyler Thx :)

What do you mean by acme-dns data, the configuration file? I shared the complete /home/acme-dns folder as a glusterfs volume, and the configuration of the acme instances is outside that folder.

JonathanATyler commented 2 years ago

@ZPascal In a previous message it was said "The acme-dns service seems to load all txt records in the database when it is first started, but does not add new ones, which were added by another instance while the first instance is still running". If the data is being stored on local disk first I wondered if your setup accounted for that by using glusterfs. Though it's more likely being stored in memory first and flushed to DB, but not read back from it unless restarted. Have you found that to still be the case? Or were you able to get all of them to resolve the same data across the cluster?

p3l1 commented 2 years ago

@ZPascal What kind of ACME client are you using for Basic Authentication? Does acme.sh support it?

ZPascal commented 2 years ago

@p3l1 I've used the Python implementation and modified it. I've opened a gist to share my modifications. I think you mean the acme.sh script? With some modifications, it should be possible to add the functionality to that script as well.

@JonathanATyler I will check and test that, and I'll post an answer in a few days.

p3l1 commented 2 years ago

@ZPascal This is great! Thanks for sharing your implementation with us 😄

p3l1 commented 2 years ago

I am planning to add Basic Authentication to the official certbot-dns-acmedns plugin, so it can be used directly inside nginx-proxy-manager, which already has the current version of the acme-dns plugin implemented.

https://github.com/pan-net-security/certbot-dns-acmedns/issues/2

ZPascal commented 2 years ago

@JonathanATyler Sorry for the late reply. I could not see any problems with resolving the entries. In my test case, I created 3 TXT entries in parallel and these were written to the database and the ACME client could continue without issues. If you have further questions, feel free to contact me!
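A parallel test like this can be scripted with a small harness that fires several updates at once and collects each outcome. This is a minimal sketch, not the procedure actually used above: `run_parallel_updates` is a hypothetical name, and `send_update` stands for whatever callable performs a single update against the cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel_updates(send_update, tokens):
    """Call send_update(token) for all tokens concurrently.

    Returns {token: result}, where a failed call maps to the exception
    it raised instead of a result.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(tokens))) as pool:
        # Submit everything first so the calls overlap in time.
        futures = {token: pool.submit(send_update, token) for token in tokens}
        for token, future in futures.items():
            try:
                results[token] = future.result()
            except Exception as exc:
                results[token] = exc
    return results
```

Afterwards, each instance can be queried for the resulting TXT records to confirm that all of them serve the same data.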

JonathanATyler commented 2 years ago

@ZPascal No worries, that was my experience as well. Thanks for confirming.