getdnsapi / stubby

Stubby is the name given to a mode of using getdns which enables it to act as a local DNS Privacy stub resolver (using DNS-over-TLS).
https://dnsprivacy.org/dns_privacy_daemon_-_stubby/
BSD 3-Clause "New" or "Revised" License

DNSSEC not working when stubby is run as a systemd service. Works fine when stubby is run manually #106

Closed: eccgecko closed this issue 6 years ago

eccgecko commented 6 years ago

I have a strange issue that when I run the stubby daemon manually, DNSSEC seems to be working ok. For example the command dig @127.0.2.2 -p 5353 www.dnssec-failed.org returns the following:

; <<>> DiG 9.10.3-P4-Raspbian <<>> @127.0.2.2 -p 5353 +dnssec www.dnssec-failed.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 24774
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;www.dnssec-failed.org. IN A
;; Query time: 129 msec
;; SERVER: 127.0.2.2#5353(127.0.2.2)
;; WHEN: Sat Apr 28 12:22:10 CEST 2018
;; MSG SIZE rcvd: 39

So dnssec-failed.org doesn't resolve. However, once I quit the manual daemon and start the systemd stubby.service I have, which starts up OK, I now get a reply for www.dnssec-failed.org:

; <<>> DiG 9.10.3-P4-Raspbian <<>> @127.0.2.2 -p 5353 www.dnssec-failed.org ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16532 ;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags: do; udp: 1536 ; OPT=12: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 (".............................................................................................................................................................................................................") ;; QUESTION SECTION: ;www.dnssec-failed.org. IN A ;; ANSWER SECTION: www.dnssec-failed.org. 2325 IN A 68.87.109.242 www.dnssec-failed.org. 2325 IN A 69.252.193.191 www.dnssec-failed.org. 2325 IN RRSIG A 5 3 7200 20180430172414 20180423141914 44973 dnssec-failed.org. w7tdNJ/YrlNO30y2GuPSJ31388GnzrPrHgJw4vQijlsL5LgkTTg5hzJw Ox5Ra2xSjlLdR7JeA4ZXvKF9rzws+8ys+EFJyps0+KejonIELKuLIqEw b9QS4ITc3mii4hFqVOwMtxj7txv6lKngknqbxiFr2nCpyJX0SOo6UXye YsI= ;; Query time: 167 msec ;; SERVER: 127.0.2.2#5353(127.0.2.2) ;; WHEN: Sat Apr 28 12:29:53 CEST 2018 ;; MSG SIZE rcvd: 531

This is strange, as when I run the daemon manually I am using the exact same options as the stubby.service file uses, so I can't work out why it would behave like this.

I have zero-configuration DNSSEC enabled in the stubby.yml config file
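The two dig results above differ mainly in the status field of the header line. For the deliberately mis-signed www.dnssec-failed.org, SERVFAIL suggests the answer was rejected during validation, while NOERROR suggests it was passed through unvalidated. A minimal shell sketch of that reading follows. The function name dnssec_verdict is made up for illustration, it only inspects the status field of dig output given on standard input, and SERVFAIL can of course have other causes (for example an unreachable upstream).

```shell
# Hypothetical helper (not part of stubby or dig): reads dig output for
# www.dnssec-failed.org on stdin and reports what the status field suggests.
dnssec_verdict() {
    # Extract the word after "status:" from the ->>HEADER<<- line.
    rcode=$(sed -n 's/.*->>HEADER<<-.*status: \([A-Z]*\),.*/\1/p' | head -n 1)
    case "$rcode" in
        SERVFAIL) echo "rejected" ;;   # consistent with DNSSEC validation working
        NOERROR)  echo "accepted" ;;   # the bogus answer was passed through
        *)        echo "unknown" ;;
    esac
}

# Usage (address and port as used in this thread):
#   dig @127.0.2.2 -p 5353 www.dnssec-failed.org | dnssec_verdict
```

Running the same pipeline once against the manually started daemon and once against the systemd service gives a quick yes/no comparison without reading the full output.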

hanvinke commented 6 years ago

What does stubby -i say?

eccgecko commented 6 years ago

stubby -i output is as follows (Apologies, I tried formatting the following in the code, but it was not working very well as none of the line breaks worked and it looked very hard to read):

[20:54:35.422324] STUBBY: Read config from file /etc/stubby.yml
{
  "all_context": {
    "add_warning_for_bad_dns": GETDNS_EXTENSION_FALSE,
    "appdata_dir": <bindata of "/root/.getdns/">,
    "append_name": GETDNS_APPEND_NAME_TO_SINGLE_LABEL_FIRST,
    "dns_transport_list": [ GETDNS_TRANSPORT_TLS ],
    "dnssec_allowed_skew": 0,
    "dnssec_return_all_statuses": GETDNS_EXTENSION_FALSE,
    "dnssec_return_full_validation_chain": GETDNS_EXTENSION_FALSE,
    "dnssec_return_only_secure": GETDNS_EXTENSION_FALSE,
    "dnssec_return_status": GETDNS_EXTENSION_TRUE,
    "dnssec_return_validation_chain": GETDNS_EXTENSION_FALSE,
    "edns_client_subnet_private": 1,
    "edns_cookies": GETDNS_EXTENSION_FALSE,
    "edns_do_bit": 0,
    "edns_extended_rcode": 0,
    "edns_version": 0,
    "follow_redirects": GETDNS_REDIRECTS_FOLLOW,
    "hosts": <bindata of "/etc/hosts">,
    "idle_timeout": 10000,
    "limit_outstanding_queries": 0,
    "max_backoff_value": 1000,
    "namespaces": [ GETDNS_NAMESPACE_LOCALNAMES, GETDNS_NAMESPACE_DNS ],
    "resolution_type": GETDNS_RESOLUTION_STUB,
    "resolvconf": <bindata of "/etc/resolv.conf">,
    "return_both_v4_and_v6": GETDNS_EXTENSION_FALSE,
    "return_call_reporting": GETDNS_EXTENSION_FALSE,
    "round_robin_upstreams": 1,
    "specify_class": 1,
    "suffix": [],
    "timeout": 5000,
    "tls_authentication": GETDNS_AUTHENTICATION_REQUIRED,
    "tls_backoff_time": 3600,
    "tls_cipher_list": <bindata of "TLS13-AES-256-GCM-SHA384:TLS13-A"...>,
    "tls_connection_retries": 2,
    "tls_query_padding_blocksize": 256,
    "trust_anchors_url": <bindata of "http://data.iana.org/root-anchor"...>,
    "trust_anchors_verify_CA": <bindata of 0x2d2d2d2d2d424547494e204345525449...>,
    "trust_anchors_verify_email": <bindata of "dnssec@iana.org">,
    "upstream_recursive_servers": [
      {
        "address_data": <bindata for 1.1.1.1>,
        "address_type": <bindata of "IPv4">,
        "tls_auth_name": <bindata of "cloudflare-dns.com">,
        "tls_pubkey_pinset": [
          { "digest": <bindata of "sha256">, "value": <bindata of yioEpqeR4WtDwE9YxNVnCEkTxIjx6EEIwFSQW+lJsbc=> }
        ]
      },
      {
        "address_data": <bindata for 1.0.0.1>,
        "address_type": <bindata of "IPv4">,
        "tls_auth_name": <bindata of "cloudflare-dns.com">,
        "tls_pubkey_pinset": [
          { "digest": <bindata of "sha256">, "value": <bindata of yioEpqeR4WtDwE9YxNVnCEkTxIjx6EEIwFSQW+lJsbc=> }
        ]
      }
    ]
  },
  "api_version_number": 132058112,
  "api_version_string": <bindata of "December 2015">,
  "compilation_comment": <bindata of "getdns 1.4.1 configured on 2018-"...>,
  "default_hosts_location": <bindata of "/etc/hosts">,
  "default_resolvconf_location": <bindata of "/etc/resolv.conf">,
  "default_trust_anchor_location": <bindata of "/opt/stubby/etc/unbound/getdns-r"...>,
  "implementation_string": <bindata of "https://getdnsapi.net">,
  "listen_addresses": [
    { "address_data": <bindata for 127.0.2.2>, "address_type": <bindata of "IPv4">, "port": 5353 }
  ],
  "openssl_build_version_number": 269484143,
  "openssl_built_on": <bindata of "built on: reproducible build, da"...>,
  "openssl_cflags": <bindata of "compiler: gcc -DDSO_DLFCN -DHAVE"...>,
  "openssl_dir": <bindata of "OPENSSLDIR: "/usr/lib/ssl"">,
  "openssl_engines_dir": <bindata of "ENGINESDIR: "/usr/lib/arm-linux-"...>,
  "openssl_platform": <bindata of "platform: debian-armhf">,
  "openssl_version_number": 269484143,
  "openssl_version_string": <bindata of "OpenSSL 1.1.0f 25 May 2017">,
  "resolution_type": GETDNS_RESOLUTION_STUB,
  "version_number": 17039616,
  "version_string": <bindata of "1.4.1">
}
Result: Config file syntax is valid.

I’m guessing it’s something to do with the DNSSEC settings towards the top? My question is why it would make a difference whether the daemon is started manually or via systemd, when both use the same config file?

wtoorop commented 6 years ago

@eccgecko Zero configuration DNSSEC needs a writeable appdata_dir directory. When none is configured, it defaults to the home directory of the UID running the stubby process. On my Arch Linux system, I noticed that for the stubby user this is the / directory, which that user cannot write to:

[root@bunker ~]# echo ~stubby
/

I managed to fix it by including a writeable appdata_dir in /etc/stubby/stubby.yml:

[root@bunker ~]# grep appdata_dir /etc/stubby/stubby.yml
appdata_dir: "/run/stubby"

/run/stubby was already writeable for user stubby on my system:

[root@bunker ~]# ls -ld /run/stubby/
drwxrwx--- 2 root stubby 100 Apr 30 17:23 /run/stubby/

After doing a query, I noticed the AD bit in the result, and also that Zero configuration DNSSEC succeeded since it downloaded the root trust-anchor and root DNSKEY rrset to track in the appdata_dir:

[root@bunker ~]# ls -l /run/stubby/
total 12
-rw------- 1 stubby stubby 4095 Apr 30 17:23 root-anchors.p7s
-rw------- 1 stubby stubby  651 Apr 30 17:23 root-anchors.xml
-rw------- 1 stubby stubby 1659 Apr 30 17:23 root.key

@ArchangeGabriel I think it would be good to have that appdata_dir setting in the stubby.yml file by default...
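The checks walked through above (is the directory writeable for the user running stubby, and did the three trust anchor files appear) can be sketched as a small shell function. This is only an illustration: check_appdata_dir is a made-up name, the file names are the ones listed in this comment, and the writeability test applies to whichever user runs the function, so it would need to be run as the service user (for example via sudo -u stubby) to say anything about the service.

```shell
# Hypothetical helper: checks whether the current user could use DIR as
# appdata_dir and reports which of the files named in this thread exist.
check_appdata_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writeable: $dir"
        return 1
    fi
    echo "writeable: $dir"
    # Files that Zero configuration DNSSEC stored in the listing above.
    for f in root-anchors.p7s root-anchors.xml root.key; do
        if [ -f "$dir/$f" ]; then
            echo "found: $f"
        else
            echo "absent: $f"
        fi
    done
}

# Usage: check_appdata_dir /run/stubby
```

If the directory is reported as writeable but the files stay absent after a query, the problem is probably elsewhere, for example in the path that stubby -i actually reports for appdata_dir.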

ArchangeGabriel commented 6 years ago

We are supposed to have DNSSEC working OOTB already on Arch (I build getdns with --with-trust-anchor=/etc/trusted-key.key, and the file is supposed to exist since it’s provided by a getdns dependency on Arch)… I also currently ship the “upstream” stubby.yml as is; if that could stay the case, that would be nice. So I would suggest answering https://github.com/getdnsapi/stubby/issues/62 and then adding appdata_dir: "/run/stubby" to the default config.

But @eccgecko is not on Arch anyway.

wtoorop commented 6 years ago

@ArchangeGabriel acknowledged. Alternatively you could give the stubby user a writeable home directory...

wtoorop commented 6 years ago

@ArchangeGabriel Oh yes... as a side note, having Zero configuration DNSSEC working would be more robust on systems that haven't been updated when the KSK rolls over.

hanvinke commented 6 years ago

When adding appdata_dir: "/run/stubby" to /etc/stubby/stubby.yml and doing a sudo systemctl daemon-reload and a restart of the service, I still get this output:

echo ~stubby
/home/han/.getdns

although stubby -i shows "appdata_dir": <bindata of "/run/stubby/">

Any clue?

ArchangeGabriel commented 6 years ago

That’s normal, you did not change the stubby user home folder (that would require editing /etc/passwd).

hanvinke commented 6 years ago

Thanks, learning every day here.. 🙂

eccgecko commented 6 years ago

@wtoorop Thanks. That makes sense. @ArchangeGabriel is correct: I am not running Arch but the Raspbian flavor of Debian. Like you, though, I have the systemd service set to execute as user stubby, whereas when I run the daemon manually I use sudo to start it, so I suppose it is then able to download the necessary trust anchor files. Having said that, I am trying to add appdata_dir: "/run/stubby" to my stubby.yml config file, but I am obviously doing something wrong: when I try to start stubby after adding this line, I am told there is a generic error:

"Generic error" Could not parse config file "/etc/stubby.yml": Generic error

Sorry if I am being dense here - what exactly is the correct method for inserting this line?

hanvinke commented 6 years ago

Just adding something like this at the top [at line 24-25 for example] should work:

# Include a writeable appdata_dir for Zero configuration DNSSEC.
appdata_dir: "/run/stubby/"

Watch out for spacing.

hanvinke commented 6 years ago

@wtoorop @ArchangeGabriel Thank you both for finding the solution why Zero configuration DNSSEC didn't work for me before, I never saw any files appear.

After adding appdata_dir: "/var/run/stubby" it works fine now. I use a slightly different stubby.service configuration:

[Unit]
Description=stubby DNS resolver

[Service]
ExecStart=/usr/bin/stubby
DynamicUser=yes
RuntimeDirectory=stubby
AmbientCapabilities=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target

Since systemd 235 this DynamicUser configuration is possible. And it works very well. It is kind of magic to see the folder appear out of nothing creating root.key, root-anchors.p7s and root-anchors.xml.

wtoorop commented 6 years ago

@hanvinke Glad to hear and glad to be of help :) @eccgecko We currently also have a bug with string configuration options. I will do a release candidate tomorrow, so you can try that one. You could provide your stubby.yml for us to check, just to be sure it is not something in the syntax..

eccgecko commented 6 years ago

@wtoorop I managed to get it working in the end by adding it around line 24-25 like you said (I was adding it on at the end before).

Unfortunately, it hasn't fixed my issues with DNSSEC zero-config. In fact, I don't know how exactly, but it even seems to have made things slightly worse: dig @127.0.2.2 -p 5353 www.dnssec-failed.org now gets a reply even when I run the daemon manually with sudo, which it didn't before.

I added both appdata_dir: "/var/run/stubby" and appdata_dir: "/run/stubby" to my stubby.yml config file (not at the same time; I tried the second when the first didn't work). Neither does the trick. Looking in the run/stubby folder, I don't see any trust anchor files being downloaded either, so it's definitely not doing what @hanvinke's config seems to be achieving :(

wtoorop commented 6 years ago

@eccgecko Oh that's a pity. Could you do a sudo -u stubby stubby -i and copy paste the output maybe?

eccgecko commented 6 years ago

Sure. Again, apologies that I can't seem to get the formatting right. sudo -u stubby /opt/stubby/bin/stubby -C /etc/stubby.yml -i output is as follows:

[12:51:16.164408] STUBBY: Read config from file /etc/stubby.yml
{
  "all_context": {
    "add_warning_for_bad_dns": GETDNS_EXTENSION_FALSE,
    "appdata_dir": <bindata of "var/run/stubby">,
    "append_name": GETDNS_APPEND_NAME_TO_SINGLE_LABEL_FIRST,
    "dns_transport_list": [ GETDNS_TRANSPORT_TLS ],
    "dnssec_allowed_skew": 0,
    "dnssec_return_all_statuses": GETDNS_EXTENSION_FALSE,
    "dnssec_return_full_validation_chain": GETDNS_EXTENSION_FALSE,
    "dnssec_return_only_secure": GETDNS_EXTENSION_FALSE,
    "dnssec_return_status": GETDNS_EXTENSION_TRUE,
    "dnssec_return_validation_chain": GETDNS_EXTENSION_FALSE,
    "edns_client_subnet_private": 1,
    "edns_cookies": GETDNS_EXTENSION_FALSE,
    "edns_do_bit": 0,
    "edns_extended_rcode": 0,
    "edns_version": 0,
    "follow_redirects": GETDNS_REDIRECTS_FOLLOW,
    "hosts": <bindata of "/etc/hosts">,
    "idle_timeout": 10000,
    "limit_outstanding_queries": 0,
    "max_backoff_value": 1000,
    "namespaces": [ GETDNS_NAMESPACE_LOCALNAMES, GETDNS_NAMESPACE_DNS ],
    "resolution_type": GETDNS_RESOLUTION_STUB,
    "resolvconf": <bindata of "/etc/resolv.conf">,
    "return_both_v4_and_v6": GETDNS_EXTENSION_FALSE,
    "return_call_reporting": GETDNS_EXTENSION_FALSE,
    "round_robin_upstreams": 1,
    "specify_class": 1,
    "suffix": [],
    "timeout": 5000,
    "tls_authentication": GETDNS_AUTHENTICATION_REQUIRED,
    "tls_backoff_time": 3600,
    "tls_cipher_list": <bindata of "TLS13-AES-256-GCM-SHA384:TLS13-A"...>,
    "tls_connection_retries": 2,
    "tls_query_padding_blocksize": 256,
    "trust_anchors_url": <bindata of "http://data.iana.org/root-anchor"...>,
    "trust_anchors_verify_CA": <bindata of 0x2d2d2d2d2d424547494e204345525449...>,
    "trust_anchors_verify_email": <bindata of "dnssec@iana.org">,
    "upstream_recursive_servers": [
      {
        "address_data": <bindata for 1.1.1.1>,
        "address_type": <bindata of "IPv4">,
        "tls_auth_name": <bindata of "cloudflare-dns.com">,
        "tls_pubkey_pinset": [
          { "digest": <bindata of "sha256">, "value": <bindata of yioEpqeR4WtDwE9YxNVnCEkTxIjx6EEIwFSQW+lJsbc=> }
        ]
      },
      {
        "address_data": <bindata for 1.0.0.1>,
        "address_type": <bindata of "IPv4">,
        "tls_auth_name": <bindata of "cloudflare-dns.com">,
        "tls_pubkey_pinset": [
          { "digest": <bindata of "sha256">, "value": <bindata of yioEpqeR4WtDwE9YxNVnCEkTxIjx6EEIwFSQW+lJsbc=> }
        ]
      }
    ]
  },
  "api_version_number": 132058112,
  "api_version_string": <bindata of "December 2015">,
  "compilation_comment": <bindata of "getdns 1.4.1 configured on 2018-"...>,
  "default_hosts_location": <bindata of "/etc/hosts">,
  "default_resolvconf_location": <bindata of "/etc/resolv.conf">,
  "default_trust_anchor_location": <bindata of "/opt/stubby/etc/unbound/getdns-r"...>,
  "implementation_string": <bindata of "https://getdnsapi.net">,
  "listen_addresses": [
    { "address_data": <bindata for 127.0.2.2>, "address_type": <bindata of "IPv4">, "port": 5353 }
  ],
  "openssl_build_version_number": 269484143,
  "openssl_built_on": <bindata of "built on: reproducible build, da"...>,
  "openssl_cflags": <bindata of "compiler: gcc -DDSO_DLFCN -DHAVE"...>,
  "openssl_dir": <bindata of "OPENSSLDIR: "/usr/lib/ssl"">,
  "openssl_engines_dir": <bindata of "ENGINESDIR: "/usr/lib/arm-linux-"...>,
  "openssl_platform": <bindata of "platform: debian-armhf">,
  "openssl_version_number": 269484143,
  "openssl_version_string": <bindata of "OpenSSL 1.1.0f 25 May 2017">,
  "resolution_type": GETDNS_RESOLUTION_STUB,
  "version_number": 17039616,
  "version_string": <bindata of "1.4.1">
}
Result: Config file syntax is valid.

ArchangeGabriel commented 6 years ago

You’re missing a leading / in appdata_dir.

wtoorop commented 6 years ago

Yes that's probably it... and also make sure /var/run/stubby (with leading slash) is writable (and readable) for user stubby.
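The missing leading slash spotted above is easy to overlook in the long stubby -i dump. As a rough sketch, the configured value can be pulled out of the YAML file and checked for being an absolute path. check_appdata_path is a hypothetical name, and the sed expression assumes the key sits at the start of a line with an optionally double-quoted value, as in the examples in this thread; it is not a YAML parser.

```shell
# Hypothetical helper: prints the top-level appdata_dir value from a
# stubby.yml-style file and flags a value that is not an absolute path.
check_appdata_path() {
    config=$1
    # Match the key only at column 0, then strip optional double quotes.
    value=$(sed -n 's/^appdata_dir:[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$config" | head -n 1)
    case "$value" in
        "")  echo "appdata_dir not set"; return 1 ;;
        /*)  echo "ok: $value" ;;
        *)   echo "relative path: $value"; return 1 ;;
    esac
}

# Usage: check_appdata_path /etc/stubby.yml
```

An indented appdata_dir line is reported as not set by this sketch, which also catches the spacing problem mentioned earlier in the thread.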

eccgecko commented 6 years ago

Thanks, that's pretty much solved it. I think we're very close to completely solving it. Yes, the missing leading / was part of the problem, although that was a mistake I made when I changed it from /run/stubby to var/run/stubby. There had been a leading / when I just used /run/stubby. I have changed it to /var/run/stubby now.

However, the main issue is with permissions on the folder. I believed the folder permissions were already correct, as they had been when I had checked before. However, it seems that they aren't persisting through a reboot, and that's the problem I've been facing, as I hadn't checked again since the first time I checked. Changing the permissions of /var/run/stubby so that stubby group has read and write permissions solves the issue, and www.dnssec-failed.org no longer replies, and there are indeed trust anchor files created within the /var/run/stubby folder 👍 :)

However, when I reboot, the permissions revert to only being read, write, execute for root, and just executable for the stubby group i.e. stubby user.

How can I make permissions for /var/run/stubby persistent?

ArchangeGabriel commented 6 years ago

@eccgecko What are the permissions before you change them? Do you know how this folder is created on your system?

wtoorop commented 6 years ago

Acknowledged. This is due to the line d /run/stubby 0750 root stubby - - in /usr/lib/tmpfiles.d/stubby.conf. Change that line to d /run/stubby 0770 root stubby - - and it comes back with correct permissions.
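In a tmpfiles.d line the fields are type, path, mode, user, group, age and argument, so d /run/stubby 0750 root stubby - - creates a directory owned by root with group stubby. With mode 0750 the group only gets read and execute, which is why the stubby user could not write there after a reboot. The sketch below tests the group write bit of an octal mode; group_can_write is a made-up helper name, not something systemd provides.

```shell
# Hypothetical helper: succeeds if an octal mode such as 0750 or 0770
# grants write permission to the group.
group_can_write() {
    # Force exactly one leading 0 so the shell reads the value as octal,
    # then test the group write bit (octal 020).
    [ $(( 0${1#0} & 020 )) -ne 0 ]
}

# Usage:
#   group_can_write 0750 || echo "group stubby cannot write"
#   group_can_write 0770 && echo "group stubby can write"
```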

eccgecko commented 6 years ago

@wtoorop That's it! 👍 great! DNSSEC now working and persisting through reboots. Thank you and @ArchangeGabriel for all your help with this :)

hanvinke commented 6 years ago

My stubby.service example needs some attention. I was not completely sure about the use of RuntimeDirectory and StateDirectory. Although it works, only one of them is needed as it turns out. Sorry I had to scratch the information together.

With only RuntimeDirectory you will have a volatile directory /var/run/stubby. When the service quits, systemd removes everything, including the directory itself. Strong advice: if a leftover /var/run/stubby directory from another install is present, remove it first, since it might have the wrong permissions set. Systemd will then take full care of folder and file creation and their permissions. You need to set appdata_dir: /var/run/stubby in stubby.yml.

With only StateDirectory you will have a persistent directory, so after a reboot it keeps the Zero configuration DNSSEC files intact. You need to set appdata_dir: /var/lib/stubby in stubby.yml.
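The relationship between the systemd directive and the appdata_dir value that has to go into stubby.yml can be summarised as a lookup. The sketch below uses the base paths that systemd.exec(5) documents for system services; managed_dir is a hypothetical helper name. It prints /run for RuntimeDirectory, on the assumption that /var/run is a symlink to /run as on most current distributions, and it includes CacheDirectory, which comes up later in this thread.

```shell
# Hypothetical helper: prints the directory systemd manages for a given
# directive and name (system service paths from systemd.exec(5)).
managed_dir() {
    directive=$1
    name=$2
    case "$directive" in
        RuntimeDirectory) echo "/run/$name" ;;        # removed when the service stops
        StateDirectory)   echo "/var/lib/$name" ;;    # persistent across reboots
        CacheDirectory)   echo "/var/cache/$name" ;;  # persistent, but disposable
        *) echo "unknown directive: $directive" >&2; return 1 ;;
    esac
}

# Usage: the printed path is what appdata_dir in stubby.yml should match.
#   managed_dir StateDirectory stubby
```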

More information on the benefits of a DynamicUser here: "http://0pointer.net/blog/dynamic-users-with-systemd.html"

BTW my stubby.yml used for testing is very basic:

stubby.txt

(This one is for use with StateDirectory=stubby )

[I edited my previous stubby.service above]

hanvinke commented 6 years ago

Before using DynamicUser, it is important to also remove any existing user stubby and group stubby. I forgot to do that yesterday, and this morning, with StateDirectory=stubby active, I got an error I had never seen before: [screenshot from 2018-05-04 07-37-09]

After removing user:stubby and group:stubby and rebooting, all was fine again: [image: same as root]

wtoorop commented 6 years ago

@hanvinke Thanks for pointing out the DynamicUser configuration of systemd! I like it a lot! I believe the error you had could have been prevented if a User=stubby had been left in the [Service] section of stubby.service. In fact, Lennart points out that you should do that when upgrading from a static UID setup, in the 6th note in the Notes section of his blog post.

I'll play with these settings a bit and will include it in the getdns 1.4.2-rc1 release candidate today (which will have a stubby 0.2.3-rc1 release candidate on board).

hanvinke commented 6 years ago

Testing right now the new release candidate 😃 ! With many servers enabled I unfortunately got the message:

$ stubby -i
[19:11:47.569013] STUBBY: Read config from file /etc/stubby/stubby.yml
stubby: ./gldns/gbuffer.h:285: gldns_buffer_skip: Assertion `buffer->_position + count <= buffer->_limit || buffer->_vfixed' failed.
Aborted (core dumped)

So I used only the default enabled DNS recursive servers in stubby.yml, and that gave no errors. Buffer overflow of some kind?

wtoorop commented 6 years ago

Ouch!

That's really bad. I just tried with around 500000. It was really slow to parse, but it did. Would it be possible for you to provide a core dump from a stubby and libgetdns compiled with CFLAGS="-g"? That would be very helpful to debug the issue. You could also send me the stubby.yml, just to be sure. You can send it to me by e-mail encrypted with my PGP key?

eccgecko commented 6 years ago

Sorry to be the bearer of bad news, but unfortunately, after updating to latest getdns 1.4.2 with the latest commit e0e8576 for the stubby 0.2.3 submodule, this problem has resurfaced for me.

As far as I can tell, I am now using the new default options regarding systemd and the working_app_dir.

In my stubby.service file I have:

WorkingDirectory=/var/cache/stubby

And in my stubby.yml config file I have:

appdata_dir: "/var/cache/stubby"

I also have the following in my /usr/lib/tmpfiles.d/stubby.conf file:

# tmpfiles.d (5) for use with stubby.service
d /var/cache/stubby 0750 stubby stubby - -

The daemon seems to have successfully created the root-anchors.p7s, root-anchors.xml and root.key files in /var/cache/stubby, but that is most likely from when I ran the binary with sudo. The permissions on the /var/cache/stubby directory seem to be in order:

drwxr-x--- 2 stubby stubby

However, it's the exact same issue as before: DNSSEC fails when stubby is run as the systemd service, but succeeds when the binary is run with sudo. Is it to do with it now using /var/cache/stubby? Your advice before was to use /var/run/stubby. I will probably change it back to this to see if that works, but ideally I wanted to use the defaults as much as possible.

abelbeck commented 6 years ago

@eccgecko On your system, what does ls -ld /var/cache output?

ArchangeGabriel commented 6 years ago

Can you paste your full stubby.service? If you use DynamicUser, then you must not have a /usr/lib/tmpfiles.d/stubby.conf at all (so remove it), and you should delete the /var/cache/stubby folder before restarting the service after that.

hanvinke commented 6 years ago

Also delete the contents of /var/cache/private/stubby when retesting Zero configuration DNSSEC, since the folder /var/cache/stubby is just a symlink (owned by root) to /var/cache/private/stubby.

eccgecko commented 6 years ago

ls -ld /var/cache outputs the following:

drwxr-xr-x 11 root root 4096 May 17 17:14 /var/cache

My stubby.service file looks like this:

[Unit]
Description=stubby DNS resolver

[Service]
User=stubby
DynamicUser=yes
CacheDirectory=stubby
WorkingDirectory=/var/cache/stubby
ExecStart=/opt/stubby/bin/stubby -C /etc/stubby.yml
AmbientCapabilities=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target

This is the default stubby.service file. The only change I made was to add -C /etc/stubby.yml to the ExecStart line.

I did as you said @ArchangeGabriel and deleted both /usr/lib/tmpfiles.d/stubby.conf and /var/cache/stubby and even ran sudo systemctl disable stubby and deleted the /lib/systemd/system/stubby.service file, then started again by adding it back and re-enabling. Unfortunately it's no-go. In fact, the behaviour is slightly different to before. Now systemd fails at starting the service at all (instead of simply starting but dnssec not working, as before) and I believe it is because it cannot create the /var/cache/stubby directory. When I run the stubby binary as sudo, the daemon starts and /var/cache/stubby is created, and dnssec works.

One thing I did notice is that WorkingDirectory=/var/cache/stubby and appdata_dir: "/var/cache/stubby" are different to what @wtoorop recommended before (/var/run/stubby). Could that be related? I want to keep it as close to default config as possible so haven't changed this yet.

@hanvinke my system doesn't seem to have any /var/cache/private/ directory at all. Could that also be related?

ArchangeGabriel commented 6 years ago

@eccgecko What systemd version? Doesn’t the status give any output when Stubby fails to start?

hanvinke commented 6 years ago

@eccgecko Maybe the user name selected by systemd (stubby) already exists on your system? Systemd will not operate in dynamic user mode otherwise. Better to delete any existing stubby user or group first before restarting the service.

eccgecko commented 6 years ago

@ArchangeGabriel ah...I guess that's the issue. I'm on the default Raspbian systemd package, which at present is 232, unfortunately. I see from the blog @hanvinke referenced, DynamicUser was introduced in 235. I did try upgrading my systemd package by downloading the 238 package from the buster repo, but unfortunately this broke my system and I had to restore from a backup.

I guess it's just a case of removing DynamicUser from the stubby.service file?

ArchangeGabriel commented 6 years ago

Indeed. Actually DynamicUser was introduced in 232, but Stubby uses related features introduced in 235. So yes, in this case you have to use the tmpfiles.d snippet to create /var/cache/stubby with the right permissions. And you should indeed remove the DynamicUser from the service file.
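Going by the comment above, the choice between the DynamicUser unit and the static user plus tmpfiles.d setup depends on whether systemd is at least version 235. A minimal sketch of that decision follows. It assumes the first line of systemctl --version output has the form "systemd NNN", and both the function name and the two labels it prints are made up for illustration.

```shell
# Hypothetical helper: reads `systemctl --version` output on stdin and
# prints which unit style this thread suggests for that version.
unit_style_for_systemd() {
    # The first line looks like "systemd 232", possibly followed by more text.
    version=$(sed -n '1s/^systemd \([0-9][0-9]*\).*/\1/p')
    if [ -z "$version" ]; then
        echo "unknown"
        return 1
    fi
    if [ "$version" -ge 235 ]; then
        echo "dynamic-user"
    else
        echo "static-user-with-tmpfiles"
    fi
}

# Usage: systemctl --version | unit_style_for_systemd
```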

eccgecko commented 6 years ago

Thanks @ArchangeGabriel @hanvinke for your help. I removed the DynamicUser=yes line from stubby.service and used the tmpfiles.d snippet to create /var/cache/stubby and am now back up and running successfully using the stubby daemon as a systemd service with stubby as the user and with DNSSEC successfully working :) thanks again 👍

Not wanting to push my luck, so apologies if this is the wrong place, but just wanted to ask one additional question relating to stubby and DNSSEC. Is there a reason why, when running a DNSSEC algorithm test here https://rootcanary.org/test.html, ED25519 is not validated as an algorithm, but when using dnsmasq with DNSSEC, it is?

[image: screen shot 2018-05-18 at 18 11 53]

hanvinke commented 6 years ago

@eccgecko Sorry, I cannot help you with ED25519 support for stubby. I think it is not implemented yet.

For the enthusiasts, I have made a version of the stubby service file optimized for security, because, for instance, ReadWritePaths= was not set in the original file. Some info: http://0pointer.net/blog/avoiding-cve-2016-8655-with-systemd.html

My stubby.service: stubby.service.TXT

ArchangeGabriel commented 6 years ago

Well I’m not sure… ED25519 is supported for the TLS connection, but maybe not for DS signing.

ArchangeGabriel commented 6 years ago

@hanvinke If you can open a PR with comments on each added line, that would be very welcome I think.

hanvinke commented 6 years ago

@ArchangeGabriel Thank you for your interest!

I edited my previous file a little.

Removed: PrivateTmp=yes and ProtectSystem=strict
Reason: Both are already implied by DynamicUser=yes

I also changed the UMask setting to 077 and ProtectHome to yes, which are much more restrictive. Maybe someone can tell me whether Stubby also needs @aio [Asynchronous I/O (io_setup(2), io_submit(2), and related calls)] in the SystemCallFilter. More information about the systemd settings can be found at https://www.mankier.com/5/systemd.exec

hanvinke commented 6 years ago

Edited a third time. I decided to remove all whitelisted system call groups (the ones with an @ in front), because, for instance:

ptrace: already blocked by dropping CAP_SYS_PTRACE under CapabilityBoundingSet
aio: there is no need for stubby to get access to any IO port, and this is already blocked by dropping CAP_SYS_RAWIO under CapabilityBoundingSet

AmbientCapabilities=CAP_NET_BIND_SERVICE can only be omitted when you do not do online banking; otherwise, for instance, payment transactions with iDEAL will fail without it. For now I am just keeping SystemCallFilter=~madvise (the tilde after the equals sign indicates that this is a blacklist of syscalls).

wtoorop commented 6 years ago

Thank you all!

@ArchangeGabriel Are you saying systemd 232 will not start the service if it encounters a directive unknown to it (i.e. DynamicUser)? I assumed it just would, because it is allowed with systemd version 238...

I suppose we have to provide two stubby.service files (one for systemd before 235 and one for systemd 235 and higher), but perhaps there was something else going on...

@eccgecko I believe most of the groundwork for ED25519 has already been done. I'll see if I can enable the ED25519 and ED448 with newer OpenSSL and let you know (provide patch).

ArchangeGabriel commented 6 years ago

@wtoorop No, I’m saying that it handled the DynamicUser correctly (since it is a 232 feature), but not the custom directory part. Thus, the service started as DynamicUser, but could not write files, and zero-conf DNSSEC failed.

wtoorop commented 6 years ago

@ArchangeGabriel Ok, so the CacheDirectory directive wouldn't work, but I supposed (wrongly?) that that wouldn't matter, since that directory would have been created with /usr/lib/tmpfiles.d/stubby.conf anyway...

ArchangeGabriel commented 6 years ago

No, I think a DynamicUser still doesn’t have the right to write to a standard directory even if it owns it, because the user is in fact not dynamic and the directory is attributed to it. But I might be wrong, and this could also be a bug of some kind.

In any case, problematic systems in this regard would be the one with 232 ≤ systemd < 235.

eccgecko commented 6 years ago

@wtoorop It may be necessary to provide 2 different stubby.service files because, on my systemd 232, the service failed to start at all with the DynamicUser line included. Or are you, @ArchangeGabriel, saying that it was the inclusion of both DynamicUser and the custom directory that was causing the issue? Because I only removed DynamicUser and nothing else, and that got it going again.

Stretch, the current stable dist of Debian / Raspbian, only ships systemd 232, so it may be necessary to make some allowances for that user-base. I for one have now switched from Raspbian to Arch, mostly because of the out of date packages that Debian has, and a lot of the issues I've been experiencing with stubby and other projects I'm running on my pi have been fixed by updated packages. Since my migration to arch, stubby now runs fine with the default systemd service file :)

hanvinke commented 6 years ago

While testing logging with Stubby through 'stubby -v 7' I noticed that dns-tls.bitwiseshift.net currently has a problem, with stubby nicely reporting: STUBBY: 81.187.221.24 : Verify failed : TLS - Failure - (10) "certificate has expired". Also, gnutls-cli --print-cert -p 853 81.187.221.24 shows the same problem. Could a possible cause be that certbot recently stopped checking whether the certificate is about to expire? Where can I report that there is a problem with this server?

saradickinson commented 6 years ago

@hanvinke I've pinged the operator directly as they haven't made their contact details public. To double check for issues you can find monitoring of the servers here: https://dnsprivacy.org/jenkins/job/dnsprivacy-monitoring/

karavan commented 3 years ago

When DNSSEC is not working, also check the system date.
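The system date matters because every RRSIG carries an expiration and an inception timestamp (UTC, in YYYYMMDDHHMMSS form), and a validator with a badly wrong clock will treat valid signatures as not yet valid or expired. The sketch below compares a timestamp against such a window, using the two values from the RRSIG record shown in the first post (20180430172414 and 20180423141914). rrsig_window_check is a hypothetical helper, and a real validator additionally allows for some clock skew (compare dnssec_allowed_skew in the stubby -i output).

```shell
# Hypothetical helper: checks whether a UTC timestamp (YYYYMMDDHHMMSS)
# falls inside an RRSIG validity window.
rrsig_window_check() {
    inception=$1
    expiration=$2
    now=$3
    if [ "$now" -lt "$inception" ]; then
        echo "clock before inception"
    elif [ "$now" -gt "$expiration" ]; then
        echo "clock after expiration"
    else
        echo "within window"
    fi
}

# Usage with the current system clock:
#   rrsig_window_check 20180423141914 20180430172414 "$(date -u +%Y%m%d%H%M%S)"
```

Note that in the dig presentation format the expiration field is printed before the inception field, so the arguments above are swapped relative to the order in the record.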