Closed macifell closed 1 year ago
@pop-os/quality-assurance please review
> It appears that certain ISPs or company firewalls are incorrectly set up to block either the NTS Key Exchange or the larger NTS enhanced NTP packets.
I tried to replicate the issue of NTS being blocked by putting a machine behind a firewall blocking port 4460 outbound. I confirmed the firewall was working using nc (the connection to one of our time servers on port 4460 succeeded without the firewall and failed with the firewall). However, even with the firewall in place, if I turn off automatic date & time, change the time so it's off by a few minutes, then turn on automatic date & time, the time still gets synced. @macifell Do you have a method to recreate the sort of blockage you're referring to?
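For anyone repeating the nc check above from a script, a minimal sketch might look like the following. The function name and the default timeout are illustrative choices, not something from this PR, and the check only exercises the TCP key-exchange port (4460), not the NTP packets themselves.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, and DNS failures all end up here.
        return False


# Example call (hostname taken from the logs in this thread):
# tcp_port_reachable("virginia.time.system76.com", 4460)
```

A False result with the firewall enabled and True without it would match the nc behavior described above.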
> When NTS is not being blocked our servers will still be preferred and trusted over unauthenticated time servers.
Are you saying that our own servers don't support unauthenticated time? If that's the case, wouldn't people with NTS blocked still fail to sync time until they manually add an unauthenticated time server to their configuration? That is mainly what I was wanting to verify with my own testing.
@jacobgkau So what I saw in a particular case was that the actual NTS-enabled NTP packets were being dropped in transit - that is, they weren't even making it to our servers. I am assuming this was being done based on size alone, as they are larger than traditional insecure NTP packets. Blocking the key exchange won't always cause a failure because it only happens every now and then. Most of the time the client keeps getting new NTS cookies from the server on each request and can go a long time without another key exchange.
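To give a rough sense of the size difference mentioned above, here is a sketch that builds a plain 48-byte NTPv4 client request and then appends placeholder NTS-style extension fields. The field type codes are the ones assigned in RFC 8915, but the 100-byte cookie length is an assumption for illustration, the bodies are zero-filled, and the authenticator field that a real NTS request also carries is left out, so this is not a valid NTS packet.

```python
import struct

NTP_HEADER_LEN = 48  # fixed NTPv4 header size in bytes


def plain_ntp_request() -> bytes:
    """Build a minimal NTPv4 client request: LI=0, VN=4, Mode=3, rest zeroed."""
    first_byte = (0 << 6) | (4 << 3) | 3
    return bytes([first_byte]) + bytes(NTP_HEADER_LEN - 1)


def extension_field(field_type: int, body: bytes) -> bytes:
    """Encode one extension field: 16-bit type, 16-bit total length, body padded to 4 bytes."""
    padding = (-len(body)) % 4
    total_len = 4 + len(body) + padding
    return struct.pack("!HH", field_type, total_len) + body + bytes(padding)


def nts_like_request(cookie_len: int = 100) -> bytes:
    """Append placeholder NTS-style fields to show the size growth only."""
    unique_id = extension_field(0x0104, bytes(32))       # Unique Identifier
    cookie = extension_field(0x0204, bytes(cookie_len))  # NTS Cookie (assumed length)
    return plain_ntp_request() + unique_id + cookie
```

Even without the authenticator, the NTS-style request is several times the size of the plain one, which is consistent with the guess that a size-based filter could drop one and pass the other.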
Our servers do support unauthenticated time, but there's no way to configure chrony (or any other NTP client that I'm aware of) to use the same servers for both unauthenticated and authenticated time. However, chrony's default config includes a pool statement that will always pick up insecure servers - these changes just allow them to be used.
> However, chrony's default config includes a pool statement that will always pick up insecure servers
Ah, I see, /etc/chrony/chrony.conf still includes some Ubuntu pools.
Managed to simulate a failure by enabling the firewall, disabling Chrony, clearing out /var/lib/chrony/*, and rebooting. Witnessed that before this change, I got a line stating "NTS-KE session with <ip> (domain) timed out" for each configured System76 source, and the time did not synchronize. After the change, an Ubuntu pool was selected and the time synchronized. I did notice that the Ubuntu pool was selected before the NTS timeouts were logged:
Jan 18 17:52:51 pop-os systemd[1]: Started chrony, an NTP client/server.
Jan 18 17:53:00 pop-os chronyd[6121]: Selected source 74.6.168.73 (0.ubuntu.pool.ntp.org)
Jan 18 17:53:00 pop-os chronyd[6121]: System clock wrong by 184.805457 seconds
Jan 18 17:56:05 pop-os chronyd[6121]: System clock was stepped by 184.805457 seconds
Jan 18 17:56:05 pop-os chronyd[6121]: System clock TAI offset set to 37 seconds
Jan 18 17:56:06 pop-os chronyd[6121]: Source 185.125.190.56 replaced with 185.125.190.58 (ntp.ubuntu.com)
Jan 18 17:56:13 pop-os chronyd[6121]: NTS-KE session with 3.220.42.39:4460 (virginia.time.system76.com) timed out
Jan 18 17:56:13 pop-os chronyd[6121]: NTS-KE session with 15.237.97.214:4460 (paris.time.system76.com) timed out
Jan 18 17:56:13 pop-os chronyd[6121]: NTS-KE session with 52.10.183.132:4460 (oregon.time.system76.com) timed out
Jan 18 17:56:14 pop-os chronyd[6121]: NTS-KE session with 18.228.202.30:4460 (brazil.time.system76.com) timed out
Jan 18 17:56:16 pop-os chronyd[6121]: NTS-KE session with 3.134.129.152:4460 (ohio.time.system76.com) timed out
A few minutes later, I got two more Ubuntu pool selections and one more set of NTS-KE timeouts. The timeouts repeated again about 9 minutes later.
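For longer test runs, log output like the above could be summarized with a small script. This is only a sketch: the regular expressions are written against the two chronyd message shapes that appear in this thread ("Selected source ..." and "NTS-KE session with ... timed out"), and the function name is made up for illustration.

```python
import re
from collections import Counter

SELECTED_RE = re.compile(r"chronyd\[\d+\]: Selected source (\S+) \(([^)]+)\)")
TIMEOUT_RE = re.compile(r"chronyd\[\d+\]: NTS-KE session with (\S+) \(([^)]+)\) timed out")


def summarize_chrony_log(lines):
    """Return (selected hostnames in log order, Counter of NTS-KE timeouts per hostname)."""
    selected = []
    timeouts = Counter()
    for line in lines:
        match = SELECTED_RE.search(line)
        if match:
            selected.append(match.group(2))
            continue
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts[match.group(2)] += 1
    return selected, timeouts
```

Feeding it the journal lines for the chrony unit would show at a glance whether a pool source was selected and which NTS servers timed out.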
After disabling the firewall and then turning automatic date/time off and on again, I had System76 sources selected, although I did also have a notice that an IP address in the Ubuntu pool changed (so some communication with those may still happen even if the System76 servers are preferred/used?):
Jan 18 18:53:01 pop-os systemd[1]: Starting chrony, an NTP client/server...
Jan 18 18:53:01 pop-os chronyd[6618]: chronyd version 4.2 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +NTS +SECHASH +IPV6 -DEBUG)
Jan 18 18:53:01 pop-os chronyd[6618]: Frequency 14.367 +/- 4.291 ppm read from /var/lib/chrony/chrony.drift
Jan 18 18:53:01 pop-os chronyd[6618]: Using right/UTC timezone to obtain leap second data
Jan 18 18:53:01 pop-os chronyd[6618]: Loaded seccomp filter (level 1)
Jan 18 18:53:01 pop-os systemd[1]: Started chrony, an NTP client/server.
Jan 18 18:53:12 pop-os chronyd[6618]: Selected source 18.228.202.30 (brazil.time.system76.com)
Jan 18 18:53:12 pop-os chronyd[6618]: System clock wrong by -2551.562078 seconds
Jan 18 18:10:41 pop-os chronyd[6618]: System clock was stepped by -2551.562078 seconds
Jan 18 18:10:41 pop-os chronyd[6618]: System clock TAI offset set to 37 seconds
Jan 18 18:10:42 pop-os chronyd[6618]: Source 185.125.190.58 replaced with 185.125.190.57 (ntp.ubuntu.com)
Jan 18 18:10:42 pop-os chronyd[6618]: Selected source 3.134.129.152 (ohio.time.system76.com)
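The timestamps in this log jump backwards (18:53:12 to 18:10:41) because the clock itself was stepped. A quick arithmetic check, assuming the logged offset is simply added to the wall clock, is sketched below. The logs carry no year, so only the time of day is used, and the helper name is illustrative.

```python
from datetime import datetime, timedelta


def wall_clock_after_step(logged_time: str, offset_seconds: float) -> str:
    """Add a chronyd step offset to an HH:MM:SS time and return HH:MM:SS (fraction dropped)."""
    before = datetime.strptime(logged_time, "%H:%M:%S")
    after = before + timedelta(seconds=offset_seconds)
    return after.strftime("%H:%M:%S")


# Second log: clock was 2551.562078 s fast, so the step moves it back.
print(wall_clock_after_step("18:53:12", -2551.562078))
# First log: clock was 184.805457 s slow, so the step moves it forward.
print(wall_clock_after_step("17:53:00", 184.805457))
```

Both results land within about a second of the next logged timestamp in their respective logs (18:10:41 and 17:56:05), so the out-of-order times are expected rather than a sign of corrupted logs.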
In short, the new config works as intended.
FYI this caused the Pop!_Shop to fail updating system packages. Manually invoking apt upgrade prompted about conflicts in the changed files on Pop!_OS 22.04.
@pop-os/quality-assurance can we take a look at this again?
We probably need to remove that file as a Debian conffile.
I'm not seeing an issue when downgrading the package to the previous commit (manual build/install) and then upgrading again with the Pop!_Shop. Even if I downgrade, make a change to one of the files, and then upgrade again with the Pop!_Shop, the upgrade completes and the file is simply overwritten with the new version. I could be missing a step to recreate the issue.
I am a little confused why we added two zram files in /etc as conffiles in https://github.com/pop-os/default-settings/commit/7fcad352e1fc8733ec0c4c9c787fc737783aad96 since Debian docs suggest everything under /etc is automatically considered a conffile. Not sure if specifying those two manually would affect any other files like these Chrony ones.
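One way to check which files dpkg actually treats as conffiles for an installed package is to read its Conffiles field. The sketch below wraps dpkg-query for that. The helper names are illustrative, the package name is left to the caller, and the parser assumes the usual "path checksum" layout of that field with paths that contain no spaces.

```python
import subprocess


def parse_conffiles(field: str) -> dict:
    """Parse dpkg's Conffiles field text into a {path: checksum} mapping."""
    result = {}
    for line in field.splitlines():
        parts = line.split()
        # Each entry looks like "/etc/some/file <checksum> [obsolete]".
        if len(parts) >= 2 and parts[0].startswith("/"):
            result[parts[0]] = parts[1]
    return result


def conffiles_of(package: str) -> dict:
    """Ask dpkg-query for the Conffiles field of an installed package."""
    output = subprocess.run(
        ["dpkg-query", "-W", "-f=${Conffiles}\\n", package],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_conffiles(output)
```

If the Chrony files show up in that mapping, dpkg will prompt on upgrade whenever the installed copy differs from the previously packaged one, which would be consistent with the apt prompt reported above.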
IIRC, I did an OS upgrade from an earlier Pop!_OS version to 22.04. Perhaps that has something to do with it?
> IIRC, I did os upgrade from earlier popos version to 22.04. Perhaps that has something to do with it?
I don't see why that would have anything to do with it. 22.04 was released long before the switch to Chrony. I can try to test on an upgraded installation after pop-upgrade is fixed, but it's going to grab this update as part of the release upgrade now that it's been released, so the Pop!_Shop wouldn't have anything to do with it at that point (and pop-upgrade may handle the apt prompt automatically.)
The problem
There have been a few reports of an inability to connect to our time servers using NTS. It appears that certain ISPs or company firewalls are incorrectly set up to block either the NTS Key Exchange or the larger NTS-enhanced NTP packets. Although this blocking is incorrect, it prevents client machines on these networks from synchronizing their clocks - causing them to drift.
Both the prefer and mixed authselect modes in chrony disable unauthenticated sources in the presence of authenticated servers - even if they are not possible to connect to. This is for security reasons, as it would be easy to create a downgrade attack by blocking access to the secure servers, which is basically what the ISPs/firewalls are doing.
The resolution
While it is not ideal, these updates allow chrony to fall back to insecure time in the event that the network is blocking NTS. This significantly degrades security, but is necessary to prevent a regression of behavior. I am planning to add the commands for removing this fallback behavior to the hardening instructions at https://system76.com/time. They are as follows:
Note
When NTS is not being blocked our servers will still be preferred and trusted over unauthenticated time servers.
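The trade-off described under "The problem" and "The resolution" can be illustrated with a toy model. This is a conceptual sketch only: it is not chrony's selection algorithm, the class and function names are invented, and the allow_unauthenticated_fallback flag merely stands in for whatever configuration change this PR makes.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    authenticated: bool  # configured with NTS
    reachable: bool      # responses actually arrive


def usable_sources(sources, allow_unauthenticated_fallback: bool):
    """Return reachable candidate sources, authenticated ones first."""
    any_auth_configured = any(s.authenticated for s in sources)
    if any_auth_configured and not allow_unauthenticated_fallback:
        # Strict behavior: unauthenticated sources are excluded as soon as
        # any authenticated source is configured, reachable or not.
        candidates = [s for s in sources if s.authenticated]
    else:
        candidates = list(sources)
    reachable = [s for s in candidates if s.reachable]
    # Stable sort keeps authenticated sources ahead of unauthenticated ones.
    return sorted(reachable, key=lambda s: not s.authenticated)
```

With an unreachable NTS server and a reachable pool server, the strict setting yields no usable source (the clock drifts), while the fallback setting yields the pool server. When the NTS server is reachable, it sorts first in both cases, matching the note above.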