Closed R-Adrian closed 4 months ago
A large time skew at boot also seems to corrupt collectd's RRA files. The start sequence for ntp in OpenWrt is far too late, much later than collectd itself, so collectd always starts before ntp initializes the system time. In my test case the system date was set to the year 2033 when collectd started, which made all RRA files stop updating after a system reboot.
==>My current workaround is to disable collectd autostart and start it from rc.local after a 120-second sleep, which seems to work fine [far from perfect: if the WAN links take more than 120 seconds to come up, the RRA files are toast again].
Add to rc.local: (sleep 120; /etc/init.d/collectd start) &
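Instead of a fixed sleep, the workaround could poll the system clock and start collectd only once the date looks plausible. The following is a minimal sketch, not a tested OpenWrt patch: the helper names `clock_is_sane` and `wait_for_sane_clock`, the year bounds, and the 5-second polling interval are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OpenWrt or collectd).

# clock_is_sane MIN_YEAR MAX_YEAR [YEAR]
# Succeeds if YEAR (default: the current system year) lies within the bounds.
clock_is_sane() {
    min_year="$1"
    max_year="$2"
    year="${3:-$(date +%Y)}"
    [ "$year" -ge "$min_year" ] && [ "$year" -le "$max_year" ]
}

# wait_for_sane_clock MIN_YEAR MAX_YEAR TIMEOUT_SECS
# Polls every 5 seconds until the clock looks sane or the timeout expires.
wait_for_sane_clock() {
    waited=0
    while ! clock_is_sane "$1" "$2"; do
        if [ "$waited" -ge "$3" ]; then
            return 1    # gave up: clock still implausible
        fi
        sleep 5
        waited=$((waited + 5))
    done
    return 0
}
```

With these functions defined in rc.local, the sleep line might become `(wait_for_sane_clock 2019 2030 600 && /etc/init.d/collectd start) &`. The upper bound is there because the bogus boot date reported here lies in the future (2033); a hard-coded bound would need updating over time, so this is a stopgap rather than a proper fix.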
Ideally, the collectd startup script in OpenWrt should include ntp service detection: if ntp is configured and enabled, collectd should wait in the background for the ntp sync to complete before proceeding. This NTP issue typically would not cause much trouble if the default time were skewed backward, e.g. to the year 1990, but strangely the OpenWrt default time is set forward to the year 2033. Is this a bug?
Underlying issue reproduced (and fixed, PR to come soon) in /etc/init.d/dnsmasq. Roughtime and other proper fixes for getting a better rough system time earlier are not in ntpd's scope, either.
I suggest this bug can be closed.
PS: patch for the dnsmasq issue sent to FS#2574 (bugs.openwrt.org)
Related pull request here: https://github.com/openwrt/openwrt/pull/2801
Maintainer: @tripolar Environment:
Problem description: Most routers do not have a built-in hardware clock and try to obtain the time through NTP at boot time. If DNSSEC is also configured and NSEC validation is enforced, this becomes impossible: the router cannot obtain a usable time reference because it cannot do proper secure DNS resolution for the time servers' hostnames.
relevant bits from /etc/config/dhcp that turn NTPD into a dead duck at boot and which in turn causes the entire DNSSEC resolution to fail because of time differences are:
Tentative solution: Would it be possible to adjust the startup script of NTPD so that it first tries to obtain a rough time reference from somewhere, without relying on the time server hostnames configured in /etc/config/system? /etc/init.d/sysfixtime is useless when the router doesn't have a built-in hardware clock.
Maybe query, a couple of times at boot, one time server that has a static IP address, to obtain a usable time reference so that DNSSEC validation can be bootstrapped later on?
Probably possible to use here:
Google time servers: https://time.google.com
Cloudflare time servers: https://time.cloudflare.com
These servers are members of the NTP pool project and they have fixed IP addresses published in DNS for worldwide use:
Google:
216.239.35.0
216.239.35.4
216.239.35.8
216.239.35.12
2001:4860:4806:0:0:0:0:0
2001:4860:4806:4:0:0:0:0
2001:4860:4806:8:0:0:0:0
2001:4860:4806:c:0:0:0:0
Cloudflare:
162.159.200.1
162.159.200.123
2606:4700:f1:0:0:0:0:1
2606:4700:f1:0:0:0:0:123
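To make the static-IP idea concrete, here is a rough sketch of a one-shot time bootstrap that tries a few of the addresses listed above before DNS is needed. It assumes BusyBox ntpd with its `-n` (foreground), `-q` (quit once the clock is set) and `-p` (peer) options; the function name `bootstrap_time`, the variables `BOOTSTRAP_PEERS` and `NTPD_BIN`, and the choice of peers are hypothetical, and the sketch does not guard against ntpd blocking on an unreachable peer.

```shell
#!/bin/sh
# Hypothetical one-shot bootstrap; names and peer selection are illustrative.

# A subset of the static IPv4 addresses quoted in this report.
BOOTSTRAP_PEERS="216.239.35.0 216.239.35.4 162.159.200.1 162.159.200.123"

# Overridable so the function can be exercised without a real ntpd.
NTPD_BIN="${NTPD_BIN:-/usr/sbin/ntpd}"

# Tries each peer in turn; prints the first peer that set the clock.
bootstrap_time() {
    for peer in $BOOTSTRAP_PEERS; do
        # -n: do not daemonize, -q: quit after the clock is set, -p: peer
        if "$NTPD_BIN" -n -q -p "$peer"; then
            echo "$peer"
            return 0
        fi
    done
    return 1    # no peer could be used
}
```

An init script might call `bootstrap_time` once before dnsmasq and the regular ntpd start, so that DNSSEC validation has an approximately correct clock to work with. Hard-coding third-party addresses carries its own risk if those addresses ever change.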
Or maybe implement somehow the secure RoughTime protocol for obtaining a reliable, rough time at boot? https://roughtime.googlesource.com/roughtime https://blog.cloudflare.com/roughtime/
Note: I also opened a related bug for Busybox NTPD (base system package) since that is also affected by a similar issue. https://bugs.openwrt.org/index.php?do=details&task_id=2574