Closed danb35 closed 3 years ago
Due to responses like these https://forum.opnsense.org/index.php?topic=18253.msg84616#msg84616 I'm quite tempted to ignore the ticket completely.
Lot's of people try to help others in their spare time, which doesn't always mean that your question is on the top of everybody's list.
Using google a bit on "ntpd exiting on signal 15" shows some interesting topics on the subject, ntpd as a time server is quite sensitive about .... time... which is probably why it kills itself when time drifts too much. Often these things seem to happen when systems have an unstable rtc clock. We haven't seen similar issues ourselves on quality hardware.
Some resources about similar issues on other platforms:
Unfortunately I couldn't find documentation about this behaviour from ntpd itself, it could be unintended (or should have outputted more information), but it does look quite common.
Specifically about your hardware you can find the following (same area, don't know if it's the cause, but sounds similar):
I don't know if chrony handles rtc issues differently, it might conceal the underlaying issue(s), but there's also a chance that it treats those similar. It might be worth experimenting for people suffering from this issue, from a core perspective there's not much we can (and plan) to do in this case.
Due to responses like these https://forum.opnsense.org/index.php?topic=18253.msg84616#msg84616 I'm quite tempted to ignore the ticket completely.
Thats a very unprofessional comment to make. The merrit of a github issue has nothing at all to do with the attitude of the one submitting the issue.
Quoting only one sentence makes it look like this, but it's not true when reading the rest of it :)
The error is sometimes hard to track when it's not reproducable and it needs time and more manpower. So the interim workaround can be trying chrony and working on the issue in parallel
One lead for debug is why syslog-ng is not running/crashes. EDIT: Looks like two instances could be started (occasionally?) at boot: $ ps auxf | grep syslog root 33928 0.0 0.8 19304 7824 - I 20:18 0:00.00 /usr/local/sbin/syslog-ng -p /var/run/syslog-ng.pid root 39227 0.0 1.0 26444 9916 - Ss 20:18 0:00.16 /usr/local/sbin/syslog-ng -p /var/run/syslog-ng.pid root 39857 0.0 1.1 21080 11212 - S 20:18 0:00.10 /usr/local/bin/python3 /usr/local/opnsense/scripts/syslog/lockout_handler (python3.7) root 82382 0.0 0.3 1061316 3396 - Ss 20:18 0:00.08 /usr/local/sbin/syslogd -s -c -c -P /var/run/syslog.pid -p /var/run/legacy_log -S /var/run/legacy_logpriv -k -s -s -f /var/et root 84149 0.0 0.3 1067348 3268 - Is 20:18 0:00.00 dhclient: system.syslog (dhclient) $ cat /var/run/syslog-ng.pid 39227
Do you have a working DNS?
Jul 31 09:16:51 opnsense ntpdate[12533]: Can't find host 0.pool.ntp.org: Name does not resolve (8)
Jul 31 09:16:54 opnsense ntpdate[12533]: Can't find host 2.pool.ntp.org: Name does not resolve (8)
Jul 31 09:16:56 opnsense ntpdate[12533]: no servers can be used, exiting
Jul 31 09:17:25 opnsense ntpdate[94626]: Can't find host 0.pool.ntp.org: Name does not resolve (8)
The OPNsense box was working fine as DNS for the rest of my network. But I've long since given up on this and moved back to pfSense.
This issue has been automatically timed-out (after 180 days of inactivity).
For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.
If someone wants to step up and work on this issue, just let us know, so we can reopen the issue and assign an owner to it.
It looks like DNS doesn't work, so NTP cannot resolve domain names to IP addresses and PC has wrong time. DNS doesn't work because PC has wrong time!! That is a deadlock... Time has to be fixed manually, that will restore DNS function and after that NTP will be able to fetch correct time. DNS doesn't work because signed domains needs correct time to use certificate to validate domains; when time is wrong, validation of DNS replies fails... Other workaround could be that one of NTP servers for initial time sync will be defined by IP address, like 162.159.200.123 (time.cloudflare.com)
Correct time is important these days, because when time is wrong, certificates are no valid. When time on PC is not correct, many thing break, it is not possible to update, validating DNS server could be broken, etc...
DNS doesn't work because PC has wrong time!!
What?!
Time has to be fixed manually
I'd say DNS has to be fixed manually if it was broken. ;)
Other workaround could be that one of NTP servers for initial time sync will be defined by IP address
Workarounds are impractical for default use.
I'd say this easily falls under what is called system administration. CMOS battery is likely flaky if the time is not recorded correctly...
Cheers, Franco
Try to set wrong time to your router and reboot. Check if time was corrected during boot. Check if DNS works when time is wrong... Similar issue was reported on IPfire running on Raspberry PI without RTC module, just few days ago... Unbound fails to validate DNS replies when time is not correct, it rejects all DNS replies as invalid...
Try to set wrong time to your router and reboot.
Sorry, but why?
Try to set wrong time to your router and reboot.
Sorry, but why?
Sorry, forget a about that...
Important notices Before you add a new report, we ask you kindly to acknowledge the following:
[X] I have read the contributing guide lines at https://github.com/opnsense/core/blob/master/CONTRIBUTING.md
[X] I have searched the existing issues and I'm convinced that mine is new.
Describe the bug From my initial installation on OPNSense 20.1.8 to current on 20.7, ntpd hasn't started and stayed running for more than a few seconds. More detail in this forum thread.
It appears that on boot,
ntpdate_sync_once.sh
runs and successfully synchronizes the time, after whichntpd
is started--as it should be. But after a very brief amount of time,ntpd
is killed with signal 15.To Reproduce Steps to reproduce the behavior: As above--boot the system and monitor ntpd status. It may start briefly, but won't stay that way.
Expected behavior ntpd starts and runs.
Screenshots
Relevant log files
/var/log/ntpd.log
Additional context Add any other context about the problem here.
Environment
OPNsense 20.1.8 - 20.7 (amd64, OpenSSL). Intel® Celeron C2358 Netgate RCC-VE-2440