Openvario / meta-openvario

Official OpenEmbedded layer for Openvario flight computer.
http://www.openvario.org

downloads fail due to wrong system time #276

Open lordfolken opened 2 years ago

lordfolken commented 2 years ago

It takes a very long time for systemd-timesyncd to update the system time.

During that time, it's not possible to download any maps etc. via the XCSoar file manager, because the SSL certificates appear to be from the future relative to the system clock.
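
To illustrate the failure mode described above, here is a minimal sketch of the validity-window check a TLS client performs. The certificate dates and the helper name `validity_status` are hypothetical and chosen only for illustration; a real client reads the window from the server certificate.

```python
from datetime import datetime, timezone

def validity_status(now, not_before, not_after):
    """Classify a clock reading against a certificate validity window."""
    if now < not_before:
        return "not yet valid"  # the certificate looks like it is from the future
    if now > not_after:
        return "expired"
    return "valid"

# Hypothetical validity window, not taken from any real certificate.
NOT_BEFORE = datetime(2021, 10, 1, tzinfo=timezone.utc)
NOT_AFTER = datetime(2021, 12, 30, tzinfo=timezone.utc)

# A clock that was never set sits near the Unix epoch.
unsynced_clock = datetime(1970, 1, 1, 1, 2, 9, tzinfo=timezone.utc)
synced_clock = datetime(2021, 11, 10, 14, 2, 22, tzinfo=timezone.utc)

print(validity_status(unsynced_clock, NOT_BEFORE, NOT_AFTER))
print(validity_status(synced_clock, NOT_BEFORE, NOT_AFTER))
```

With a clock that is behind the start of the window, every certificate is rejected until the time is corrected, which is why downloads fail rather than merely being slow.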

What should happen: systemd-timesyncd should force synchronization of the time as soon as a network connection is established.

Potentially chrony would be a better choice. It's much more configurable:

makestep, makestep threshold limit
           Normally chronyd will cause the system to gradually correct any time offset, by slowing down or speeding up the clock as required.
           In certain situations, the system clock might be so far adrift that this slewing process would take a very long time to correct the
           system clock.

kedder commented 2 years ago

Syncing time on network connection works in the current stable image.

MaxKellermann commented 2 years ago

What is "a very long time"? Check the journal and timedatectl for more information.

systemd-timesyncd should force synchronization of the time as soon as a network connection is established.

That is what systemd-timesyncd is designed to do, and we need to find out why this takes longer than you expected.

I don't think switching to different software, without checking the cause of your timesyncd problems, is a good solution. If we're not willing to analyze problems, we'll be left with a different set of problems after switching.

MaxKellermann commented 2 years ago

journalctl -o short-monotonic on my OpenVario:

[    5.516349] openvario-7-PQ070 systemd-networkd[141]: eth0: Gained carrier
[    5.574987] openvario-7-PQ070 systemd-networkd[141]: eth0: DHCPv4 address 172.28.0.212/24 via 172.28.0.1
[    5.587869] openvario-7-PQ070 systemd-timesyncd[169]: Network configuration changed, trying to establish connection.
[    5.620774] openvario-7-PQ070 systemd-timesyncd[169]: Initial synchronization to time server 216.239.35.0:123 (time1.google.com).
[    5.621383] openvario-7-PQ070 systemd-resolved[168]: Clock change detected. Flushing caches.

It took networkd about 59ms to obtain a DHCP IP address, and about 13ms later, timesyncd noticed the network change and started contacting the NTP server time1.google.com. The clock was fully synchronized with that NTP server 33ms after that. That's about 104ms for everything, including DHCP and NTP. That's pretty good, I think.
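
The intervals can be recomputed directly from the monotonic timestamps. The following is a minimal sketch that parses the first four journal lines quoted above and reports the gap between consecutive events; the regular expression and the helper names are assumptions written for this excerpt, not a general journalctl parser.

```python
import re

LOG = """\
[    5.516349] openvario-7-PQ070 systemd-networkd[141]: eth0: Gained carrier
[    5.574987] openvario-7-PQ070 systemd-networkd[141]: eth0: DHCPv4 address 172.28.0.212/24 via 172.28.0.1
[    5.587869] openvario-7-PQ070 systemd-timesyncd[169]: Network configuration changed, trying to establish connection.
[    5.620774] openvario-7-PQ070 systemd-timesyncd[169]: Initial synchronization to time server 216.239.35.0:123 (time1.google.com).
"""

# [seconds.microseconds] host unit[pid]: message
LINE_RE = re.compile(r"^\[\s*(\d+)\.(\d{6})\]\s+(\S+)\s+([^\[\s]+)\[(\d+)\]:\s+(.*)$")

def parse_events(log):
    """Return (timestamp in microseconds, unit, message) for each matching line."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            sec, usec, _host, unit, _pid, message = match.groups()
            # Integer microseconds avoid floating-point rounding in the deltas.
            events.append((int(sec) * 1_000_000 + int(usec), unit, message))
    return events

def deltas_us(events):
    """Microseconds elapsed between each pair of consecutive events."""
    return [later[0] - earlier[0] for earlier, later in zip(events, events[1:])]

events = parse_events(LOG)
steps = deltas_us(events)
for (_, unit, message), step in zip(events[1:], steps):
    print(f"+{step / 1000:.1f} ms  {unit}: {message[:50]}")
print(f"total: {sum(steps) / 1000:.1f} ms")
```

Running the same script on a journal from a slow device should show which step accounts for the wait.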

How does it look on your OpenVario?

lordfolken commented 2 years ago

so i reflashed with current master.

Date on the console shows 10 November 2021.

[ 40.168159] openvario-57-lvds systemd-networkd[202]: wlan0: Gained carrier
[ 40.149492] openvario-57-lvds systemd-timesyncd[185]: No network connectivity, watching for changes.

[ 2413.968650] openvario-57-lvds systemd[1]: Starting Time & Date Service...
[ 2414.411420] openvario-57-lvds systemd[1431]: systemd-timedated.service: ProtectHostname=yes is configured, but the kernel does not support UTS namespaces, ignoring namespace setup.
[ 2414.442133] openvario-57-lvds dbus-daemon[191]: [system] Successfully activated service 'org.freedesktop.timedate1'
[ 2414.451182] openvario-57-lvds systemd[1]: Started Time & Date Service.
[ 2495.970569] openvario-57-lvds systemd[1]: systemd-timedated.service: Deactivated successfully.

root@openvario-57-lvds:~# timedatectl
Local time: Wed 2021-11-10 14:02:22 UTC
Universal time: Wed 2021-11-10 14:02:22 UTC
RTC time: Thu 1970-01-01 01:02:09
Time zone: UTC (UTC, +0000)
System clock synchronized: no
NTP service: active
Time local TZ: no

So at least without a reboot, it doesn't seem to synchronize at all. Potentially it's a slewing problem, as the system time is so far back.

lordfolken commented 2 years ago

So after a reboot the time is synced correctly.

[ 36.141314] openvario-57-lvds systemd-timesyncd[180]: Initial synchronization to time server 216.239.35.8:123 (time3.google.com).
[ 36.830218] openvario-57-lvds systemd[1]: systemd-hostnamed.service: Deactivated successfully.
[ 45.694932] openvario-57-lvds systemd[1]: Created slice Slice /system/dropbear.
[ 45.700635] openvario-57-lvds systemd[1]: Condition check resulted in SSH Key Generation being skipped.
[ 45.720165] openvario-57-lvds systemd[1]: Started SSH Per-Connection Server (10.21.30.122:42322).
[ 45.752618] openvario-57-lvds dropbear[237]: Child connection from 10.21.30.122:42322
[ 47.181797] openvario-57-lvds dropbear[237]: Auth succeeded with blank password for 'root' from 10.21.30.122:42322
[ 84.213944] openvario-57-lvds connmand[195]: wlan0 {del} route 82.165.8.211 gw 10.21.30.254 scope 0

I'm just puzzled why it has connectivity before the route...

MaxKellermann commented 2 years ago

That's because connman doesn't notify timesyncd that a network connection has been established. So timesyncd never attempts to contact an NTP server. If you want to use connman, I guess we have to find a way for connman to trigger timesyncd. Does connman have some kind of hooks for that?

kedder commented 2 years ago

That's a regression from recent changes if that's the case. Time is synched in current stable version.

MaxKellermann commented 2 years ago

That's a regression from recent changes if that's the case. Time is synched in current stable version.

You said that already, but this doesn't help solve the problem.

MaxKellermann commented 2 years ago

I read some connman source code and found out it has its own NTP client. This means that systemd-timesyncd's "unwillingness" to start working is not relevant here, because systemd-timesyncd is not responsible for doing NTP in your setup.

So whatever the problem is, you should probably start looking at connman. (Not my expertise, I'm out - I suggest not using connman at all.)

lordfolken commented 2 years ago

There seems to be D-Bus integration: https://github.com/aldebaran/connman/blob/master/doc/overview-api.txt

mihu-ov commented 2 years ago

Just to mention it, most "normal" users won't have network access on their OV and time will be provided by GPS data.

lordfolken commented 2 years ago

So adding "TimeUpdates=manual" to /etc/connman/main.conf (which doesn't exist yet) disables the NTP client inside connman. At least now I'm getting "System clock synchronized: yes" in timedatectl.

I will try to build an image to do so.

MaxKellermann commented 2 years ago

Just to mention it, most "normal" users won't have network access on their OV and time will be provided by GPS data.

GPS time won't be copied to the system clock, and thus doesn't help with the certificate validation problem.

(Of course, if you don't have an internet connection, you won't have certificate validation problems, because you don't download anything.)

so adding "TimeUpdates=manual" to /etc/connman/main.conf (doesnt exist, yet) disables the ntpclient inside connman.

How does disabling connman's NTP client solve the problem? The problem is that apparently connman does not properly use NTP.

Any solution that involves timesyncd doesn't work with connman, unless you do something to ensure that timesyncd gets notified when connman establishes a network connection, which you did not.

lordfolken commented 2 years ago

Just to mention it, most "normal" users won't have network access on their OV and time will be provided by GPS data.

Maybe I'm not a normal user, but I use SkyLines tracking, METAR info, feed OGN data to the OV, use xcsoar-cloud, and hope to download/upload flights and tasks.

lordfolken commented 2 years ago

GPS time won't be copied to the system clock, and thus doesn't help with the certificate validation problem.

And why not? Decent NTP clients (not connman/timesyncd) are capable of using NMEA for time synchronization.

(Of course, if you don't have an internet connection, you won't have certificate validation problems, because you don't download anything.)

so adding "TimeUpdates=manual" to /etc/connman/main.conf (doesnt exist, yet) disables the ntpclient inside connman.

How does disabling connman's NTP client solve the problem? The problem is that apparently connman does not properly use NTP.

It works perfectly. It's just not configured. If you launch it in debug mode (-d) you will see that it lacks a timeserver.

As far as I gather, it expects to be given NTP servers, either via a DHCP option or via manual configuration with connmanctl config <servicename> --timeservers <foo>

The 3rd option seems to be the

[General]
FallbackTimeservers=pool.ntp.org

option in /etc/connman/main.conf
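
As a minimal sketch, the fragment above could be generated when building an image. The helper name `render_connman_main_conf` is hypothetical, and joining several servers with commas is an assumption about connman's list syntax; only the single pool.ntp.org entry comes from this thread.

```python
import configparser
import io

def render_connman_main_conf(timeservers):
    """Render a main.conf body containing only the fallback timeserver setting."""
    parser = configparser.ConfigParser()
    # configparser lowercases keys by default; keep the mixed-case key as written.
    parser.optionxform = str
    parser["General"] = {"FallbackTimeservers": ",".join(timeservers)}
    buffer = io.StringIO()
    # Write key=value without spaces, matching the fragment quoted above.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()

conf_text = render_connman_main_conf(["pool.ntp.org"])
print(conf_text, end="")
```

The function returns the text rather than writing to /etc/connman/main.conf, so it can be inspected before being installed by whatever mechanism the image build uses.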

There is a clock command in connmanctl that shows you the state. (Timeservers remains empty there.)

I discovered that even with poweroff the Cubieboard keeps the time. One really needs to remove the power and wait a few seconds.

So with a fresh image, a masked/stopped systemd-timesyncd and the above conf file:

Nov 10 13:00:39 openvario-57-lvds connmand[192]: ../connman-1.40/src/timeserver.c:sync_next() Using timeserver 84.16.67.12
Nov 10 13:00:39 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:start_ntp() server 84.16.67.12 family 2
Nov 10 13:00:39 openvario-57-lvds connmand[192]: ../connman-1.40/src/clock.c:get_properties() conn 0x51be20
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() flags      : 0x24
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() stratum    : 1
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() poll       : 1024.000000 seconds (10)
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() precision  : 0.000000 seconds (-25)
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() root delay : 0 seconds (fraction 256)
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() root disp. : 0 seconds (fraction 768)
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() reference  : 0x535047
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() org=3845538039.772749 rec=3854119761.760705 xmt=3854119761.760718 dst=3845538040.089550
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() offset=8581721.829562 delay=0.316788
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() Timeserver 84.16.67.12, next sync in 1024 seconds
Nov 10 13:00:40 openvario-57-lvds connmand[192]: ntp: adjust (jump): +8581721.829562 sec
Feb 17 20:49:21 openvario-57-lvds connmand[192]: ../connman-1.40/src/ntp.c:decode_msg() interval/delta/delay/drift 1024.000000s/+8581721.830s/0.317s/+0ppm
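
The offset and delay in this log can be reproduced from the four logged timestamps with the standard NTP formulas (offset = ((rec - org) + (xmt - dst)) / 2, delay = (dst - org) - (xmt - rec)). The sketch below is a worked check of those numbers, not connman's code; the function names are illustrative, and `Decimal` is used because double-precision floats cannot hold microseconds at this magnitude exactly.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# NTP timestamps count seconds since 1900-01-01 00:00:00 UTC.
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)

def ntp_to_utc(timestamp):
    """Convert an NTP timestamp (Decimal seconds) to an aware UTC datetime."""
    whole = int(timestamp)
    micros = int((timestamp - whole) * 1_000_000)
    return NTP_EPOCH + timedelta(seconds=whole, microseconds=micros)

def offset_and_delay(org, rec, xmt, dst):
    """Standard NTP clock offset and round-trip delay from the four timestamps."""
    offset = ((rec - org) + (xmt - dst)) / 2
    delay = (dst - org) - (xmt - rec)
    return offset, delay

# Values copied from the decode_msg() line in the log above.
org = Decimal("3845538039.772749")  # client transmit time
rec = Decimal("3854119761.760705")  # server receive time
xmt = Decimal("3854119761.760718")  # server transmit time
dst = Decimal("3845538040.089550")  # client receive time

offset, delay = offset_and_delay(org, rec, xmt, dst)
print("offset:", offset, "s")
print("delay: ", delay, "s")
print("clock before jump:", ntp_to_utc(dst))
print("clock after jump: ", ntp_to_utc(dst + offset))
```

The result agrees with the logged offset=8581721.829562 and delay=0.316788. An offset of roughly 99 days is why connman steps the clock ("adjust (jump)") instead of slewing it, and why the journal timestamps change from Nov 10 to Feb 17 between two consecutive lines.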

Any solution that involves timesyncd doesn't work with connman, unless you do something to ensure that timesyncd gets notified when connman establishes a network connection, which you did not.

I was under the impression that they both talk to D-Bus. However, I found this compilation option: --enable-nmcompat

Enable support for NetworkManager compatibility interfaces

This allows to expose a minimal set of NetworkManager
interfaces. It is useful for systems with applications
written to use NetworkManager to detect online/offline
status and have not yet been converted to use ConnMan.

timesyncd should for sure trigger on the NetworkManager dbus objects.

Ps:

linuxianer99 commented 2 years ago

@lordfolken: Is this issue still valid?