raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/

PPS source have timeout since update on March 17th #5430

Closed Etifloyd closed 1 year ago

Etifloyd commented 1 year ago

Describe the bug

For more than a year I have been running chrony with a Waveshare NEO-M8T GPS/PPS module as a stratum 1 NTP server. It ran correctly until I let Raspberry Pi OS apply a big update on March 17th. Since then, chrony is no longer able to synchronize with the PPS signal. Running "sudo ppstest /dev/pps0" gives timeout errors. To be sure that the GPS module is still OK, I restored an SD card from an image taken in December, and it runs correctly.

Steps to reproduce the behaviour

  1. Create an SD card with the latest version of Raspberry Pi Imager and apply all updates.
  2. Update /boot/config.txt (see attachment).
  3. Install the following packages: chrony, gpsd, gpsd-clients, gpsmon, pps-tools.
  4. Update the following files: /etc/chrony/chrony.conf (see attachment) and /etc/default/gpsd (see attachment).
  5. Stop and disable systemd-timesyncd.service.
  6. Enable the chrony and gpsd services.
  7. Reboot the Raspberry Pi.

Once rebooted, the "sudo ppstest /dev/pps0" command gives timeout errors, and the "chronyc sources" command shows a reach of 0 for the PPS0 source.

Attachment: RPi_config.zip

Device (s)

Raspberry Pi 4 Mod. B

System

Raspberry Pi reference 2022-04-04 Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 226b479f8d32919c9fe36dd5b4c20c02682f8180, stage4

Mar 17 2023 10:50:39 Copyright (c) 2012 Broadcom version 82f3750a65fadae9a38077e3c2e217ad158c8d54 (clean) (release) (start)

Linux ntpserver 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux

Logs

Here are my update logs from Feb. 14th, when everything was running correctly, and from Mar. 17th, when the problem started: UpdateLog.zip

Additional context

No response

pelwell commented 1 year ago

Is that a 32-bit image you are using? And was it previously running a 32-bit kernel? If so, and if you still have (or can easily recreate) the failing image, try adding arm_64bit=0 to config.txt.

Etifloyd commented 1 year ago

Yes, it is a 32-bit image. I use the default image offered in the imager: 32-bit with desktop. I added arm_64bit=0 to the config.txt file and it did the trick; it's working correctly now! Thank you for the quick answer and the trick! Kind regards

pelwell commented 1 year ago

Cool - consider that a workaround for now, at least until we understand the problem. I've a horrible feeling it's a problem with the ioctl() interface from the 32-bit userland to the 64-bit kernel...

pelwell commented 1 year ago

My configuration is:

If I launch ppstest without the fake PPS source I see:

pi@raspberrypi:~$ sudo ppstest /dev/pps0
trying PPS source "/dev/pps0"
found PPS source "/dev/pps0"
ok, found 1 source(s), now start fetching data...
[ nothing happens ]
^C

With the fake source:

pi@raspberrypi:~$ sudo ppstest /dev/pps0
trying PPS source "/dev/pps0"
found PPS source "/dev/pps0"
ok, found 1 source(s), now start fetching data...
source 0 - assert 1681896676.611779347, sequence: 419 - clear  0.000000000, sequence: 0
source 0 - assert 1681896677.633878166, sequence: 420 - clear  0.000000000, sequence: 0
source 0 - assert 1681896678.655616653, sequence: 421 - clear  0.000000000, sequence: 0
source 0 - assert 1681896679.677251561, sequence: 422 - clear  0.000000000, sequence: 0
...

I've not seen anything like a timeout.

Can you run the following test?

  1. Disconnect your GPS module, and add a link/patch cable between GPIOs 17 and 18 (pins 11 and 12)
  2. I'm assuming you already have dtoverlay=pps-gpio in config.txt, and that the default GPIO of 18 is being used.
  3. sudo apt install gpiod pps-tools
  4. In a shell run the following:
    $ while true; do gpioset gpiochip0 17=1; gpioset gpiochip0 17=0; sleep 1; done &
    $ sudo ppstest /dev/pps0

    What output do you get?

pelwell commented 1 year ago

  1. If you remove the link or kill the background task (probably kill %1, or fg and Ctrl-C), does the output just stop or do you get a timeout?

Etifloyd commented 1 year ago

To get back to the faulty situation, I commented out the "arm_64bit=0" line in the config.txt file, shut down the Pi, removed the GPS module, put a jumper between pins 11 and 12, and rebooted the Pi. I launched the while loop, and the ppstest output is the following:

trying PPS source "/dev/pps0"
found PPS source "/dev/pps0"
ok, found 1 source(s), now start fetching data...
source 0 - assert 1681899996.974862821, sequence: 11 - clear  0.000000000, sequence: 0
source 0 - assert 1681899998.012749800, sequence: 12 - clear  0.000000000, sequence: 0
source 0 - assert 1681899999.051649106, sequence: 13 - clear  0.000000000, sequence: 0
source 0 - assert 1681900000.092132863, sequence: 14 - clear  0.000000000, sequence: 0

When I remove the jumper, the stream simply stops, no timeout.
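To compare runs like this one more easily, the assert timestamps can be turned into pulse-to-pulse intervals. The following is a rough sketch that assumes the ppstest line format shown in this thread; the function names are made up for illustration, and the sample lines are the four from the jumper test.

```python
import re
from decimal import Decimal

# Matches the "assert" part of a ppstest line: source index, timestamp, sequence.
ASSERT_RE = re.compile(r"source (\d+) - assert (\d+\.\d+), sequence: (\d+)")


def parse_asserts(text):
    """Return (sequence, timestamp) pairs for every assert line in the text."""
    return [(int(m.group(3)), Decimal(m.group(2))) for m in ASSERT_RE.finditer(text)]


def intervals(events):
    """Seconds elapsed between consecutive assert events."""
    return [later[1] - earlier[1] for earlier, later in zip(events, events[1:])]


SAMPLE = """\
source 0 - assert 1681899996.974862821, sequence: 11 - clear  0.000000000, sequence: 0
source 0 - assert 1681899998.012749800, sequence: 12 - clear  0.000000000, sequence: 0
source 0 - assert 1681899999.051649106, sequence: 13 - clear  0.000000000, sequence: 0
source 0 - assert 1681900000.092132863, sequence: 14 - clear  0.000000000, sequence: 0
"""

if __name__ == "__main__":
    events = parse_asserts(SAMPLE)
    for (seq, _), gap in zip(events[1:], intervals(events)):
        print(f"sequence {seq}: {gap} s since previous assert")
```

Decimal is used so the nanosecond digits survive the subtraction. For this sample the gaps come out slightly above one second, which is presumably the 1 s sleep plus the time taken by the two gpioset calls.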

pelwell commented 1 year ago

Interesting. What is the minimum you need to do to get ppstest to fail with a timeout? Obviously you won't be running the while loop, but what else?

Etifloyd commented 1 year ago

I put the GPS module back on the Pi and now I have the timeout again. The only other things running on this Pi are NUT to monitor a UPS and Cacti to monitor network devices through SNMP, both installed with apt-get. Overall, my Raspberry Pi does almost nothing. It looks like the timeout appears when the GPS module is present. Could it be related to gpsd?

Etifloyd commented 1 year ago

I also tried the while loop with a sleep of 0.9 s, and there was no timeout.

Etifloyd commented 1 year ago

I tried to reproduce the problem on a Pi 3B+, which also has a 64-bit core, but I have seen no timeout so far. I used jumper wires to reroute the TX and PPS signals from the Pi 4 to the Pi 3B+. There is no timeout and chrony syncs correctly. Maybe I have been running bullseye for too long on my Pi 4, and the many regular updates have left it different from a freshly created image. Could that be the reason?

pelwell commented 1 year ago

Could you install a fresh Raspberry Pi OS on a spare SD card, then do just enough to run ppstest with the HAT attached?

Etifloyd commented 1 year ago

I just created a new SD card with the default Raspberry Pi OS 32-bit image of Feb. 21st, using imager version 1.7.2. I applied all the updates and enabled the UART, SSH and VNC from raspi-config. After a reboot, I updated the config.txt file to disable Wi-Fi and Bluetooth and to enable PPS on pin 18. After another reboot, I installed pps-tools, gpsd and gpsd-clients. In the /etc/default/gpsd file I set the devices as follows: DEVICES="/dev/serial0 /dev/pps0". I restarted the gpsd service and launched the gpsmon command. Both NMEA and PPS data are visible. Then I launched the sudo ppstest /dev/pps0 command, and ... I get timeouts.

pelwell commented 1 year ago

Thanks - that's useful (but annoying).

Does stopping the gpsmon service prevent the timeouts? I won't be surprised if you don't get any pulses back.

Etifloyd commented 1 year ago

No. I just stopped and disabled both gpsd.service and gpsd.socket, and it made no difference. The timeout is still present.

Etifloyd commented 1 year ago

What is strange is that the gpsmon command does see the PPS pulses, but the ppstest command times out.

Timi7007 commented 1 year ago

Hello,

I'm also running an NTP server on a Raspberry Pi 4 (4 GB) alongside some network monitoring and NUT. The GPS module used here is a Gonnely NEO-6M board connected to GPIO. I've had unattended upgrades running forever and recently rebooted the Pi, which left me with the same issue.

gpsmon returned a value for PPS, while ppstest timed out. I added arm_64bit=0 to my /boot/config.txt and rebooted; now both agree on PPS again and chrony doesn't scream at me anymore for having the wrong time.

chrony's log /var/log/chrony/statistics.log actually shows when the system broke (that is, when I rebooted and the unattended upgrade became the running system):

[image]

You can clearly see the usual back-and-forth between PPS and GPS, then the use of other NTP servers as reference, since the GPS takes a moment to come up after a reboot. Then the issue surfaces: it only ever selects GPS, occasionally NTP when GPS is too slow, and never PPS.

[image]

After the config change you recommended I'm now back to business as usual:

[image]

Another symptom was chronyc sources -v showing the whole GPS line, no data - especially no LastRx - for PPS and selecting a NTP server as time-source. I noticed the issue as my NTP-server-score on ntppool.org had degraded, as I was pulling from Stratum 1 servers instead of my own Stratum 0 source.
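The symptom described here (a reference clock that is listed but never receives a sample) can be checked mechanically. Below is a small sketch that assumes the column layout `chronyc sources` prints, with the Reach column read as octal; the function names are made up for illustration, and the sample rows are only there to make the sketch self-contained.

```python
# Sample rows in the layout printed by `chronyc sources` (illustrative input).
SAMPLE = """\
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#? GPS                           0   8     3   237   -331ms[ -330ms] +/-  200ms
#? PPS                           0   4     0     -     +0ns[   +0ns] +/-    0ns
^* clock.fmt.he.net              1   5   377   157   -309us[  -72us] +/- 9513us
"""


def parse_sources(text):
    """Parse source rows into dicts; header and separator lines are skipped."""
    sources = []
    for line in text.splitlines():
        fields = line.split()
        # Source rows start with a mode character: '#' refclock, '^' server, '=' peer.
        if len(fields) < 6 or fields[0][0] not in "#^=":
            continue
        sources.append({
            "mode": fields[0][0],
            "state": fields[0][1:2],
            "name": fields[1],
            "reach": int(fields[4], 8),  # Reach is an octal bitmask of the last 8 polls
            "last_rx": fields[5],
        })
    return sources


def silent_refclocks(sources):
    """Names of reference clocks that have not delivered any recent sample."""
    return [s["name"] for s in sources if s["mode"] == "#" and s["reach"] == 0]


if __name__ == "__main__":
    print(silent_refclocks(parse_sources(SAMPLE)))
```

A reach of 377 (octal) means all of the last eight polls succeeded, so a PPS refclock stuck at 0 while GPS and the network servers are non-zero points at the PPS path rather than at chrony as a whole.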

For my original setup I mostly followed https://austinsnerdythings.com/2021/04/19/microsecond-accurate-ntp-with-a-raspberry-pi-and-pps-gps/ and my /boot/config.txt currently ends in:

# GPS PPS signals
dtoverlay=pps-gpio,gpiopin=18
init_uart_baud=115200
arm_64bit=0

If I can provide any further information to fix this bug, please let me know.

pelwell commented 1 year ago

I'm not sure what's changed, but with a Pico as a fake PPS source (5 lines of MicroPython!) I'm now getting timeouts from ppstest on the 64-bit kernel, while it works as expected on the 32-bit kernel.

pelwell commented 1 year ago

See #5478 for a probable fix. Once the check builds have completed (in about an hour), run sudo rpi-update pulls/5478 to install a trial kernel.

Etifloyd commented 1 year ago

Hi Phil,

Thank you for the update. I would like to try it, but my Pi is in production and I'm still waiting to buy another Pi 4 for testing. The market is very tight at the moment; the only Pi 4 I found costs about $200, which is too expensive just for some tests.

popcornmix commented 1 year ago

The situation is improving. Keep an eye on rpilocator. Many official resellers (who won't mark up the price) have had stock in the last week or two.

Etifloyd commented 1 year ago

Dear all,

Finally, I was able to buy a new Pi 4. First, I commented out the "arm_64bit=0" line in the config.txt file and rebooted the Pi. At that point, I had the PPS timeout again. Then I installed the kernel update proposed by Phil with the "sudo rpi-update pulls/5478" command and rebooted the Pi. PPS is now working correctly and chrony can sync to the GPS/PPS at stratum level 0 as expected!

Thank you for the correction!

Kind regards, Etienne

Nebulosa-Cat commented 11 months ago

See #5478 for a probable fix. Once the check builds have completed (in about an hour), run sudo rpi-update pulls/5478 to install a trial kernel.

Is this fix already in 6.1.61-v8+ (Pi 4B)? I still get timeouts, and sudo rpi-update pulls/5478 no longer works.

pelwell commented 11 months ago

Yes - the fix is still there: https://github.com/raspberrypi/linux/blob/rpi-6.1.y/drivers/pps/pps.c#L255

pelwell commented 11 months ago

It's working for me on a Pi 5 with a Pico as a fake PPS source - using the Micropython firmware, with Pico GPIO 13 connected to Pi GPIO 18, any Pico GND connected to any Pi GND, and dtoverlay=pps-gpio,gpiopin=18 in config.txt, type the following at the Pico serial console:

from machine import Pin
import time
pps = Pin(13, Pin.OUT)  # Pico GPIO 13, wired to Pi GPIO 18
while 1:
    # Emit one very short pulse, then wait about a second
    pps.high()
    pps.low()
    time.sleep(1)

Which should then allow:

pi@raspberrypi:~$ sudo ppstest /dev/pps0
trying PPS source "/dev/pps0"
found PPS source "/dev/pps0"
ok, found 1 source(s), now start fetching data...
source 0 - assert 1699374899.440174342, sequence: 445 - clear  0.000000000, sequence: 0
source 0 - assert 1699374900.440225435, sequence: 446 - clear  0.000000000, sequence: 0
...

On a fairly recent system (or after building it yourself from https://github.com/raspberrypi/utils/tree/master/pinctrl) you can verify that the Pi pin is changing with:

$ pinctrl poll 18
18: lo // PIN12/GPIO18
+714121us
18: hi // PIN12/GPIO18
18: lo // PIN12/GPIO18
+1000014us
18: hi // PIN12/GPIO18
18: lo // PIN12/GPIO18
+1000014us
18: hi // PIN12/GPIO18
18: lo // PIN12/GPIO18
+1000018us
18: hi // PIN12/GPIO18
18: lo // PIN12/GPIO18

(You may need to sudo the pinctrl command if you don't have the right udev rule installed)
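If you want a number for how regular the pulses are, the deltas in that output can be extracted with a few lines of Python. This is only a sketch that assumes the `+<n>us` line format shown above; the function names are made up, and the first delta is left out of the deviation on the assumption that it measures the time since polling started, not a full period.

```python
import re

# Lines such as "+1000014us" carry the time between reported pin changes.
DELTA_RE = re.compile(r"^\+(\d+)us$", re.MULTILINE)


def edge_deltas_us(text):
    """All '+<n>us' deltas in the text, as integers in microseconds."""
    return [int(value) for value in DELTA_RE.findall(text)]


def max_deviation_us(deltas, period_us=1_000_000):
    """Largest absolute difference between a delta and the nominal period."""
    return max(abs(delta - period_us) for delta in deltas)


SAMPLE = """\
18: lo // PIN12/GPIO18
+714121us
18: hi // PIN12/GPIO18
18: lo // PIN12/GPIO18
+1000014us
18: hi // PIN12/GPIO18
18: lo // PIN12/GPIO18
+1000014us
18: hi // PIN12/GPIO18
18: lo // PIN12/GPIO18
+1000018us
18: hi // PIN12/GPIO18
18: lo // PIN12/GPIO18
"""

if __name__ == "__main__":
    deltas = edge_deltas_us(SAMPLE)
    # Skip the first delta (assumed to be a partial period).
    print(f"max deviation from 1 s: {max_deviation_us(deltas[1:])} us")
```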

Nebulosa-Cat commented 11 months ago

Thanks, I'll check it. Does the PPS output only work when the GPS has a 2D or 3D fix (M8N)? If so, I think my problem may be caused by an unstable GPS signal.

beta-tester commented 11 months ago

Does the PPS output only work when the GPS has a 2D or 3D fix (M8N)?

It depends on the settings and on your device.

Some GPS/GNSS devices can be configured to generate PPS from their internal clock even when there is no fix. Here, for example, is a Quectel L96 with an MTK chipset:

Lx0&Lx6&LC86L&LG77L_GNSS_Protocol_Specification v2.1 2.3.18. PMTK285 PMTK_SET_PPS_CONFIG Sets PPS type ... L96
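Commands such as PMTK285 are sent as NMEA-style sentences, which end in a checksum: the XOR of every character between the '$' and the '*'. Here is a minimal sketch of building such a sentence. The parameter values in the example (PPS type 4, pulse width 100 ms) are assumptions for illustration only; check the protocol specification for your module before sending anything. The function names are made up.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of all characters in the sentence body, as two uppercase hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def nmea_sentence(body: str) -> str:
    """Wrap a sentence body in '$', '*', checksum and CR LF."""
    return f"${body}*{nmea_checksum(body)}\r\n"


if __name__ == "__main__":
    # Assumed example values: PPS type 4, pulse width 100 ms.
    print(nmea_sentence("PMTK285,4,100").strip())
```

The resulting string would be written to the module's serial port; whether the setting persists across power cycles depends on the device.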

danktankk commented 10 months ago

Yes - the fix is still there: https://github.com/raspberrypi/linux/blob/rpi-6.1.y/drivers/pps/pps.c#L255

How do I implement this fix? Sorry, but I haven't ever had to do anything like this before. I am using a Pi Zero WH (armv6l). Thanks for any help. This issue has been pretty annoying.

phipac commented 3 weeks ago

Good day, all. May I ask if/when this was committed to the Pi/Debian kernel? I just started experiencing this problem (ppstest shows PPS packets fine, but chrony will not see them) a few weeks ago after a kernel update. I just now installed 6.6.47+rpt-rpi-v8 and I am still experiencing the same issue.

$ sudo ppstest /dev/pps0
trying PPS source "/dev/pps0"
found PPS source "/dev/pps0"
ok, found 1 source(s), now start fetching data...
source 0 - assert 1725647160.003104468, sequence: 343 - clear  0.000000000, sequence: 0
source 0 - assert 1725647161.003102557, sequence: 344 - clear  0.000000000, sequence: 0

etc. I ran it for 5 minutes with no skipped sequences.

$ chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#? GPS                           0   8     3   237   -331ms[ -330ms] +/-  200ms
#? PPS                           0   4     0     -     +0ns[   +0ns] +/-    0ns
^? ntp.he.net                    0   5     0     -     +0ns[   +0ns] +/-    0ns
^+ clock.fmt.he.net              1   5   377    58   -531us[ -531us] +/-   10ms
^* clock.fmt.he.net              1   5   377   157   -309us[  -72us] +/- 9513us
^x clock.sjc.he.net              5   5   377    23   -129ms[ -129ms] +/- 8594us
^+ clock.fmt.he.net              2   5   377    57  -1165us[-1165us] +/-   11ms
^+ clock.fmt.he.net              2   5   377    26  -1712us[-1712us] +/-   12ms
^+ time.cloudflare.com           3   5   377    24  -1198us[-1198us] +/-   21ms
^+ time.cloudflare.com           3   5   377    89   +638us[ +638us] +/-   19ms
^+ time.cloudflare.com           3   5   377    56  -1218us[-1218us] +/-   21ms
^+ time.cloudflare.com           3   5   377    55  -1883us[-1883us] +/-   22ms

phipac commented 2 weeks ago

Nevermind. It just started working again. Not sure why!

webdeck commented 10 hours ago

When I upgraded my Pi 4 to bullseye I started experiencing this issue. I had to add "arm_64bit=0" to /boot/config.txt to get PPS working again. Has the fix been rolled out yet? I am on kernel 6.1.21-v7l+.

popcornmix commented 6 hours ago

Bullseye only gets critical security updates - not general bug fixes or new features. If you update to bookworm, you will get a 6.6 kernel with regular updates.