MichaIng / DietPi

Lightweight justice for your single-board computer!
https://dietpi.com/
GNU General Public License v2.0

Raspberry Pi | Slow system clock with arm_freq_min < 300 #4455

Open C0D3-M4513R opened 3 years ago

C0D3-M4513R commented 3 years ago

Creating a bug report/issue

Required Information

Additional Information (if applicable)

Steps to reproduce

  1. Run the Command dietpi-config
  2. Go into Advanced Options
  3. Go to Time sync mode
  4. Try modes 1-3 and wait a couple of days (I am using mode 3)
  5. Watch the time drift by hours!

Expected behaviour

Actual behaviour

Manual Fix

Run systemctl start systemd-timesyncd.service

Automatic Fix

Create a systemd timer that automatically starts systemd-timesyncd.service

Extra details

I am currently keeping the Pi up 24/7. Right now the uptime is only 16h because of an rpi-eeprom update.

MichaIng commented 3 years ago

Many thanks for your report. Obviously the hourly time sync fails in your case. To check what's going on with that:

journalctl -u systemd-timesyncd

C0D3-M4513R commented 3 years ago

It is?

log.txt

MichaIng commented 3 years ago

You sync time with your router at 192.168.0.2, and that seems to do its job extremely badly. The time sync correctly runs every hour at minute 17 of system time, but each sync then shifts the time far into the future.

To rule out the source, please try to change to a public NTP server, via dietpi-config > Network Options: Misc > NTP mirror.

Or is there possibly another time daemon running and conflicting, like chrony, ntp, htpdate or something like that? Generally, you can check for running processes, e.g. via htop.

C0D3-M4513R commented 3 years ago

I enabled timedatectl set-ntp true. Gonna disable that, and change the NTP mirror from my router to a public one.

MichaIng commented 3 years ago

timedatectl set-ntp true should basically do systemctl enable --now systemd-timesyncd. However, with time sync mode 2 or 3 enabled, on each day or hour respectively, the service is restarted and then stopped by our cron job, overriding what timedatectl set-ntp true has done 😉. Daemon mode 4 is what matches timedatectl set-ntp true to have systemd-timesyncd permanently running.

C0D3-M4513R commented 3 years ago

I would like to avoid having a daemon running. If changing the NTP mirror is the solution, I will close this issue in a couple of days. Otherwise, I will check whether the issue also happens with mode 4 and post a message if it does. Either way, this ticket will be stale for a couple of days, if you don't mind.

C0D3-M4513R commented 3 years ago

Update: The internal clock is ticking slowly. Over a span of 15 minutes, not even one minute passed on my Pi. Is it roughly 1 minute of real time = 1 second of Pi time? Or not? I have NEVER encountered this before and don't know where to start. Is this a hardware defect? Also, when I query the date via date +%H:%M:%S:%N, it almost always advances by only a second, no matter whether I query a second or a minute apart. EDIT: It doesn't do that anymore? I timed the commands roughly a minute apart:

root@DietPi:~# date +%H:%M:%S:%N
18:44:50:255859413
root@DietPi:~# date +%H:%M:%S:%N
18:44:55:389439910
# Sidenote: This should be around 19:25

Could it be that when the CPU clock is dynamically adjusted via schedutil, the system can't keep time correctly because the clock is always changing? Or maybe it is undervoltage? I have a spare PC ATX PSU that I could hook up to GND/3.3V/5V. Edit: It got worse: Pi time is 18:51:13:212329018, but it should be 20:45.
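To put a number on the slowdown, the two timestamp pairs quoted above can be turned into a tick ratio (Pi seconds elapsed per real second). This is only a rough sketch: the function name `tick_ratio` is hypothetical, and the reference times 19:25 and 20:45 are the approximate "should be" values from the comment, not measured ones.

```python
from datetime import datetime

def tick_ratio(pi_start, pi_end, real_start, real_end, fmt="%H:%M:%S"):
    """Return Pi-clock seconds elapsed per real second.

    All four arguments are wall-clock strings on the same day
    (midnight wrap-around is not handled in this sketch).
    """
    pi_elapsed = (datetime.strptime(pi_end, fmt)
                  - datetime.strptime(pi_start, fmt)).total_seconds()
    real_elapsed = (datetime.strptime(real_end, fmt)
                    - datetime.strptime(real_start, fmt)).total_seconds()
    if real_elapsed <= 0:
        raise ValueError("real_end must be later than real_start")
    return pi_elapsed / real_elapsed

# Pi showed 18:44:50 at about 19:25 real time and 18:51:13 at about 20:45.
ratio = tick_ratio("18:44:50", "18:51:13", "19:25:00", "20:45:00")
print(f"{ratio:.3f}")  # 0.080
```

Under these approximate inputs the Pi clock advanced 383 s while about 4800 s passed in reality, so it ran at roughly 8% of normal speed.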

MichaIng commented 3 years ago

Okay, that is a good reason.

I run schedutil on all my devices, including RPis, and have never seen something like this, but you can simply try to switch to performance and check whether it's the same.

Please check dmesg -l emerg,alert,crit,err for kernel errors, including voltage.

To check for undervoltage or temperature-related throttling, check vcgencmd get_throttled. It will print throttled=0x0 if everything is in order, with no undervoltage or overtemp detected.
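If the value is not 0x0, it is a bitmask. The sketch below decodes it using the bit meanings from the Raspberry Pi documentation for vcgencmd get_throttled; the helper name `decode_throttled` and the sample value 0x50005 are made up for illustration and do not come from this thread.

```python
from typing import List

# Bit positions as listed in the Raspberry Pi vcgencmd documentation.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}

def decode_throttled(output: str) -> List[str]:
    """Decode a line such as 'throttled=0x50005' into readable flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in THROTTLE_FLAGS.items() if value & (1 << bit)]

print(decode_throttled("throttled=0x0"))      # []
print(decode_throttled("throttled=0x50005"))  # illustrative value only
```

An empty list corresponds to the "everything is in order" case described above.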

C0D3-M4513R commented 3 years ago

Fine:

Bad:

NOTE: the min freq is 100 MHz. The Pi should not be that swamped? No undervoltage/overtemp was detected, and dmesg only has cifs errors (which is expected). EDIT: I'm quite puzzled. Is this an issue for raspberrypi/linux?

MichaIng commented 3 years ago

Very interesting that reducing the minimum frequency works in your case. Actually, arm_freq_min currently should have no effect at all (it has none on my RPi2), as it has caused the system to hang completely at times since Linux 5.4. It got a fix, but when lowering overvoltage (to a point which was stable before) it still caused crashes and issues: https://github.com/raspberrypi/firmware/issues/1431

Did you verify that it really clocked down that much? The following shows how much time the CPU(s) spent in which frequency.

cat /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state

If it indeed is possible on RPi4 to clock down to 100 MHz, then please try to reset it to 600 MHz and see if the issue persists.

C0D3-M4513R commented 3 years ago

Using energy saving oc preset, with -2 overvolt:

root@DietPi:~# cat /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state
100000 5167383
200000 228440
300000 151103
400000 56374
500000 61617
600000 201616
700000 16748
800000 84331
900000 4098
1000000 82894
1100000 6228
1200000 4590
1300000 2872
1400000 2197
1500000 2006460
root@DietPi:~# #10 min later:
root@DietPi:~# cat /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state
100000 5170933
200000 228440
300000 151103
400000 56374
500000 61617
600000 201616
700000 16748
800000 84331
900000 4098
1000000 82894
1100000 6228
1200000 4590
1300000 2872
1400000 2197
1500000 2006460
root@DietPi:~# vcgencmd measure_clock arm
frequency(48)=100037112
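The two snapshots above are easier to compare as a diff. A minimal sketch, with hypothetical helper names and only three of the fifteen rows copied in, assuming the kernel-documented unit of 10 ms per count for time_in_state:

```python
def parse_time_in_state(text):
    """Parse '<freq_khz> <count>' lines into a dict {freq_khz: count}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[int(parts[0])] = int(parts[1])
    return stats

def residency_delta(before, after):
    """Return only the frequencies whose counters changed, with the change."""
    return {
        freq: count - before.get(freq, 0)
        for freq, count in after.items()
        if count != before.get(freq, 0)
    }

# Subset of the two snapshots shown above.
before = parse_time_in_state("100000 5167383\n200000 228440\n1500000 2006460")
after = parse_time_in_state("100000 5170933\n200000 228440\n1500000 2006460")

delta = residency_delta(before, after)
print(delta)                                          # {100000: 3550}
print({f: c * 0.01 for f, c in delta.items()})        # seconds, if 1 count = 10 ms
```

Only the 100 MHz counter moved, which confirms the CPU really sat at 100 MHz. The increase of 3550 counts would be about 35.5 s over a nominal 10 minutes; that would fit the slow-clock symptom, but this reading is an inference and was not confirmed in the thread.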

I changed the min clock via echo 600000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq, and it seems to work fine.

300000 also seems to work just fine. 200000 works fine as well? And 100000 breaks again. Also, if I set 110000, it clocks to 200000 instead. At 200 MHz, if I measure 1 min, it is off slightly (~5-10 sec), but it is at least working 90%, rather than only 10%.
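The 110000 -> 200000 behaviour matches a simple "round up to the next available step" model. The sketch below only models that observation; it is not the kernel's actual cpufreq code, and `effective_min_freq` is a hypothetical name. The frequency table is taken from the time_in_state listing (100 MHz steps from 100000 to 1500000 kHz).

```python
def effective_min_freq(requested_khz, available_khz):
    """Return the lowest available frequency at or above the requested minimum.

    Falls back to the highest available frequency if the request exceeds all steps.
    """
    steps = sorted(available_khz)
    for freq in steps:
        if freq >= requested_khz:
            return freq
    return steps[-1]

available = range(100_000, 1_500_001, 100_000)  # kHz, as in time_in_state

print(effective_min_freq(110_000, available))  # 200000
print(effective_min_freq(600_000, available))  # 600000
```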

MichaIng commented 3 years ago

That is a good finding. Could you try to combine 100 MHz with 0 overvoltage? That worked for me on RPi2 with the last kernel that supported it at all (on that model).

C0D3-M4513R commented 3 years ago

works, meaning no crashes but time is still slow

Joulinar commented 3 years ago

There might be another case https://dietpi.com/phpbb/viewtopic.php?t=9217

MichaIng commented 3 years ago

Indeed looks like the same.

bbronisz commented 2 years ago

I have a similar issue with schedutil, but I had a min frequency of 300. Testing higher values. The time didn't drift as much, but when systemd-timesyncd was syncing every 32s I got gaps of around 5s in Netdata charts, which prompted my investigation. I can provide more information and/or verify whether any solutions/workarounds work.

Joulinar commented 2 years ago

Try to have min frequency at 600. This should avoid time drifting.

bbronisz commented 2 years ago

I can see that 400 looks better. But OK, I'll change it to 600. /Edit: and thanks for responding so quickly :).

MichaIng commented 2 years ago

Thanks for reporting. This is already the second report lately. Probably something changed with a recent kernel upgrade on RPi 4, so that 300 MHz is now causing this issue as well (it was known with <300 MHz).

Ankharna commented 10 months ago

I'm experiencing the same issue right now at idle 300 MHz. In about 20 minutes it loses a few (3-5?) minutes. Setting it to idle at 350 MHz seems to have fixed the issue (dietpi-config > performance options).

MichaIng commented 10 months ago

Is it the same when you switch to ondemand CPU governor?

Ankharna commented 10 months ago

I set it back to 300 MHz idle and switched to ondemand. This didn't work; the clock is still too slow.

MichaIng commented 10 months ago

Okay, then it should be the same on RPi OS. It would still be nice if someone could test and verify that RPi OS is affected by the same issue. In that case, we could report it to the RPi bug tracker. Since this is about overclocking (underclocking in this case, but both can cause issues), we might not see a fix and may need to accept that not every chip works well at the same upper and lower voltage and frequency limits, but who knows.