Arksine / moonraker

Web API Server for Klipper
https://moonraker.readthedocs.io
GNU General Public License v3.0
1.02k stars 392 forks source link

Ensure time is synced before refreshing the update manager status. #830

Open miklschmidt opened 3 months ago

miklschmidt commented 3 months ago

Is your feature request related to a problem? Please describe

From my analysis of a few cases of "corrupt" repositories (they're not actually corrupt) on fresh sd cards, it looks like time synchronization tend to happen right in the middle of the update manager status check process, which causes errors when a user boots a pi that hasn't been running for a while.

Notice how git fetch fails (i'm assuming SSL errors because of the time discrepancy) and subsequent repair attempts fail (because there are in fact no empty .git/objects files so rm errors out) and how the time is synced afterwards:

Mar 01 15:26:37 vc3-200 python[605]: [shell_command.py:_check_proc_success()] - Command (git -C /home/pi/printer_data/config/RatOS remote get-url origin) successfully finished
Mar 01 15:26:38 vc3-200 python[605]: [shell_command.py:_check_proc_success()] - Command (git -C /home/pi/printer_data/config/RatOS fetch origin --prune --progress) exited with return code 128
Mar 13 11:19:57 vc3-200 systemd-timesyncd[281]: Initial synchronization to time server 195.171.43.12:123 (2.debian.pool.ntp.org).
Mar 13 11:19:57 vc3-200 python[605]: [shell_command.py:_check_proc_success()] - Command (find .git/objects/ -type f -empty | xargs rm) exited with return code 1
Mar 13 11:19:57 vc3-200 systemd[1]: Starting Online ext4 Metadata Check for All Filesystems...

Repositories checked after time has been synced, complete the update checks just fine.

Describe the solution you'd like

There's a few solutions, i guess: 1) Make the moonraker service specify After=time-sync.target and ask the user to enable systemd-time-wait-sync.service if their printer is assumed to be connected to the internet. Not ideal as that makes your printer network dependent. 2) Implement a separate service to trigger the update status checks, that depends on time-sync.target and ask users to enable systemd-time-wait-sync.service. 3) Patch the update manager so that it ensures time is synced before checking repository status. This seems like the most robust solution, and would allow us to abort the process early with a clear error cause in case time couldn't be synced.

Describe alternatives you've considered

Assuming a corrupted repository if a git command fails might be misleading, there could be many causes for the current checks to fail, maybe improve that instead? The problem with unnecessarily reporting "corrupted" is that it prompts users to replace their sd cards and basically start over with their setup, that seems like a lot of wasted effort, at least if it's just caused by a missing dependency on time synchronization :D

Additional information

No response

miklschmidt commented 1 month ago

It might also be a good idea to not consider a repository corrupted just because github isn't reachable.