MichaIng / DietPi

Lightweight justice for your single-board computer!
https://dietpi.com/
GNU General Public License v2.0
4.8k stars 494 forks

Setting CONFIG_NTP_MODE=0 (in order to then install chrony) makes initial install fail #6986

Open dirkhh opened 5 months ago

dirkhh commented 5 months ago

Steps to reproduce

  1. we want to run chrony for better time sync when tracking ADS-B planes
  2. setting CONFIG_NTP_MODE=0 in dietpi.txt (so that we can install chrony in the Automation_Custom_Script.sh)
  3. first boot will fail with various strange errors

Mentions

@wiedehopf

MichaIng commented 5 months ago

Actually, the time sync mode configured in dietpi.txt should only be applied after the initial update and, if there are automated installs, after those have finished. Without a more or less correct system time (so that HTTPS succeeds), it would already fail at the dietpi-update check. So it should always sync the time at least once with systemd-timesyncd at early boot, once the network has been configured.

At which stage exactly did it fail? Otherwise, I suggest to leave NTP mode untouched for first boot and then apply it with the automation script. You can use this command after chrony was installed:

/boot/dietpi/func/dietpi-set_software ntpd-mode 0
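
As a rough sketch of that suggestion, an Automation_Custom_Script.sh could install chrony first and only then switch the mode. Only the dietpi-set_software call comes from the comment above. The function name, the two overridable variables, and installing via apt-get are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: install chrony first, then switch DietPi's NTP mode to 0.
# APT_GET and DIETPI_SET_SOFTWARE are hypothetical override hooks (handy for testing).
APT_GET="${APT_GET:-apt-get}"
DIETPI_SET_SOFTWARE="${DIETPI_SET_SOFTWARE:-/boot/dietpi/func/dietpi-set_software}"

install_chrony_then_switch_mode() {
    # Stop here if chrony could not be installed, so the NTP mode is left untouched.
    "$APT_GET" -y install chrony || return 1
    # Command quoted in the thread; it runs only once chrony is present.
    "$DIETPI_SET_SOFTWARE" ntpd-mode 0
}

# In a real script the function would be called at the end:
# install_chrony_then_switch_mode
```

With this ordering, first boot still runs with the default NTP mode, so the initial dietpi-update check sees a synced clock.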

wiedehopf commented 5 months ago

To reproduce minimally, I loaded a fresh image from the DietPi website: DietPi_RPi-ARMv8-Bookworm.img.xz

No worries about the password in the config, it's public / randomized on build: dietpi.txt

ssh into the image and it shows this:

[two screenshots, not preserved]

Choosing take over and cancel to investigate the state of the image:

root@adsb-feeder:~# cat /boot/dietpi/.version
G_DIETPI_VERSION_CORE=9
G_DIETPI_VERSION_SUB=1
G_DIETPI_VERSION_RC=1
G_GITBRANCH='master'
G_GITOWNER='MichaIng'

The logs don't seem to be very useful:

root@adsb-feeder:/var/tmp/dietpi/logs# ls
dietpi-firstboot.log  dietpi-ramlog.log  fs_partition_resize.log

dietpi-firstboot.log: http://sprunge.us/N7dVni
dietpi-ramlog.log: http://sprunge.us/FGRdL2
fs_partition_resize.log: http://sprunge.us/7UMRUC

Anything else in regards to logs I can get you? I'm pretty sure you could reproduce with the above, but I'm happy to investigate further.

Edit: journalctl: http://sprunge.us/ovZodp

wiedehopf commented 5 months ago

> At which stage exactly did it fail? Otherwise, I suggest to leave NTP mode untouched for first boot and then apply it with the automation script. You can use this command after chrony was installed:

That will be a perfectly good workaround for the use case in question. (I've already tested a similar workaround that just uses sed to modify the config, because I wasn't aware of the mentioned command. It worked perfectly fine.)
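
The exact sed call is not shown in the thread. A minimal sketch of such an edit follows, assuming the setting sits on its own `CONFIG_NTP_MODE=` line. The function name and the /boot/dietpi.txt path in the example call are assumptions.

```shell
# Rewrite the CONFIG_NTP_MODE line in a dietpi.txt-style file.
# Usage: set_ntp_mode_in_config <file> <mode>
set_ntp_mode_in_config() {
    config_file="$1"
    mode="$2"
    tmp_file="$(mktemp)" || return 1
    # Replace the value on the existing line; all other lines pass through unchanged.
    if sed "s/^CONFIG_NTP_MODE=.*/CONFIG_NTP_MODE=${mode}/" "$config_file" > "$tmp_file"; then
        # Write back through the original file so its ownership and mode are kept.
        cat "$tmp_file" > "$config_file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp_file"
    return "$status"
}

# Example call after chrony has been installed (path is an assumption):
# set_ntp_mode_in_config /boot/dietpi.txt 0
```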

MichaIng commented 5 months ago

The early setup steps all succeeded. Is there some /var/log/dietpi-update.log or /var/tmp/dietpi/logs/dietpi-firstrun-setup.log?

wiedehopf commented 5 months ago

Not that I can see:

root@DietPi:~# ls /var/log/
README  btmp  lastlog  private  wtmp
root@DietPi:~# ls /var/tmp/dietpi/logs/
dietpi-firstboot.log  dietpi-ramlog.log  fs_partition_resize.log

wiedehopf commented 5 months ago

dietpi-update stage 0 log: http://sprunge.us/oiBnJG

To get the log i added logging to this line: https://github.com/MichaIng/DietPi/blob/master/dietpi/dietpi-login#L188 (Probably good to add logging for that regardless of this issue)

The systemd-timesyncd service seems to only run after that first dietpi-update. (It does run eventually, as the journal shows, but it hasn't run yet at that point, so the time isn't set yet.) In other NTP modes dietpi-update specifically starts systemd-timesyncd via run_ntpd when starting, but for NTP=0 it does not.

wiedehopf commented 5 months ago

Yeah, that approach didn't work because it doesn't wait for the network or for timesync success. (The commit above that references the issue wasn't really ready.)
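
For illustration, "waiting for timesync success" could look like the polling loop below. This is a sketch, not what run_ntpd does. It relies on the flag file /run/systemd/timesync/synchronized, which systemd-timesyncd touches after a successful synchronization on reasonably recent systemd versions. The function name and the timeout parameter are hypothetical.

```shell
# Block until systemd-timesyncd has synced the clock, or give up after a timeout.
# Usage: wait_for_timesync [flag_file] [timeout_seconds]
wait_for_timesync() {
    flag_file="${1:-/run/systemd/timesync/synchronized}"
    timeout_s="${2:-60}"
    waited=0
    # Poll once per second until the flag file appears.
    while [ ! -e "$flag_file" ]; do
        if [ "$waited" -ge "$timeout_s" ]; then
            return 1  # timed out, clock may still be wrong
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example: wait up to two minutes before running anything that needs HTTPS.
# wait_for_timesync /run/systemd/timesync/synchronized 120 || echo "time not synced" >&2
```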

It's quite tricky to add some sort of exception for $G_DIETPI_INSTALL_STAGE == 0 into dietpi/func/run_ntpd. I don't have a good idea for it, and it's probably not that important.

In regards to the logging, I've tested some simple changes and made a PR, but feel free to just close the PR and add logging as you see fit :) https://github.com/MichaIng/DietPi/pull/6988

MichaIng commented 5 months ago

Weird: In the journalctl log it shows that the time was correctly set:

systemd-timesyncd[297]: Initial clock synchronization to Tue 2024-03-26 09:36:01.879213 GMT.

But during the APT update, it is at an older time. As if it somehow got set back later.

EDIT: Ah right, because it is already set to mode 0, it is not waiting for the time sync. Okay, indeed this is not optimal: the systemd-timesyncd service is only disabled later, after dietpi-software ran, but run_ntpd already stops waiting for the time sync before that. It would be better to have it read the mode from elsewhere, so that the effective setting always matches the state of the systemd unit and is only set together with dietpi-set_software ntpd-mode after first run setup has otherwise finished. But I do not like to have multiple flags. I have to think about how to do this best.

But in general, it actually also makes sense to set the time sync mode to 0 only after an additional time sync daemon has been installed, not before. Probably best is to better/clearer split dietpi.txt into sections: One for things which are intended to be set before first boot to be applied during first run setup automatically, and one which contains only flags for things which are not fully effective alone, or cause issues when changing them directly. Such could also be put into a dedicated file.

For this particular case we could also force the updater to wait for timesyncd only during first boot. Just as you suggested.
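
The condition for that could be sketched as follows. It assumes that G_DIETPI_INSTALL_STAGE is 0 during first boot, as the comments above suggest, and takes mode and stage as plain arguments. The function name is hypothetical, and this is not how run_ntpd is currently written.

```shell
# Decide whether the updater should block until the clock is synced.
# Usage: should_wait_for_timesync <ntp_mode> <install_stage>
# Returns 0 (wait) or 1 (do not wait).
should_wait_for_timesync() {
    ntp_mode="$1"
    install_stage="$2"
    # First boot (assumed stage 0): always wait, even with mode 0,
    # because the HTTPS-based update check needs a sane clock.
    [ "$install_stage" = "0" ] && return 0
    # After first boot: mode 0 leaves time sync to something else, e.g. chrony.
    [ "$ntp_mode" = "0" ] && return 1
    return 0
}

# Example:
# should_wait_for_timesync "$CONFIG_NTP_MODE" "$G_DIETPI_INSTALL_STAGE" && echo "waiting for time sync"
```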

wiedehopf commented 5 months ago

> Probably best is to better/clearer split dietpi.txt into sections: One for things which are intended to be set before first boot to be applied during first run setup automatically, and one which contains only flags for things which are not fully effective alone, or cause issues when changing them directly. Such could also be put into a dedicated file.

Well the issue with this option is that settings 1-4 are perfectly valid to set before first boot, just option 0 is not. So possibly just a remark that you shouldn't set 0 before first boot is complete? That of course makes the document longer but maybe that's the way?
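
Such a remark might read roughly like the fragment below. The wording is hypothetical and not the current dietpi.txt text, and the value 2 is only an example of one of the settings 1-4.

```shell
# Hypothetical wording for a dietpi.txt remark:
# Values 1-4 may be set before first boot.
# Do not set 0 before first boot has completed. Install the replacement time
# sync daemon first, then run: /boot/dietpi/func/dietpi-set_software ntpd-mode 0
CONFIG_NTP_MODE=2
```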