Closed (scottmuc closed this 5 months ago)
This is the first repave after #68, #69, and #71. Verifying that the self-hosted Git server is working will be something to watch for. Also, as part of #72, the device will now have the hostname of pippin, so I'll be on the lookout for anything that trips me up on that front.
Attached is the output from ssh ansible@192.168.2.10 -- "cat /etc/os-release; uname -a; dpkg -l" > state.txt
TASK [Mount music share from Windows PC] *************************************************************************
fatal: [192.168.2.102]: FAILED! => {"changed": false, "msg": "Error mounting /mnt/music: mount error(113): could not connect to 192.168.2.12Unable to find suitable address.\n"}
This is because I ran ipconfig /release and ipconfig /renew on the machine from which I'm invoking ./ansible.sh to repave this machine. Since the device I'm repaving was responsible for giving this PC the IP of 192.168.2.12, my PC is no longer using that IP. I also forgot to note this as a significant change since the last repave. The mount creates a pretty strong dependency on my PC using this IP.
This SMB mount replaces the previous exFAT USB drive that was attached to the PI.
I'm not going to think about fixing this elegantly and will just skip this mount for now. This will cause navidrome to fail to run, but shouldn't block the repave script from going further.
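One way to make this kind of failure more obvious next time would be a pre-flight check before the mount is attempted. This is only a sketch and not what the playbook does: smb_reachable is a hypothetical helper, and it assumes bash and the coreutils timeout command are available and that the share is served on the default SMB port 445.

```shell
# Hypothetical pre-flight helper (not part of the playbook).
# Succeeds only if a TCP connection to HOST:PORT opens within 3 seconds.
smb_reachable() {
  host=$1
  port=${2:-445}   # 445 is the standard SMB-over-TCP port
  # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the wait.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example usage (IP taken from the error above):
# smb_reachable 192.168.2.12 || echo "music share host unreachable, skipping mount" >&2
```

A check like this does not remove the dependency on the PC's IP; it only turns a mount error into an explicit message.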
root@pippin:~# systemctl status nginx
× nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Sat 2024-06-08 19:16:38 BST; 30s ago
Docs: man:nginx(8)
Process: 2506 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=1/FAIL>
CPU: 22ms
Jun 08 19:16:38 pippin systemd[1]: Starting nginx.service - A high performance web server and a reverse proxy ser>
Jun 08 19:16:38 pippin nginx[2506]: 2024/06/08 19:16:38 [emerg] 2506#2506: host not found in upstream "pi.home.sc>
Jun 08 19:16:38 pippin nginx[2506]: nginx: configuration file /etc/nginx/nginx.conf test failed
Jun 08 19:16:38 pippin systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
Jun 08 19:16:38 pippin systemd[1]: nginx.service: Failed with result 'exit-code'.
Jun 08 19:16:38 pippin systemd[1]: Failed to start nginx.service - A high performance web server and a reverse pr>
This occurred while I was trying to bring back the smb mount.
The first run of the automation worked because nginx started before an invalid configuration was placed in /etc/nginx/sites-enabled/home.scottmuc.com. After the reboot, nginx was not running due to the invalid configuration, so this resource was attempting to start the service. But due to the imperative ordering in the playbook, the fix would come AFTER (the fix is in the vhost tasks).
I swapped the order to get the play to run to completion, then swapped them back and reran to ensure the automation still functions.
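The ordering problem can be illustrated with a small sketch: validate the configuration first, and only start the service if validation passes. This is not the playbook's implementation. ensure_nginx_running and the two environment variables are hypothetical names; the nginx -t -q check is the same one shown in the ExecStartPre line of the systemd output above.

```shell
# Hypothetical helper: only start nginx once its configuration passes validation.
# NGINX_TEST_CMD and NGINX_START_CMD are illustrative overrides, not real nginx settings.
ensure_nginx_running() {
  test_cmd=${NGINX_TEST_CMD:-nginx -t -q}
  start_cmd=${NGINX_START_CMD:-systemctl start nginx}
  # Refuse to start while the config is broken, so the failure points at the config.
  if ! $test_cmd; then
    echo "nginx configuration is invalid; apply the vhost fix before starting" >&2
    return 1
  fi
  $start_cmd
}
```

With a guard like this, a broken vhost file shows up as a configuration error instead of a failed service start.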
The above commit was pushed to git.scottmuc.com before syncing with GitHub! I think the self-hosted git repaved without a hitch. There's the usual clearing of ~/.ssh/known_hosts locally, but that's pretty routine.
The urls:
no longer resolve since I've removed public resolution of private IPs (https://github.com/scottmuc/infrastructure/commit/2778c99613f6d8035e060ff29307bc2cc3d67b17). Since I haven't really finished the whole hostname migration thing, I'm swapping FQDNs to IPs for the time being. I had to update the configuration in grafana (not in ansible, via the admin UI) as well. The next commit will have an update to the repave validation.
A bit bumpier than previous ones, but I had made significant changes to the system in the last 3 months. All the fixes were straightforward, for me at least. These issues were less about the outside world changing things and more about me having set up some possibilities for repave issues.
Once again, this reminds me that repaving drives out a lot more automation bugs than re-running automation after a patch.
Attached are the new machine details:
Calling this done... will probably follow up with some potential improvements tomorrow, but as of now, this is a functional pi... I mean pippin.
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
Linux pippin 6.6.31+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.31-1+rpt1 (2024-05-29) aarch64 GNU/Linux
Yay for Repaving!
As much as possible is documented inline in this issue template. In case of problems you may find help by viewing all the previous repave issues. Have fun!
Things to do with the existing build
[x] Enable DHCP on the router, remove port mapping and statically assign network to PC
Insert screenshots here ;-)
[x] Shutdown PI
Make sure the USB drive has spun down before doing any work.
sudo shutdown -h now
[x] Create SD card with the latest Raspberry Pi OS
Using the SD card in the now powered down PI.
The new installer has options to enable SSH and create a user.
installer download
Note: check whether the underlying Debian distribution is changing, as this might result in some issues in the playbook execution.
The Bookworm 64-bit lite image seems to work for now. Note: as of v1.8.4 of the Imager software, be sure not to select "no filtering" in the Raspberry Pi Device filter.
Post OS install steps on desktop
[x] Ensure a working ansible environment
This will exercise the asdf setup.
[x] Turn on the PI and note the IP obtained from the Router
[x] Clean up old host keys
The new instance will have new host keys so to ensure host key warning messages don't distract us from the repaving, run the following:
[x] Transfer local public ssh key to PI
In order to avoid the use of sshpass, copy the current session's public SSH key to ~/.ssh/authorized_keys of the pi user on the PI. This user is only necessary to run the bootstrap playbook (which creates an admin ansible user) and will be subsequently cleaned up.
ssh-copy-id pi@<pi ip>
[x] Bootstrap with Ansible
./ansible.sh and select the bootstrap-playbook.yml
[x] Add the PI port forwarding
Needed for the certbot ACME challenge in the next step.
[x] Complete full configuration
./ansible.sh and select the main-playbook.yml
[x] Reboot PI
[x] Re-add port mapping to the static IP
[x] Disable DHCP on the router
[x] Deploy goodenoughmoney.com
[x] Clean up host key for ephemeral IP
Remove host key reference to the temporary IP that was used to bootstrap the device. This cleanup will ensure that an error won't occur in the next refresh if the same IP is used again.
[x] Make this template slightly better
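The two host key cleanup steps in this checklist can be sketched as follows. This assumes OpenSSH's ssh-keygen -R is used to drop the stale entry; forget_host_key is a hypothetical wrapper, and the command the template originally used may differ.

```shell
# Hypothetical wrapper around ssh-keygen -R (OpenSSH).
# Removes every key recorded for HOST from a known_hosts file.
forget_host_key() {
  host=$1
  known_hosts=${2:-$HOME/.ssh/known_hosts}
  # Nothing to clean up if the file does not exist yet.
  [ -f "$known_hosts" ] || return 0
  # ssh-keygen rewrites the file and keeps a backup alongside it with an .old suffix.
  ssh-keygen -R "$host" -f "$known_hosts" >/dev/null 2>&1
}

# Example usage, once for the temporary bootstrap IP and once for the static one:
# forget_host_key "<pi ip>"
```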
How Do I Know I Am Done?
[x] https://www.goodenoughmoney.com/ displays stuff
[x] https://home.scottmuc.com/music/ loads navidrome and the music is playable
[x] http://prometheus.home.scottmuc.com:9090/ loads and has data
[x] http://grafana.home.scottmuc.com:3000/ loads and has data
[x] ipconfig /release and then ipconfig /renew works
[x] nslookup analytics.google.com is refused
[x] Print out newly repaved machine details
cat /etc/os-release && uname -a
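The URL checks in this list could be scripted roughly like this. It is a sketch and not the repository's actual repave validation: check and run_repave_checks are hypothetical names, curl is assumed to be installed, and a successful HTTP response only shows that a page loads, not that it "has data" or that music is playable.

```shell
# Hypothetical check runner: prints "ok" or "FAIL" plus a label for each command.
check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "ok   $label"
  else
    echo "FAIL $label"
    return 1
  fi
}

# URLs are the ones listed in the checklist above.
# curl -f makes HTTP errors (status 400 and up) count as failures.
run_repave_checks() {
  failed=0
  check "goodenoughmoney" curl -fsS https://www.goodenoughmoney.com/ || failed=1
  check "navidrome" curl -fsS https://home.scottmuc.com/music/ || failed=1
  check "prometheus" curl -fsS http://prometheus.home.scottmuc.com:9090/ || failed=1
  check "grafana" curl -fsS http://grafana.home.scottmuc.com:3000/ || failed=1
  return "$failed"
}
```

Calling run_repave_checks from a machine on the home network prints one line per URL and returns non-zero if any of them failed.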