Closed Nadahar closed 2 years ago
openHABian isn't really doing anything to DHCP IP assignment so what you see is pretty much what you get from stock Raspberry Pi OS. As you say you don't have any problem I am not seeing what we as openHABian should be doing about it. To get your question answered I suggest you check out some Raspberry Pi OS forum. Please let us know the outcome here, too.
I'm not too keen on contacting Raspberry Pi OS just to have them tell me to contact either you or Debian. So, I've tried to dig deeper to figure out whom to address. It hasn't really made me any wiser.
First of all, I'm in way over my head; I don't think I've touched Debian (including Ubuntu and other derivatives) since 2012-13. Their dependency management really disappointed me, and it seems like not much has improved since then. What I'm trying to say is that I'm in no way sure that my findings are correct, but this is what I think is happening.
The "source" of the problem, arguably, seems to be `ifupdown`, which is used implicitly by Debian to process the `interfaces` configuration. Look at the implementation of the `dhcp` method here: https://salsa.debian.org/debian/ifupdown/-/blob/4352ab3b8bafc0a73e2aed1f697d01cab29be4a6/inet.defn#L78-L109
The description even states specifically that:
This method may be used to obtain an address via DHCP with any of the tools: dhclient, udhcpc, dhcpcd (They have been listed in their order of precedence.).
I'm not sure what "language" is used in the file, but it seems to me like the use of the different DHCP clients is hardcoded, and that it will use whichever client it finds in the listed order. This means that if `dhclient` is installed, `ifupdown` will use it, not `dhcpcd`, even if it is present too.
Since Debian uses `dhclient` as the "standard" DHCP client, I guess one could argue that this makes some sense, but it isn't very flexible, to say the least.
This seems to me to explain why `dhclient` is started - the last line in `/etc/network/interfaces`, which reads `iface default inet dhcp`, will fire up `dhclient` if it exists in `/sbin`.
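A minimal sketch of the "first client that exists wins" precedence described above. This is not the actual `ifupdown` code: the function name `pick_dhcp_client` and its directory argument are made up for illustration, and only the order (dhclient, udhcpc, dhcpcd) is taken from the method description.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of ifupdown: print the first DHCP client
# found as an executable in the given directory (default /sbin), using the
# precedence order quoted from the inet.defn description.
pick_dhcp_client() {
    dir="${1:-/sbin}"
    for client in dhclient udhcpc dhcpcd; do
        if [ -x "$dir/$client" ]; then
            echo "$client"
            return 0
        fi
    done
    return 1
}

# Show which client would win on this machine, if any.
pick_dhcp_client /sbin || echo "no DHCP client found in /sbin"
```

If both `dhclient` and `dhcpcd` are present, a lookup in this order never reaches `dhcpcd`, which matches the behavior described above.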
`dhcpcd` is started by systemd because it is installed and "enabled":
● dhcpcd.service - DHCP Client Daemon
Loaded: loaded (/lib/systemd/system/dhcpcd.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/dhcpcd.service.d
└─wait.conf
Active: active (running) since Sun 2022-06-05 16:51:15 CEST; 8h ago
Docs: man:dhcpcd(8)
Main PID: 911 (dhcpcd)
Tasks: 1 (limit: 4915)
CPU: 826ms
CGroup: /system.slice/dhcpcd.service
└─911 /usr/sbin/dhcpcd -w -q
The question is then, why are they both there - and whose "fault" is it? As I understand it, Raspberry Pi OS has "replaced" `dhclient` with `dhcpcd` for some reason I'm not sure of, presumably with the argument that it works better with the RPi. They have probably made sure that it is started by systemd, and most likely also make sure that `dhclient` isn't installed by default.
I'd still argue that this is a pretty "brittle" construct, as quite a few other packages will install `dhclient` and wreck this whole setup. On the other hand, with Debian's hardcoding of the `dhclient` preference, I'm not sure what would be the best solution. Using `interfaces` might have to be avoided completely, or a custom build of `ifupdown` with the hardcoded preference changed would have to be made.
Raspberry Pi OS will probably still claim that this isn't "their problem", since they make sure not to include `dhclient` (and as we all know with Linux, "everything" is the users' responsibility to handle if they dare change something).
So, I've tried to figure out why `dhclient` is installed. I certainly didn't install it. There's no package called "dhclient", but I managed to figure out that it's in the `isc-dhcp-client` package, which is installed:
> apt-file search sbin/dhclient
isc-dhcp-client: /sbin/dhclient
isc-dhcp-client: /sbin/dhclient-script
isc-dhcp-client-ddns: /sbin/dhclient
When trying to trace the origin of `isc-dhcp-client` using both `/var/lib/apt/lists/raspbian.raspberrypi.org_raspbian_dists_bullseye_main_binary-armhf_Packages` and `apt-cache rdepends --installed <package>`, I end up with the following: Nothing depends on `isc-dhcp-client`, but several packages "recommend" it. Both `avahi-autoipd` and `ifupdown` are installed and "recommend" `isc-dhcp-client`. Nothing depends on `ifupdown`, but since it's an "essential" part of Debian I assume that it's "preinstalled" and most likely hasn't triggered the installation of `isc-dhcp-client`.
Following the trail with `avahi-autoipd` leads to `avahi-daemon` and `libnss-mdns`, which ends up being the same thing since the only installed package that depends on `avahi-daemon` is `libnss-mdns`. The only installed package that depends on `libnss-mdns` is `openjdk-11-jre-headless`.
`apt-config dump` reveals that:
APT::Install-Recommends "1";
APT::Install-Suggests "0";
For a long time I couldn't understand this, since I didn't think "recommended" packages were installed. But it turns out they are - by Debian default, from what I can understand.
To sum it up, it seems like
openjdk-11-jre-headless -> libnss-mdns -> avahi-autoipd -> isc-dhcp-client
This means that `dhclient` is installed and thus preferred by `ifupdown`, which breaks the DHCP client setup. This feels a lot like a "circular firing squad". Debian will say it's not their problem because it works fine as long as you just install one DHCP client. Raspberry Pi OS will say it's not their problem because it works fine as long as you don't install `dhclient`. You (openHABian) will say it's not your problem because Java is necessary to run openHAB and the fault is really upstream.
I don't really know who to "blame", but I don't think I'm the only one experiencing this problem. It seems to me like this should be a very common situation. I can't experiment, since as long as I don't have a microHDMI adapter I can't risk doing something that makes DHCP stop working, as I would be unable to contact the RPi again without reflashing the SD card. When I get one some time in the future, I can try to remove `isc-dhcp-client` and see if it corrects the behavior. It still wouldn't solve the problem though, because it would be hard to prevent it from being installed again. I don't know if APT installs "recommended" packages during "upgrade", but I wouldn't be surprised. If so, it probably wouldn't be long until it was back.
It's really "unfortunate" that I, with so little knowledge of these systems, should be the one to figure out what would be a proper solution. For every step outlined above, I've had to search and read. It takes a lot of time, and anybody that knows this a bit better would figure things out much quicker.
To sum it all up, I don't have a solution, but I think I might have found the cause.
While checking my own logic, I found that this "chain" shown in my previous post is false:
openjdk-11-jre-headless -> libnss-mdns -> avahi-autoipd -> isc-dhcp-client
The reason is that there are multiple "suggests" in this chain, which won't be automatically installed. In fact, `openjdk-11-jre-headless` only suggests `libnss-mdns`, which again only suggests `avahi-autoipd` (but it depends on `avahi-daemon`, which also merely suggests `avahi-autoipd`). `avahi-autoipd` recommends `isc-dhcp-client` though, so at least the last "link in the chain" is true.
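The distinction above (Depends and Recommends are pulled in automatically with `APT::Install-Recommends "1"`, Suggests are not) can be checked with a small filter over `apt-cache depends` output. This is only a sketch; the function name `auto_installed_deps` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical filter: reads `apt-cache depends <package>` output on stdin
# and prints only the packages from Depends, PreDepends and Recommends
# lines. Suggests lines are dropped because APT does not install them
# automatically when APT::Install-Suggests is "0".
auto_installed_deps() {
    awk '$1 ~ /^[|]?(PreDepends|Depends|Recommends):$/ { print $2 }'
}

# Example on a Debian-based system:
#   apt-cache depends avahi-autoipd | auto_installed_deps
```

Running it for each package in a suspected chain shows quickly whether a link is a real dependency or only a suggestion.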
This led me to look further for the cause of `avahi-autoipd` being installed, and I think I've found the culprit: https://github.com/openhab/openhabian/blob/5a5110ef01fadbfa2cd252c8b1f13e70cf9fe9e6/functions/system.bash#L55-L64
I don't know why you have chosen to add it, but it's in fc1943afa22208a40e0c372d6271989f9af24adb from #434, which again comes from #433. I guess AutoIP (169.254.0.0/16) can be useful in some extremely rare situation when the network is completely ad-hoc, but the vast majority will have a router in some shape or form that gives them Internet access, which makes AutoIP useless. With reference to the situation in #433, installing `avahi-autoipd` in Ubuntu/Mint isn't an issue, since they don't rely on `dhcpcd`. This isn't the case for Raspberry Pi OS though, making this a much more "troublesome" choice.
As far as I can understand, this is also the cause of #1456, which means that 99a8be1770bc95d4925ad44cd516531ba25ce851 is kind of pointless. 99a8be1770bc95d4925ad44cd516531ba25ce851 disables the very functionality (AutoIP) that `avahi-autoipd` adds - but the problem that `dhclient` is installed still remains.
I took the chance and uninstalled `avahi-autoipd` and `isc-dhcp-client` and rebooted. It was some tense seconds, but it came back up - this time with only one IP address.
Again, I have a really hard time believing that this is only a problem with my installation. It's very easy to check:
ip addr show
The command will list the network interfaces with the assigned IP addresses below each one. When using DHCP, there should be no more than one DHCP-assigned address of each type (IPv4 and IPv6) per interface.
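For a quicker check than reading the full listing, the one-line output format of `ip` can be counted per interface. A minimal sketch, assuming the usual `ip -o -4 addr show` layout (index, interface name, `inet`, address); the function name `multi_ipv4_interfaces` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and prints
# "<interface> <count>" for every interface that has more than one IPv4
# address. Prints nothing when every interface has at most one.
multi_ipv4_interfaces() {
    awk '$3 == "inet" { count[$2]++ }
         END { for (iface in count) if (count[iface] > 1) print iface, count[iface] }'
}

# Example:
#   ip -o -4 addr show | multi_ipv4_interfaces
```

Any output at all means some interface holds several IPv4 addresses, which is the symptom discussed in this issue.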
I just tried and removed avahi-autoipd and isc-dhcp-client, too.
After the next reboot the raspi didn't connect to wlan. I could still connect with ethernet and reinstall isc-dhcp-client.
After that it connects again with wlan.
That's strange - have you checked if you have both DHCP clients running (`dhclient` and `dhcpcd`)? I guess you can do it just with `ps` - I looked in the "system" log to watch what happened during boot though.
edit: I'm doing the "installation" of my new image now - my Internet connection isn't the fastest, so all the downloading takes a while - but it is connected on WLAN as it should and I can watch the progress in the browser.
edit2: I think this should be an easy way to check which DHCP clients are running: ps -ax | grep -E 'dhcp|dhclient'
My test image is done installing. It didn't work as intended though: `avahi-autoipd` isn't installed now, but `isc-dhcp-client` still is - so something else must have installed it.
Now with ps I only see dhclient but I didn't check before the uninstall.
@Larsen-Locke That makes sense - the question then is if you have manually disabled dhcpcd at some point in the past.
Try running `systemctl status dhcpcd.service` to check the status of the dhcpcd service. If it's not "enabled", run:
systemctl enable dhcpcd.service
(you might need to prefix it with `sudo`)
This is how openHABian is configured by default, which causes both to run at the same time. Once it's running, removing `isc-dhcp-client` should work.
That's strange - have you checked if you have both DHCP clients running (dhclient and dhcpcd)?
I've actually never seen dhclient run in recent times, and I've built and flashed many, many images. My standard test box has Ethernet connected, maybe that's why.
This is how openHABian is configured by default, which causes both to run at the same time.
I have not cross-checked but believe that with your PR (i.e. to not install avahi-autoipd) openHABian should be just like stock Raspi OS. So it's possibly already like that in there ?
My standard test box has an Ethernet connected, maybe that's why.
I won't claim to understand the logic of `/etc/network/interfaces` completely, but it could be. This is how it looks after a "fresh openHABian install":
# interfaces(5) file used by ifup(8) and ifdown(8)
# Include files from /etc/network/interfaces.d:
source /etc/network/interfaces.d/*
allow-hotplug wlan0
iface wlan0 inet manual
wpa-roam /etc/wpa_supplicant/wpa_supplicant.conf
iface default inet dhcp
I'm pretty confident that what "launches" `dhclient` is the last line, with the magic "dhcp" command. The whole logic with "default" eludes me though - and so does the WPA stuff. I would assume that WPA would never be initialized when using ethernet, so maybe that means that the next line won't be either? To really know how this file is parsed I fear I'd have to dive deep into the Debian code.
What I have found is that the default for Raspberry Pi OS is quite different. Their image comes with an `interfaces` file that looks like this:
# interfaces(5) file used by ifup(8) and ifdown(8)
# Include files from /etc/network/interfaces.d:
source /etc/network/interfaces.d/*
...and an `interfaces.new` that looks like this:
# interfaces(5) file used by ifup(8) and ifdown(8)
# Please note that this file is written to be used with dhcpcd
# For static IP, consult /etc/dhcpcd.conf and 'man dhcpcd.conf'
# Include files from /etc/network/interfaces.d:
source-directory /etc/network/interfaces.d
I'm starting to suspect that `isc-dhcp-client` is a "standard" Debian package that will always be there unless you explicitly uninstall it. So, it seems to me like Raspberry Pi OS prevents `dhclient` from running simply by making sure that `dhcp` doesn't appear in the `interfaces` file.
I have not cross-checked but believe that with your PR (i.e. to not install avahi-autoipd) openHABian should be just like stock Raspi OS. So it's possibly already like that in there ?
Yes, as far as I can tell, `avahi-autoipd` isn't in their image and removing it should as such not pose a problem. I think the PR is "safe" in that sense; I am already testing it. But `isc-dhcp-client` is still there (and the double IP address) despite the PR, so my assumption that `avahi-autoipd` caused `isc-dhcp-client` to be installed seems to be false. I've found vague references online that it's "always there" on Debian, so I now suspect that it's there from the very start.
After some acrobatics, I managed to connect my RPi to ethernet - and it didn't make any difference for me. Now it assigns three addresses - one for `eth0` and two for `wlan0`.
edit: After checking the startup log, IP's are assigned in this order:
I checked the status of the dhcpcd service and it was not enabled. I enabled it and checked the status:
Warning: The unit file, source configuration file or drop-ins of dhcpcd.service
● dhcpcd.service - dhcpcd on all interfaces
Loaded: loaded (/lib/systemd/system/dhcpcd.service; enabled; vendor preset: e
Drop-In: /etc/systemd/system/dhcpcd.service.d
└─wait.conf
Active: failed (Result: exit-code) since Mon 2022-06-06 22:59:53 CEST; 54s ag
Process: 365 ExecStart=/usr/lib/dhcpcd5/dhcpcd -q -w (code=exited, status=6)
Jun 06 22:59:52 rapi2 systemd[1]: Starting dhcpcd on all interfaces...
Jun 06 22:59:52 rapi2 dhcpcd[365]: Not running dhcpcd because /etc/network/inter
Jun 06 22:59:52 rapi2 dhcpcd[365]: defines some interfaces that will use a
Jun 06 22:59:52 rapi2 dhcpcd[365]: DHCP client or static address
Jun 06 22:59:53 rapi2 systemd[1]: dhcpcd.service: Control process exited, code=e
Jun 06 22:59:53 rapi2 systemd[1]: dhcpcd.service: Failed with result 'exit-code'
Jun 06 22:59:53 rapi2 systemd[1]: Failed to start dhcpcd on all interfaces.
Wlan was installed by openhabian-config.
/etc/network/interfaces:
# interfaces(5) file used by ifup(8) and ifdown(8)
source-directory /etc/network/interfaces.d
allow-hotplug wlan0
iface wlan0 inet manual
wpa-roam /etc/wpa_supplicant/wpa_supplicant.conf
iface default inet dhcp
@Larsen-Locke I find this message interesting:
systemd[1]: Starting dhcpcd on all interfaces...
dhcpcd[365]: Not running dhcpcd because /etc/network/inter
dhcpcd[365]: defines some interfaces that will use a
dhcpcd[365]: DHCP client or static address
systemd[1]: dhcpcd.service: Control process exited, code=e
systemd[1]: dhcpcd.service: Failed with result 'exit-code'
systemd[1]: Failed to start dhcpcd on all interfaces.
The fact that `dhcpcd` refuses to start seems to be an attempt at a "protection" against this very problem. I found this question which shows the same behavior. I've been wondering why this "protection" doesn't work on my installation, but I noticed that the question was about an older installation, `Raspbian GNU/Linux 9 (stretch)`.
What version are you running (`cat /etc/os-release`)?
VERSION_ID="10" VERSION="10 (buster)" VERSION_CODENAME=buster ID=raspbian
I'm running 11/bullseye, maybe something has changed here that has made the "protection" in `dhcpcd` fail.
It now seems reasonably clear to me that `isc-dhcp-client` is installed as a "part of Debian itself". It has a priority of "important", which is the second highest priority in the "Debian policy", and as such it will always be installed, even by the "minimal" installer. This list of minimal packages also supports this.
So, it seems that the idea of relying on `dhclient` not being there is futile. It is of course possible to uninstall it, but I would guess that since it's "assumed to be there by default", all kinds of strange things could happen. Combined with the hard-coded preference it gets from `ifupdown`, this makes it very hard to actually use `interfaces` with an alternative DHCP client on a Debian-based system.
I guess this explains why Raspi OS/Raspbian has chosen not to use `interfaces` for network configuration. It also calls into question the decision to use it in openHABian.
The things I still need to figure out are why the "protection" that disables `dhcpcd` doesn't always "work", and why we need `dhcpcd` at all (why was it chosen for Raspbian?). Is this all related to the "automatic hotspot" fallback functionality?
Disabling `dhcpcd` is easy, but I would assume that it's there for a reason.
This also seems to confirm the above assumption: https://serverfault.com/questions/1065565/how-to-run-dhcpcd-on-interface-eth1-only/1065571#1065571
This really smells a lot like a "war of philosophy" between different "factions" to me, I really hope it's not and that there's an actual good reason for this situation.
It seems like dhcpcd might have a very uncertain future: http://roy.marples.name/archives/dhcpcd-discuss/0003457.html
I've been trying to find some reasoning for using `dhcpcd`, but to no avail. I'm sure there must be some reason, but the first GitHub commit for raspberrypi-net-mods was done after the "switch": https://github.com/RPi-Distro/raspberrypi-net-mods/commit/bb0c51beacddb433a348f365556f7c3f348a3b41. According to this forum post no version control (git, svn etc) exists before this, so I don't know if the reason for this can be found in public. It might lurk in a forum somewhere, but it feels a lot like looking for a needle in a haystack.
I've made some progress, although I've not come to the bottom of this.
I have identified the "protection" code that prevents `dhcpcd` from starting under some circumstances, which prevents the double IP issue. It's in the package `dhcpcd5`, in the "init script" found here: https://sources.debian.org/src/dhcpcd5/7.1.0-2/debian/dhcpcd5.dhcpcd.init/#L45-L51
INTERFACES=/etc/network/interfaces
if grep -q "^[[:space:]]*iface[[:space:]]*.*[[:space:]]*inet[[:space:]]*dhcp" \
    $INTERFACES; then
    log_failure_msg "Not running $NAME because $INTERFACES"
    log_failure_msg "defines some interfaces that will use a" \
        "DHCP client"
    exit 6
fi
This is very crude and unsophisticated; as you can see, it does a simple regex check in `/etc/network/interfaces` for whether `dhcp` is used after `iface` and `inet`. It fails to check files in `/etc/network/interfaces.d/`...
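To see what a broader check would look like, here is a minimal sketch that applies the same regex from the init script to the main file and to everything in `interfaces.d`. It is not part of `dhcpcd5`; the function name `find_dhcp_stanzas` and the base-directory argument are made up for illustration.

```shell
#!/bin/sh
# Same pattern as in the dhcpcd5 init script quoted above.
PATTERN='^[[:space:]]*iface[[:space:]]*.*[[:space:]]*inet[[:space:]]*dhcp'

# Hypothetical helper: print "file:line" for every stanza that uses the dhcp
# method, in <base>/interfaces and in any file under <base>/interfaces.d.
# The exit status is non-zero when nothing matches.
find_dhcp_stanzas() {
    base="${1:-/etc/network}"
    grep -H "$PATTERN" "$base/interfaces" "$base"/interfaces.d/* 2>/dev/null
}

# Report for this machine.
find_dhcp_stanzas /etc/network || echo "no dhcp stanzas found"
```

A stanza placed in `/etc/network/interfaces.d/` is found by this sketch but would be missed by the single-file check in the init script.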
It seems to me like the above code is all that prevents the "double IP" bug from happening to everybody. What I do not (yet hopefully) understand is why this doesn't "protect" my installation.
To try to figure this out, I've now installed openHABian 1.6.6. In addition I did not disable IPv6 in `/boot/openhabian.conf` before first boot, like I've done with previous installations. On this installation, this "protection" works for me too - `dhcpcd` is prevented from running and I only get one IP address on `wlan0`. The question now is which of the two changes I made (openHABian version or not disabling IPv6) made the difference.
1.6.6 runs buster, not bullseye, and the version of `dhcpcd5` is `1:8.1.2-1+rpt1`. On the latest version, the version of `dhcpcd5` is `1:8.1.2-1+rpt5`. It's not clear to me if this is what makes the difference though, since the "script" is also there in `8.1.2` - it just seems that it's never run for some reason.
I guess the next step now is to install the latest version without disabling IPv6. It's not that pinpointing exactly what change triggered this "solves" the problem; the whole setup is quite "brittle" as it is. If the situation is like it looks at the moment, it seems to me that what makes openHABian "behave correctly" is that `dhcpcd` is prevented from starting. If so, the easy solution would be to just disable the service, and this would all be handled "the standard Debian way" using `dhclient`. Still, I'd like to know exactly why this happens, to better understand what affects what.
I guess the next step now is to install the latest version without disabling IPv6
Which is what I always do. Note I do NOT see dhclient but dhcpcd IS started. (then again note my box is on ethernet)
Which is what I always do. Note I do NOT see dhclient but dhcpcd IS started. (then again note my box is on ethernet)
If that's the case, then this is even stranger. The reason I now only have one IP with 1.6.6, and @Larsen-Locke too from what I understand, is that `dhcpcd` is prevented from running (so that `dhclient` does the job). Your installations on the other hand "work" because `dhcpcd` is running and `dhclient` is not...
Maybe I have to try to do an installation with Ethernet connected as well, just to compare. That said, since you have Ethernet connected, that probably means that you don't fill in SSID/PSK for the Wifi? That could potentially change things I guess.
I reinstalled 1.7.3 without disabling IPv6. It's the same: both `dhclient` and `dhcpcd` are running, each leasing one IPv4 address for `wlan0`. So, it seems like it is caused by something that has changed between the versions - I assume between Buster and Bullseye.
that probably means that you don't fill in SSID/PSK for the Wifi?
Yes I don't. And I'm only testing with the latest openHABian. Speaking generally, openHABian users should not use WiFi for reliability reasons if they can avoid it. Which is probably what most do and why there's no one who has hit your issue before (or no one that cares as much as you do - much appreciated). But users should also not run multihomed, whether it's 2xWiFi like your case or Eth+WiFi. You ultimately should be having a single IP address only (well, plus localhost).
I reinstalled 1.7.3
For future experiments, please only work with the latest openHABian, `main` branch, so any new users can benefit from this, too.
There have been too many unknown changes in the meantime such as the buster->bullseye move.
You should be upgrading (or better: reinstall) your production system, too.
My primary reason for moving to an RPi is to make it robust. I was planning to run it on a Linux VM on a server originally, but it's too much hassle each time there's a thunderstorm or the power is out. I've equipped it with a PiJuice HAT so that I can run for many hours without being connected to power. Normally I "hate" WiFi, but in this case I want to use WiFi exactly so that it won't be connected via an Ethernet cable and risk being damaged by lightning. Whenever a lightning storm comes close, I just unplug the power and it can keep on doing its thing. Ideally I wouldn't want it to use DHCP, because a static configuration is "safer" in that it doesn't require contact with a DHCP server during boot. But since you recommended against using static IP I was thinking of using DHCP, although I'm not quite convinced that I want to do that yet.
I'm not just doing this to "solve my problem"; it would be much easier for me to just stop `dhcpcd` and configure it statically. I'm doing it because I'm convinced that there's a bug here, and I'm trying to get to the bottom of it before moving on. I haven't gotten all my parts yet, including the "endurance" SD card I'm going to use in "production", so my "production system" doesn't exist yet.
I've only done one experiment with 1.6.6, and that was because I was out of ideas. I think that was useful, because it showed that the dual IP problem isn't there with 1.6.6/buster. I'm still not sure what exactly has led to the change, but I suspect that it's at the Debian or perhaps Raspi OS level. Except for that, I've been using 1.7.3, just without `avahi-autoipd` installed. I really don't think `avahi-autoipd` has anything to do with this anymore - the reason I thought so in the beginning was that I thought installing it was what implicitly installed `isc-dhcp-client`.
My curiosity wants me to pinpoint the exact change that has triggered the change in behavior, but pragmatically speaking it might not matter that much. The fact remains, as I now understand it, that one should not both run `dhcpcd` and use the `dhcp` keyword in `/etc/network/interfaces` on the same installation. I think that's where the "real" solution to this whole issue lies.
I'm not sure I know enough about all the circumstances this is meant to solve, I mean with "failover" from Ethernet to WiFi or vice versa, plus the WiFi hotspot function. That makes it hard to suggest a different configuration that still takes it all into account. Testing failover etc. would also be immensely easier once I have the microHDMI adapter if I lose network access.
Your work is very much appreciated, even if your motives are just 'egoistic', i.e. curiosity and the strong willingness to understand things. That being said, for your production setup, I'd still recommend to use Ethernet only. If you're really so afraid of lightning to hit, you can also get a cheapish external UPS that also can protect the Ethernet cable from overvoltage. Given a RPi is less than 50 bucks though, consider just taking the risk instead. That's a real advantage of these neat boxes many people don't see: they're cheap to replace so no (or less) need to invest in 'box' reliability such as dual power supplies, venting, battery backup etc. like you would do with 'big iron' in data centers. Just have a spare on site and you're prepared.
I'd not aim for a multihomed system with Eth-WiFi failover. I'm sure there will be issues with that way beyond interface configuration, such as services to bind to and be addressed on only one of the interfaces, so ultimately even if you properly worked out how to set this up (which would be great!), I wouldn't think it will ultimately do what you want it to, i.e. improve resilience. Also consider that any lightning-class disaster will likely affect more of your hardware than just the RPi, such as your WiFi AP, router and electrical infrastructure. Consider having a secondary SD that's configured to run on WiFi only. That you can use in case your Ethernet is unplugged or broken. If you assign the IP via DHCP based on MAC that'll effectively get you the same IP no matter if you boot with the standard 'Ethernet' SD or the WiFi one.
I haven't done any more testing, but I'd just like to address a couple of points.
I'm not "afraid" of lightning; we've had to replace two fluorescent light fixtures that have been "fried" by lightning in the last three years or so. We routinely unplug all equipment that we deem "at risk", and still things break, like permanently connected roof lights. We live close to the top of a hill, so we might be extra exposed, or the electric lines in the area might be extra vulnerable. I don't know, I just know that this is a very real threat, and that when I've removed the lights, their plastic has been so brittle that they have more or less just fallen apart, and they have smelled quite bad. No electronic equipment is going to survive it, I'm pretty sure of that. Thunderstorms are very common here from July to the end of August or so, so it's something we have to "live with". In addition, power outages caused by fallen trees during storms are quite common from November to January.
I also have other equipment that I have to shut down and start up on both sides of such "events", and it often takes me almost an hour after the power is back on or the thunderstorm is over to get everything back up and running again. My whole incentive for moving to the RPi was to eliminate one of these things; I don't want to have to shut down and restart it. I'll have to disconnect the power supply during thunderstorms, but that's all I want it to involve. Most of the things being controlled by openHAB are z-wave devices, so it doesn't really matter so much if I lose network connectivity to the RPi during an "event". The most important thing for me is that openHAB can keep running, keep communicating with the devices and be ready to go when it's all over, without all the reinitialization and general confusion that exists after a shutdown.
I'm considering a cheap UPS for the "main switch" and the WiFi antenna, but that's just for the luxury to be able to connect from laptops and mobile devices during an "event". But, having Ethernet connected is a risk in itself, it's enough that you've forgotten to disconnect the power to just one of the other wired devices and you can risk frying everything else. I'm not saying that is likely, but it's possible.
Since the network connectivity to the RPi isn't my "primary concern", I think relying on WiFi is quite acceptable. I do of course want a setup where I can plug in an Ethernet cable and get connected while it's running should the need arise. To make that work, it would be preferable to have `eth0` configured with a static IP, so that it is configured despite not reaching a DHCP server at boot time. As I see it, it's vital to be able to connect to order a shutdown to avoid corrupting the file system, especially with the heavy memory cache use openHABian is configured for. Luckily the PiJuice can initiate `halt` when the battery reaches a configurable level or when one of its buttons is held for a configurable period of time. This means the risk of a "dirty shutdown" should be minimal.
I'm in the middle of "building" a PWM fan solution with real fan speed control for it now too - quite opposite your "disposable" philosophy. There are two reasons why I see it differently: 1) Money isn't the only concern; all the hassle, the downtime, the ordering of new parts, having to remember all the details I had already forgotten etc. is also a "cost". 2) They aren't really cheap at this time at least, because of the general "situation" in the world. I bought my 4B 4GB for around €90 second hand. If I wanted to buy one new here in Norway, I'd have to either order it from abroad and pay insane import fees and tax (fees and taxes alone are more than €50), or I'd have to pay more than €100 and wait until October or November to receive it (according to their "estimates"). So, for that to be any kind of "safety", I'd really have to buy two from the beginning, so that I already had a spare one. I'd still have to replace it at a time that wouldn't necessarily be very convenient. If it were to die, I think it would be easier for me to just "transfer" the openHAB configuration to a server running a VM while I wait for a new one. If they become readily available and cheap again in the future, I might see this differently though.
Regarding the failover I mentioned, I think you misunderstood me somewhat. I wasn't thinking of a failover that would actually let openHAB continue to play ball nicely. I was rather thinking of something that makes sure you can still connect via SSH to initiate a graceful shutdown or whatever the need would be. I agree that making everything work with multiple IP addresses isn't very realistic; it's the same reason why I'm pretty sure that the current "double IP assignment" issue will pose problems. When writing software that binds to sockets, there's no "good" solution for handling this. Sometimes the circumstances will allow you to bind to any/0.0.0.0, but all too often that's not possible for one reason or another. Binding to multiple addresses then usually means running multiple threads listening on multiple sockets and then coordinating the resulting mess. Guessing the "correct" IP to use if you have to just pick one is also very difficult; I have one such algorithm that I've had to revisit time and time again because it's simply "impossible" to make it anything close to optimal. So, I expect software generally not to handle this very elegantly; the most sensible option is probably to make it a configuration option and then just make a "wild guess" if nothing is configured. That still makes multiple addresses on DHCP a challenge: you can't configure a dynamic address in a configuration file (or at least you shouldn't), and the logic for "guessing" will vary. Then there's handling that this changes while the program is running - that won't happen unless the software is explicitly written to handle such events. To sum it up, I was never thinking about having openHAB keep working during a failover event, I just wanted to make sure SSH was still available.
I'm not sure I see the benefit of having two SDs with different configurations, it would mean that all the z-wave stuff, persistence, states etc would be out of whack anyway. It would essentially achieve nothing more than copying the openHAB configuration to a new installation on a computer/VM would achieve.
I have a small update. I've finally had the time to do another "fresh install" of the latest openHABian version 1.7.3, without switching to the `main` branch or doing any other kind of customization. The only thing I modified in `/boot/openhabian.conf` before installation was `hostname` (and that really shouldn't impact anything). I did not configure WiFi.
As expected, it started up with just wired network (`eth0`), a single IPv4 address and with only `dhcpcd` running. But as soon as I start `openhabian-config`, configure WiFi and reboot, I have two `wlan0` IPv4 addresses with both `dhclient` and `dhcpcd` running. `eth0` still only has a single IPv4 address. Disabling WiFi again makes it return to the initial state where I only have a single IPv4 address.
I strongly suspect that this issue exists on all Bullseye-based openHABian installations with WiFi configured.
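For anyone who wants to check their own installation, the symptom can be spotted by counting IPv4 addresses per interface in the one-line output of `ip -4 -o addr show`. The following is a rough Python sketch. The function names are made up, and the sample text is an abbreviated, invented example using documentation addresses (192.0.2.x), not output from my RPi.

```python
from collections import defaultdict

# Invented, abbreviated sample of `ip -4 -o addr show` output (not real data).
SAMPLE_IP_OUTPUT = """\
1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 192.0.2.10/24 brd 192.0.2.255 scope global dynamic noprefixroute eth0
3: wlan0    inet 192.0.2.21/24 brd 192.0.2.255 scope global dynamic wlan0
3: wlan0    inet 192.0.2.22/24 brd 192.0.2.255 scope global secondary dynamic noprefixroute wlan0
"""


def ipv4_addresses_by_interface(ip_output):
    """Map interface name -> list of IPv4 'address/prefix' strings."""
    result = defaultdict(list)
    for line in ip_output.splitlines():
        fields = line.split()
        # Expected layout: "<index>:", "<ifname>", "inet", "<addr>/<prefix>", ...
        if len(fields) >= 4 and fields[2] == "inet":
            result[fields[1]].append(fields[3])
    return dict(result)


def interfaces_with_multiple_addresses(ip_output):
    """Return the sorted names of interfaces holding more than one IPv4 address."""
    addresses = ipv4_addresses_by_interface(ip_output)
    return sorted(name for name, addrs in addresses.items() if len(addrs) > 1)
```

On a live system the input could come from `subprocess.run(["ip", "-4", "-o", "addr", "show"], capture_output=True, text=True).stdout`. An interface can legitimately have several addresses for other reasons, so a hit here only says it is worth checking which DHCP clients are running.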
I don't think this is directly related to the double IP issue, but there is an issue with the script that "enables WiFi". I've seen it several times now, it will give this error "seemingly out of the blue" while it will work at other times:
┌──────────────────────────────────────────────────────────────────────────────┐
│ │
│ There was an error or interruption during the execution of: │
│ "30 | System Settings" │
│ │
│ Please try again. If the error persists, please read │
│ /opt/openhabian/docs/openhabian-DEBUG.md or │
│ https://github.com/openhab/openhabian/blob/main/docs/openhabian-DEBUG.md how │
│ to proceed. │
│ │
│ │
│ <Ok> │
│ │
└──────────────────────────────────────────────────────────────────────────────┘
set debugmode=maximum in openhabian.conf to see more
I have found the reason, see #1693
As far as I can understand at this point, the cause of the double IP issue is fundamentally that `/etc/network/interfaces` is populated with WiFi configuration by `wifi.bash`. This is what triggers `dhclient` - which is what is "wrong" given Raspbian/RaspiOS's decision to use `dhcpcd` instead.
It further looks like something has changed in Debian between Buster and Bullseye (I haven't managed to pinpoint exactly what, but I don't think that is very important) that has somehow "disabled" the "protection" in `dhcpcd5` where it refuses to start if `/etc/network/interfaces` is configured with a DHCP configuration. This was never a proper solution anyway, but it masked the underlying problem in this case - the fact that both `dhclient` and `dhcpcd` are configured to serve as DHCP clients.
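A quick way to see whether a given `/etc/network/interfaces` would hand an interface to `ifupdown`'s `dhcp` method is to look for `iface <name> inet dhcp` lines. Here is a small Python sketch. The function name and the sample file content are invented for illustration (only the `iface default inet dhcp` line is what I actually found in my file), and `source` includes are not followed.

```python
# Invented sample content, for illustration only.
SAMPLE_INTERFACES = """\
# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback

allow-hotplug wlan0
iface wlan0 inet manual
iface default inet dhcp
"""


def dhcp_stanzas(interfaces_text):
    """Return the names used in 'iface <name> inet dhcp' lines.

    Lines starting with '#' are treated as comments and skipped.
    """
    names = []
    for raw_line in interfaces_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if (
            len(fields) >= 4
            and fields[0] == "iface"
            and fields[2] == "inet"
            and fields[3] == "dhcp"
        ):
            names.append(fields[1])
    return names
```

If this returns anything while `dhcpcd` is also running, both clients may end up requesting a lease for the same interface, which matches what I'm seeing.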
It's clear that the "offending configuration" is created by `wifi.bash`. The reason why this was done isn't so easy to deduce. Trying to trace where this comes from, I've come up with 786698bb2afacb1535e58ce7dfcbec5ba4383f1e as the source:
https://github.com/openhab/openhabian/blob/786698bb2afacb1535e58ce7dfcbec5ba4383f1e/openhabian-setup.sh#L374-L378
This then goes on a voyage via 9b5b4982101dc60285021a68621ae2dd5273622d, b0beb7ff0fee9000005823c7b6317a06fa7e646e and c0d8f8ba943c228c071563b7887946fd48532261 before it lands where it is today: https://github.com/openhab/openhabian/blob/5a5110ef01fadbfa2cd252c8b1f13e70cf9fe9e6/functions/wifi.bash#L95-L99
The exact purpose of this code is still unclear to me. I'm not sure when the switch to `dhcpcd` took place in Raspbian/RaspiOS, but maybe this code was written before the switch and simply never removed?
My WiFi seems to work just fine without it - with the result that only `dhcpcd` is leasing an address.
Issue information:
My RPi 4 gets two IPv4 addresses assigned to `wlan0` via DHCP. I haven't really done anything to the installation, it's my first time trying openHABian, so the "installation" is pretty much as it was "flashed". The image used is the 32 bit version of openHABian v1.7.3.
While I haven't experienced a problem as a consequence of the double IP assignment, I haven't started installing bindings and "moving" the configuration from my existing openHAB 2.4 installation (on a Windows box), so I have no idea if it will actually pose a problem or not once being "set to work". It doesn't look right anyway, and as I'm planning to reserve a fixed IP for the RPi's MAC address in the DHCP server, I'm worried that it could cause problems when a lease is requested twice with the same MAC address.
As Raspberry Pi OS/openHABian is new to me, and I've mostly been using Fedora when using Linux in recent years, I don't have much of an overview of what systems are used to configure the network and such. I have managed to find what seems to be the cause of the double IP address lease though, as shown below: Both `dhclient` and `dhcpcd` seem to be operating, each reserving one IPv4 address. IPv6 is disabled as it's not used on my home network. So, I guess the question is, why are they both making leases?
Debug information:
System information:
32 bit openHABian 1.7.3 running from the onboard microSD card slot on a Raspberry Pi 4 4GB. No hardware (HATs, USB dongles etc.) attached to the RPi yet.