NetworkConfiguration / dhcpcd

DHCP / IPv4LL / IPv6RA / DHCPv6 client.
https://roy.marples.name/projects/dhcpcd
BSD 2-Clause "Simplified" License
348 stars 112 forks source link

Bonding (wire+wifi) and the randomise_hwaddr parameter #99

Open morfikov opened 2 years ago

morfikov commented 2 years ago

I noticed that the randomise_hwaddr option can be specified in the /etc/dhcpcd.conf file, and it's supposed to force a hardware address randomisation when the interface is brought up or when the carrier is lost. This works well for a standard interfaces like eth0. But I'm not really using eth0, and instead I use the linux bonding feature. My current setup utilizes two interfaces (one wired, and the other wireless), and they both are configured at the same time, but only one of them is actively passing packets (when the wire isn't plugged in, the system switches to WiFi automatically, and when the wire is plugged on, the system uses the wired connection, and so on). When I first tested the randomise_hwaddr option, I had the WiFi hard kill switch ON, which means the wlan0 interface wasn't available at the time of configuring the bond0 interface. In such case the randomise_hwaddr option does its job and I can see in the syslog the following messages:

systemd[1]: Starting networking.service...
systemd-udevd[191725]: Using default interface naming scheme 'v250'.
ifup[191724]: RTNETLINK answers: Operation not supported
ifup[191729]: Waiting for a max of 0 seconds for none to become available.
systemd-udevd[191735]: Using default interface naming scheme 'v250'.
ifup[191791]: RTNETLINK answers: Operation not supported
ifup[191795]: Waiting for a max of 0 seconds for none to become available.
kernel: bond0: (slave eth0): Enslaving as a backup interface with a down link
kernel: bond0: (slave wlan0): Opening slave failed
ifup[191896]: Failed to enslave wlan0 to bond0. Is bond0 ready and a bonding interface ?
ifup[191958]: no such user dhcpcd
ifup[191958]: dhcpcd-9.4.1 starting
dhcpcd[191958]: no such user dhcpcd
dhcpcd[191958]: dhcpcd-9.4.1 starting
ifup[191958]: DUID 00:04:d6:97:6e:01:51:46:11:cb:80:87:92:9c:ba:e3:d7:de
dhcpcd[191960]: DUID 00:04:d6:97:6e:01:51:46:11:cb:80:87:92:9c:ba:e3:d7:de
ifup[191958]: bond0: waiting for carrier
dhcpcd[191960]: bond0: waiting for carrier
ifup[191958]: bond0: carrier acquired
dhcpcd[191960]: bond0: carrier acquired
dhcpcd[191960]: bond0: adding address fe80::2af4:8fbb:6ef:786a
ifup[191958]: bond0: adding address fe80::2af4:8fbb:6ef:786a
dhcpcd[191960]: bond0: soliciting an IPv6 router
ifup[191958]: bond0: soliciting an IPv6 router
dhcpcd[191960]: bond0: carrier lost
ifup[191958]: bond0: carrier lost
dhcpcd[191960]: bond0: deleting address fe80::2af4:8fbb:6ef:786a
ifup[191958]: bond0: deleting address fe80::2af4:8fbb:6ef:786a
dhcpcd[191960]: ps_sendpsmmsg: Bad file descriptor
ifup[191958]: ps_sendpsmmsg: Bad file descriptor
dhcpcd[191960]: bond0: old hardware address: 3a:e5:ff:e0:15:65
ifup[191958]: bond0: old hardware address: 3a:e5:ff:e0:15:65
ifup[191958]: bond0: new hardware address: 82:ed:d5:6a:89:c3
ifup[191958]: bond0: old hardware address: 82:ed:d5:6a:89:c3
ifup[191958]: bond0: new hardware address: 3a:e5:ff:e0:15:65
dhcpcd[191960]: bond0: new hardware address: 82:ed:d5:6a:89:c3
dhcpcd[191960]: bond0: old hardware address: 82:ed:d5:6a:89:c3
dhcpcd[191960]: bond0: new hardware address: 3a:e5:ff:e0:15:65
kernel: e1000e 0000:00:19.0 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
dhcpcd[191960]: bond0: carrier acquired
kernel: bond0: (slave eth0): link status up, enabling it in 0 ms
kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
kernel: bond0: (slave eth0): making interface the new active one
kernel: bond0: active interface up!
kernel: IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
ifup[191958]: bond0: carrier acquired
dhcpcd[191960]: bond0: adding address fe80::bbb7:ca7e:385b:2b19
ifup[191958]: bond0: adding address fe80::bbb7:ca7e:385b:2b19
dhcpcd[191960]: bond0: soliciting an IPv6 router
ifup[191958]: bond0: soliciting an IPv6 router
vnstatd[3773]: Info: Interface "bond0" enabled.
dhcpcd[191960]: bond0: Router Advertisement from fe80::ee08:6bff:fe84:15d5
dhcpcd[191960]: bond0: adding address fd80:24b5:db14:0:1f68:72ce:5886:f94b/64
ifup[191958]: bond0: Router Advertisement from fe80::ee08:6bff:fe84:15d5
ifup[191958]: bond0: adding address fd80:24b5:db14:0:1f68:72ce:5886:f94b/64
ifup[191958]: bond0: adding route to fd80:24b5:db14::/64
ifup[191958]: bond0: soliciting a DHCPv6 lease
dhcpcd[191960]: bond0: adding route to fd80:24b5:db14::/64
dhcpcd[191960]: bond0: soliciting a DHCPv6 lease
ifup[191958]: bond0: soliciting a DHCP lease
dhcpcd[191960]: bond0: soliciting a DHCP lease
dhcpcd[191960]: bond0: ADV fd80:24b5:db14::6ed/128 from fe80::ee08:6bff:fe84:15d5
ifup[191958]: bond0: ADV fd80:24b5:db14::6ed/128 from fe80::ee08:6bff:fe84:15d5
dhcpcd[191960]: bond0: REPLY6 received from fe80::ee08:6bff:fe84:15d5
ifup[191958]: bond0: REPLY6 received from fe80::ee08:6bff:fe84:15d5
ifup[191958]: bond0: adding address fd80:24b5:db14::6ed/128
dhcpcd[191960]: bond0: adding address fd80:24b5:db14::6ed/128
ifup[191958]: bond0: renew in 21600, rebind in 34560, expire in 43200 seconds
dhcpcd[191960]: bond0: renew in 21600, rebind in 34560, expire in 43200 seconds
ifup[191958]: forked to background, child pid 191960
systemd[1]: Finished networking.service.
dhcpcd[191960]: bond0: offered 192.168.1.232 from 192.168.1.1
dhcpcd[191960]: bond0: leased 192.168.1.232 for 43200 seconds
dhcpcd[191960]: bond0: adding route to 192.168.1.0/24
dhcpcd[191960]: bond0: adding default route via 192.168.1.1

But when I turn OFF the hard kill switch, and the WiFi card was available during configuration of the bonding interface, the randomise_hwaddr doesn't seem to work anymore:

systemd[1]: Starting networking.service...
systemd-udevd[186899]: Using default interface naming scheme 'v250'.
ifup[186898]: RTNETLINK answers: Operation not supported
ifup[186902]: Waiting for a max of 0 seconds for none to become available.
systemd-udevd[186909]: Using default interface naming scheme 'v250'.
ifup[186968]: RTNETLINK answers: Operation not supported
ifup[186972]: Waiting for a max of 0 seconds for none to become available.
kernel: bond0: (slave eth0): Enslaving as a backup interface with a down link
kernel: iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0
kernel: iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0
kernel: bond0: (slave wlan0): Enslaving as a backup interface with a down link
ifup[187154]: no such user dhcpcd
dhcpcd[187154]: no such user dhcpcd
ifup[187154]: dhcpcd-9.4.1 starting
dhcpcd[187154]: dhcpcd-9.4.1 starting
ifup[187154]: DUID 00:04:d6:97:6e:01:51:46:11:cb:80:87:92:9c:ba:e3:d7:de
dhcpcd[187156]: DUID 00:04:d6:97:6e:01:51:46:11:cb:80:87:92:9c:ba:e3:d7:de
ifup[187154]: dhcpcd_prestartinterface: Device or resource busy
dhcpcd[187156]: dhcpcd_prestartinterface: Device or resource busy
ifup[187154]: bond0: waiting for carrier
dhcpcd[187156]: bond0: waiting for carrier
kernel: e1000e 0000:00:19.0 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
dhcpcd[187156]: bond0: carrier acquired
kernel: bond0: (slave eth0): link status up, enabling it in 0 ms
kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
kernel: bond0: (slave eth0): making interface the new active one
kernel: bond0: active interface up!
kernel: IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
ifup[187154]: bond0: carrier acquired
dhcpcd[187156]: bond0: adding address fe80::ac0d:6946:180:8093
ifup[187154]: bond0: adding address fe80::ac0d:6946:180:8093
vnstatd[3773]: Info: Interface "bond0" enabled.
dhcpcd[187156]: bond0: soliciting an IPv6 router
ifup[187154]: bond0: soliciting an IPv6 router
kernel: wlan0: authenticate with ec:08:6b:84:15:d4
kernel: wlan0: send auth to ec:08:6b:84:15:d4 (try 1/3)
kernel: wlan0: authenticated
kernel: wlan0: associate with ec:08:6b:84:15:d4 (try 1/3)
kernel: wlan0: RX AssocResp from ec:08:6b:84:15:d4 (capab=0x11 status=0 aid=2)
kernel: wlan0: associated
kernel: bond0: (slave wlan0): link status up, enabling it in 500 ms
kernel: bond0: (slave wlan0): invalid new link 3 on slave
dhcpcd[187156]: bond0: Router Advertisement from fe80::ee08:6bff:fe84:15d5
ifup[187154]: bond0: Router Advertisement from fe80::ee08:6bff:fe84:15d5
ifup[187154]: bond0: adding address fd80:24b5:db14:0:d336:ad06:30f9:a598/64
ifup[187154]: bond0: adding route to fd80:24b5:db14::/64
ifup[187154]: bond0: soliciting a DHCPv6 lease
dhcpcd[187156]: bond0: adding address fd80:24b5:db14:0:d336:ad06:30f9:a598/64
dhcpcd[187156]: bond0: adding route to fd80:24b5:db14::/64
dhcpcd[187156]: bond0: soliciting a DHCPv6 lease
dhcpcd[187156]: bond0: soliciting a DHCP lease
ifup[187154]: bond0: soliciting a DHCP lease
dhcpcd[187156]: bond0: offered 192.168.1.150 from 192.168.1.1
ifup[187154]: bond0: offered 192.168.1.150 from 192.168.1.1
dhcpcd[187156]: bond0: leased 192.168.1.150 for 86400 seconds
ifup[187154]: bond0: leased 192.168.1.150 for 86400 seconds
ifup[187154]: bond0: adding route to 192.168.1.0/24
dhcpcd[187156]: bond0: adding route to 192.168.1.0/24
ifup[187154]: bond0: adding default route via 192.168.1.1
dhcpcd[187156]: bond0: adding default route via 192.168.1.1
ifup[187154]: forked to background, child pid 187156
systemd[1]: Finished networking.service.
kernel: bond0: (slave wlan0): link status definitely up, 0 Mbps full duplex
dhcpcd[187156]: bond0: ADV fd80:24b5:db14::150/128 from fe80::ee08:6bff:fe84:15d5
dhcpcd[187156]: bond0: REPLY6 received from fe80::ee08:6bff:fe84:15d5
dhcpcd[187156]: bond0: adding address fd80:24b5:db14::150/128
dhcpcd[187156]: bond0: renew in 43200, rebind in 69120, expire in 86400 seconds

I'm using up to date Debian Sid with network configured via the /etc/network/interfaces file and the WiFi config is managed by wpa_supplicant via the /etc/wpa_supplicant/wpa_supplicant.conf file (the two things are set up via networking.service, which is not the native systemd's networking mechanizm). The wole setup has been working well for years (with dhclient).

Here's the config of the bond0 interface:

auto bond0
iface bond0 inet dhcp
    metric 10
    bond-mode active-backup
    bond-miimon 100
    bond-downdelay 500
    bond-updelay 500
    bond-primary eth0
    bond-primary-reselect always
    bond-fail-over-mac none
    bond-slaves eth0 wlan0
    bond-min-links 1

allow-bond0 wlan0
iface wlan0 inet manual
    bond-give-a-chance 1
    wpa-driver nl80211
    wpa-debug-level -1
    wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf
    wpa-bridge bond0

Is there a way for randomise_hwaddr to work with bonding?

ido commented 2 years ago

Tangent: out of curiosity, why don't you use bridging rather than bonding for this use case (automatic failover)?

morfikov commented 2 years ago

I don't remember exactly why I stick with bonding (I've been using this setup for years), but bridging had/has some issues with WiFi.

rsmarples commented 2 years ago

Changing an interfaces hardware address is hard. Generally it can only be down when an interface is down and not being used by anything else. You would need to ask the kernel folks why you can't change a bonded interface's hardware address after adding a wireless interface.

If you really want dhcpcd to randomise the bonding interface my recommendation is to ensure dhcpcd is started as a service before the networking starts. This means dhcpcd has the best change of successfully changing the hardware address of the interface as soon as it appears and hopefully before other interfaces are enslaved to it.

Another option is not to use dhcpcd to randomise the bonding interface and use systemd to do it. Should be possible.

I'll also note that you've compiled dhcpcd with privsep support but not added the privsep user to the system! You should do this :) Also, there seems like a dhcpcd bug you've found because of this.

morfikov commented 2 years ago

I'll also note that you've compiled dhcpcd with privsep support but not added the privsep user to the system! You should do this :) Also, there seems like a dhcpcd bug you've found because of this.

I think, that's the default debian setup -- I just installed the package (I'm in the middle of moving away from dhclient).

If you really want dhcpcd to randomise the bonding interface my recommendation is to ensure dhcpcd is started as a service before the networking starts. This means dhcpcd has the best change of successfully changing the hardware address of the interface as soon as it appears and hopefully before other interfaces are enslaved to it.

I'll try this and see whether that helps.

morfikov commented 2 years ago

If you really want dhcpcd to randomise the bonding interface my recommendation is to ensure dhcpcd is started as a service before the networking starts.

It looks like it helped.

I just added allowinterfaces bond0 to the /etc/dhcpcd.conf file and started the dhcpcd.service .

systemd[1]: Starting dhcpcd.service...
dhcpcd[767340]: dhcpcd-9.4.1 starting
dhcpcd[767340]: dhcpcd-9.4.1 starting
dhcpcd[767340]: DUID 00:04:d6:97:6e:01:51:46:11:cb:80:87:92:9c:ba:e3:d7:de
dhcpcd[767343]: DUID 00:04:d6:97:6e:01:51:46:11:cb:80:87:92:9c:ba:e3:d7:de
dhcpcd[767340]: forked to background, child pid 767342
dhcpcd[767343]: no valid interfaces found
dhcpcd[767343]: no interfaces have a carrier
systemd[1]: Started dhcpcd.service.

Dhcpcd doesn't really do anything at this moment, since the bond0 interface isn't available at this point (it's configured via the networking.service). When I now start the networking.service with the WiFi and wire interfaces ON, then I get the following log:

systemd[1]: Starting networking.service...
systemd-udevd[767439]: Using default interface naming scheme 'v250'.
ifup[767441]: RTNETLINK answers: Operation not supported
ifup[767444]: Waiting for a max of 0 seconds for none to become available.
ifup[767507]: RTNETLINK answers: Operation not supported
ifup[767512]: Waiting for a max of 0 seconds for none to become available.
dnsmasq[213428]: Maximum number of concurrent DNS queries reached (max: 256)
dhcpcd[767343]: bond0: waiting for carrier
dhcpcd[767343]: bond0: old hardware address: 7e:32:73:79:d4:9c
dhcpcd[767343]: bond0: new hardware address: 42:cc:b3:31:ff:3a
dhcpcd[767343]: bond0: old hardware address: 42:cc:b3:31:ff:3a
dhcpcd[767343]: bond0: new hardware address: 7e:32:73:79:d4:9c
dhcpcd[767343]: bond0: carrier acquired
dhcpcd[767343]: bond0: adding address fe80::8e2d:f3d:6205:f738
dhcpcd[767343]: bond0: soliciting an IPv6 router
kernel: bond0: (slave eth0): Enslaving as a backup interface with a down link
kernel: iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0
kernel: iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0
kernel: bond0: (slave wlan0): Enslaving as a backup interface with a down link
dhcpcd[767343]: bond0: carrier lost
dhcpcd[767343]: bond0: deleting address fe80::8e2d:f3d:6205:f738
dhcpcd[767343]: dhcpcd_handlecarrier: Device or resource busy
ifup[767698]: sending commands to dhcpcd process
dhcpcd[767343]: control_free: No such file or directory
dhcpcd[767343]: control command: /sbin/dhcpcd -m 10 bond0
systemd[1]: Finished networking.service.
vnstatd[3773]: Info: Interface "bond0" enabled.
kernel: e1000e 0000:00:19.0 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
kernel: bond0: (slave eth0): link status up, enabling it in 0 ms
kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
kernel: bond0: (slave eth0): making interface the new active one
kernel: bond0: active interface up!
kernel: IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
dhcpcd[767343]: bond0: carrier acquired
dhcpcd[767343]: bond0: adding address fe80::8e2d:f3d:6205:f738
kernel: wlan0: authenticate with ec:08:6b:84:15:d4
kernel: wlan0: send auth to ec:08:6b:84:15:d4 (try 1/3)
kernel: wlan0: authenticated
kernel: wlan0: associate with ec:08:6b:84:15:d4 (try 1/3)
kernel: wlan0: RX AssocResp from ec:08:6b:84:15:d4 (capab=0x11 status=0 aid=2)
kernel: wlan0: associated
kernel: bond0: (slave wlan0): link status up, enabling it in 500 ms
kernel: bond0: (slave wlan0): invalid new link 3 on slave
dhcpcd[767343]: bond0: soliciting an IPv6 router
dhcpcd[767343]: bond0: soliciting a DHCP lease
kernel: bond0: (slave wlan0): link status definitely up, 0 Mbps full duplex
dhcpcd[767343]: bond0: Router Advertisement from fe80::ee08:6bff:fe84:15d5
dhcpcd[767343]: bond0: adding address fd80:24b5:db14:0:1e18:d6d0:468a:f14c/64
dhcpcd[767343]: bond0: adding route to fd80:24b5:db14::/64
dhcpcd[767343]: bond0: soliciting a DHCPv6 lease
dnsmasq[213428]: Maximum number of concurrent DNS queries reached (max: 256)
dhcpcd[767343]: bond0: ADV fd80:24b5:db14::bd0/128 from fe80::ee08:6bff:fe84:15d5
dhcpcd[767343]: bond0: REPLY6 received from fe80::ee08:6bff:fe84:15d5
dhcpcd[767343]: bond0: adding address fd80:24b5:db14::bd0/128
dhcpcd[767343]: bond0: renew in 21600, rebind in 34560, expire in 43200 seconds
dhcpcd[767343]: bond0: offered 192.168.1.230 from 192.168.1.1
dhcpcd[767343]: bond0: leased 192.168.1.230 for 43200 seconds
dhcpcd[767343]: bond0: adding route to 192.168.1.0/24
dhcpcd[767343]: bond0: adding default route via 192.168.1.1

And here's the simple output:

# ip addr show dev bond0 | grep -i ether -A1

    link/ether 96:df:2a:4f:13:ab brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.238/24 brd 192.168.1.255 scope global dynamic noprefixroute bond0

# systemctl restart networking.service
# ip addr show dev bond0 | grep -i ether -A1

    link/ether 1e:50:ef:13:4b:09 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.128/24 brd 192.168.1.255 scope global dynamic noprefixroute bond0
rsmarples commented 2 years ago

Well, it only works once which is probably good enough.

morfikov commented 2 years ago

Actually it works all the time when thedhcpcd.service had been started before networking.service. This works only when I manually start both services one after the other. It doesn't really work at boot phase, and the bond0 interface gets no IP. I have really no idea what's wrong. I added to the dhcpcd.service file the following:

Requires=networking.service
Before=networking.service

So the dhcpcd.service starts before the networking.service , but it still gets no IP. :]

It looks like the two services don't really like each other and don't want to play together for some reason.

Except this randomise_hwaddr parameter, dhcpcd seems to work well on my system when using only networking.service. So maybe I just let it go.

rsmarples commented 2 years ago

Hmmm. That sounds related to #84. I don't have enough information to try and fix that. Can you add this to the top of /etc/dhcpcd.conf

logfile /var/log/dhcpcd.log
debug

And paste the dhcpcd.log file here please after booting. Hopefully it gives more information.

morfikov commented 2 years ago

Basically after booting, I got the following in the file:

May 08 22:39:49 [3453]: dhcpcd-9.4.1 starting
May 08 22:39:51 [3453]: chrooting as dhcpcd to /run/dhcpcd
May 08 22:39:51 [3453]: sandbox: seccomp
May 08 22:39:51 [3494]: spawned manager process on PID 3494
May 08 22:39:51 [3494]: spawned privileged proxy on PID 3495
May 08 22:39:51 [3494]: spawned network proxy on PID 3496
May 08 22:39:51 [3494]: spawned controller proxy on PID 3497
May 08 22:39:51 [3494]: DUID 00:04:d6:97:6e:01:51:46:11:cb:80:87:92:9c:ba:e3:d7:de
May 08 22:39:51 [3494]: bond0: executing: /usr/lib/dhcpcd/dhcpcd-run-hooks PREINIT
May 08 22:39:51 [3494]: bond0: executing: /usr/lib/dhcpcd/dhcpcd-run-hooks NOCARRIER
May 08 22:39:51 [3494]: no interfaces have a carrier
May 08 22:39:51 [3494]: bond0: hardware address randomised to ca:d4:0b:37:cc:1e
May 08 22:39:51 [3494]: bond0: waiting for carrier
May 08 22:40:04 [3497]: control_free: No such file or directory

And there's no IP. I also noticed that the networking service failed with the following errors:

May 08 22:39:52 morfikownia systemd[1]: Starting networking.service...
May 08 22:39:58 morfikownia ifup[3599]: RTNETLINK answers: Operation not supported
May 08 22:39:59 morfikownia ifup[3624]: Waiting for a max of 0 seconds for none to become available.
May 08 22:40:00 morfikownia ifup[3680]: RTNETLINK answers: Operation not supported
May 08 22:40:00 morfikownia ifup[3685]: Waiting for a max of 0 seconds for none to become available.
May 08 22:40:04 morfikownia ifup[3831]: sending commands to dhcpcd process
May 08 22:40:04 morfikownia ifup[3830]: Bad system call
May 08 22:40:04 morfikownia ifup[3506]: ifup: failed to bring up bond0
May 08 22:40:04 morfikownia systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
May 08 22:40:04 morfikownia systemd[1]: networking.service: Failed with result 'exit-code'.
May 08 22:40:04 morfikownia systemd[1]: Failed to start networking.service.

I also noticed the following audit entery in the log between sending commands to dhcpcd process and Bad system call :

May 08 22:40:04 morfikownia ifup[3831]: sending commands to dhcpcd process
May 08 22:40:04 morfikownia kernel: audit: type=1326 audit(1652042404.648:1018): auid=4294967295 uid=123 gid=65534 ses=4294967295 subj==dhcpcd (complain) pid=3831 comm="dhcpcd" exe="/usr/sbin/dhcpcd" sig=31 arch=c000003e syscall=262 compat=0 ip=0x79644060aaba code=0x0
May 08 22:40:04 morfikownia ifup[3830]: Bad system call

I also saw a few times the following errors:

May 08 23:09:50 morfikownia systemd[1]: Started dhcpcd.service.
May 08 23:09:50 morfikownia systemd[1]: Starting networking.service...
May 08 23:09:50 morfikownia ifup[32020]: /etc/network/if-pre-up.d/ifenslave: 39: echo: echo: I/O error
May 08 23:09:50 morfikownia kernel: bond0: option fail_over_mac: unable to set because the bond device has slaves
May 08 23:09:50 morfikownia ifup[32020]: /etc/network/if-pre-up.d/ifenslave: 39: echo: echo: I/O error
May 08 23:09:50 morfikownia kernel: bond0: option mode: unable to set because the bond device has slaves

The only way to fix this is to remove the bond0 interface via: ip link delete dev bond0 and then restart the networking.service.

That's all I've seen so far.

Also, here's my current config:

# egrep -v ^# /etc/dhcpcd.conf | egrep -v ^$
logfile /var/log/dhcpcd.log
debug
nohook hostname
duid
persistent
vendorclassid
option domain_name_servers, domain_name, domain_search
option classless_static_routes
option interface_mtu
option rapid_commit
require dhcp_server_identifier
slaac private
nohook resolv.conf
allowinterfaces bond0
noarp
anonymous
randomise_hwaddr
rsmarples commented 2 years ago

Can you test the above patch please? It should fix the bad audit entry and hopefully allow things to work.

morfikov commented 2 years ago

I tested the patch. Ifup returns the following now:

ifup[3699]: sending commands to dhcpcd process
ifup[3699]: send OK

But still no IP after boot.

When I now restart networking.service after boot, it manages to get the IP.

Here's the log after booting when it fails to get the IP:

May 09 14:49:05 [3374]: dhcpcd-9.4.1 starting
May 09 14:49:05 [3374]: chrooting as dhcpcd to /run/dhcpcd
May 09 14:49:05 [3374]: sandbox: seccomp
May 09 14:49:05 [3380]: spawned manager process on PID 3380
May 09 14:49:05 [3380]: spawned privileged proxy on PID 3381
May 09 14:49:05 [3380]: spawned network proxy on PID 3382
May 09 14:49:05 [3380]: spawned controller proxy on PID 3383
May 09 14:49:05 [3380]: DUID 00:04:d6:97:6e:01:51:46:11:cb:80:87:92:9c:ba:e3:d7:de
May 09 14:49:05 [3380]: bond0: executing: /usr/lib/dhcpcd/dhcpcd-run-hooks PREINIT
May 09 14:49:06 [3380]: bond0: executing: /usr/lib/dhcpcd/dhcpcd-run-hooks NOCARRIER
May 09 14:49:06 [3380]: no interfaces have a carrier
May 09 14:49:06 [3380]: bond0: waiting for carrier
May 09 14:49:17 [3380]: bond0: old hardware address: 76:9a:e2:65:2e:43
May 09 14:49:17 [3380]: bond0: new hardware address: 00:21:cc:c3:05:b0
May 09 14:49:17 [3699]: sending commands to dhcpcd process
May 09 14:49:17 [3699]: send OK
May 09 14:49:17 [3380]: control command: /sbin/dhcpcd -m 10 bond0
May 09 14:49:17 [3383]: control_free: No such file or directory

I think there are two separate issues here: one is no IP after booting when bonding is used (it works after restarting networking.service afterwards), and the other is that the randomise_hwaddr parameter doesn't work with bonding. I think they aren't related.

rsmarples commented 2 years ago

That looks fine - the control_free error is nonsence and can be ignored. It's gone from the next release.

So, to continue with the issue it looks like the bond0 interface has not reported carrier up or dhcpcd is missing it somehow. After it gets to this state, what is the output of ip link show dev bond0 ?

rsmarples commented 2 years ago

Could you repeat the test manually and have ip monitor running in a terminal and save the output too please? Maybe there is something in the carrier message causing dhcpcd to skip it.

morfikov commented 2 years ago

What test?

rsmarples commented 2 years ago

Hmmmm. Maybe setup ip monitor to start before dhcpcd somehow and log it? I just realised your test is to reboot which makes that problematic.

morfikov commented 2 years ago

Yes, it fails only while booting. I'll see what I can do with ip monitor .

morfikov commented 2 years ago

I think I've managed to catch what you wanted.

I started the system in the systemd's rescue mode and here are the interfaces the system has before playing with them (before launching dhcpcd.service and networking.service):

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether d2:28:eb:22:02:9d brd ff:ff:ff:ff:ff:ff
3: ifb0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 32
    link/ether 2e:44:d9:d6:ac:46 brd ff:ff:ff:ff:ff:ff
4: ifb1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 32
    link/ether 56:15:9b:f5:26:8c brd ff:ff:ff:ff:ff:ff
5: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:21:cc:c3:05:b0 brd ff:ff:ff:ff:ff:ff
6: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 60:67:20:42:56:9c brd ff:ff:ff:ff:ff:ff

Then I started in the TTY the following command:

# (ip monitor > ipmonit) &; disown
# exit

This allowed to run the ip monitor command in the background and the boot continued as usual.

This is what was recorded in the file:

2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
inet br-docker forwarding off rp_filter loose mc_forwarding off proxy_neigh off ignore_routes_with_linkdown on
inet6 br-docker forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown on
7: br-docker: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
    link/ether 8a:ce:a0:1e:d9:e2 brd ff:ff:ff:ff:ff:ff
7: br-docker: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 8a:ce:a0:1e:d9:e2 brd ff:ff:ff:ff:ff:ff
7: br-docker    inet 10.10.10.1/24 brd 10.10.10.255 scope global br-docker
       valid_lft forever preferred_lft forever
local 10.10.10.1 dev br-docker table local proto kernel scope host src 10.10.10.1
broadcast 10.10.10.255 dev br-docker table local proto kernel scope link src 10.10.10.1 dead linkdown
10.10.10.0/24 dev br-docker proto kernel scope link src 10.10.10.1 dead linkdown
inet virbr0 forwarding off rp_filter loose mc_forwarding off proxy_neigh off ignore_routes_with_linkdown on
inet6 virbr0 forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown on
8: virbr0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
    link/ether 2e:1e:b9:4c:7f:53 brd ff:ff:ff:ff:ff:ff
8: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 2e:1e:b9:4c:7f:53 brd ff:ff:ff:ff:ff:ff
8: virbr0    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
local 192.168.122.1 dev virbr0 table local proto kernel scope host src 192.168.122.1
broadcast 192.168.122.255 dev virbr0 table local proto kernel scope link src 192.168.122.1 dead linkdown
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 dead linkdown
2: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
5: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 00:21:cc:c3:05:b0
5: eth0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 00:21:cc:c3:05:b0
5: eth0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 00:21:cc:c3:05:b0
5: eth0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 00:21:cc:c3:05:b0
5: eth0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast master bond0 state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 00:21:cc:c3:05:b0
5: eth0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast master bond0 state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 00:21:cc:c3:05:b0
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN group default event FEATURE CHANGE
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
6: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c
6: wlan0: <BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq state UNKNOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c
6: wlan0: <BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq state UNKNOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c
6: wlan0: <BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq state UNKNOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c
6: wlan0: <BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state UNKNOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c
6: wlan0: <BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state UNKNOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN group default event FEATURE CHANGE
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
6: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c
6: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state DOWN group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN group default event BONDING OPTION
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff
5: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 00:21:cc:c3:05:b0
6: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP>
    link/ether
6: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP>
    link/ether
6: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP>
    link/ether
6: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP>
    link/ether
6: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state DORMANT group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c
6: wlan0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default
    link/ether 46:33:62:08:4d:54 brd ff:ff:ff:ff:ff:ff permaddr 60:67:20:42:56:9c

And after the boot proces was completed, there was no IP.

Does this help in anything? :)

rsmarples commented 2 years ago

Very much so! We can see that bond0 is not UP. So if you reboot and have no IP, does dhcpcd start to work if you do ip link set up dev bond0?

morfikov commented 2 years ago

Yes, after I typed the command it worked:

Here's the log after typing ip link set up dev bond0 in a terminal:

May 09 16:19:19 morfikownia kernel: bond0: (slave eth0): link status up, enabling it in 0 ms
May 09 16:19:19 morfikownia kernel: bond0: (slave wlan0): link status up, enabling it in 500 ms
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: carrier acquired
May 09 16:19:19 morfikownia kernel: bond0: (slave eth0): link status up, enabling it in 0 ms
May 09 16:19:19 morfikownia kernel: bond0: (slave wlan0): link status up, enabling it in 500 ms
May 09 16:19:19 morfikownia kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
May 09 16:19:19 morfikownia kernel: bond0: (slave eth0): making interface the new active one
May 09 16:19:19 morfikownia kernel: bond0: active interface up!
May 09 16:19:19 morfikownia kernel: bond0: (slave wlan0): invalid new link 3 on slave
May 09 16:19:19 morfikownia kernel: IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: executing: /usr/lib/dhcpcd/dhcpcd-run-hooks CARRIER
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: adding address fe80::89f:6156:8fb7:4a6f
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: pltime infinity, vltime infinity
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: delaying IPv6 router solicitation for 0.8 seconds
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: delaying IPv4 for 0.4 seconds
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: reading lease: /var/lib/dhcpcd/bond0.lease
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: soliciting a DHCP lease
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: sending DISCOVER (xid 0x5818f125), next in 3.2 seconds
May 09 16:19:19 morfikownia dhcpcd[3497]: bond0: spawned BPF BOOTP on PID 7586
May 09 16:19:19 morfikownia kernel: bond0: (slave wlan0): link status definitely up, 0 Mbps full duplex
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: soliciting an IPv6 router
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: delaying Router Solicitation for LL address
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: sending Router Solicitation
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: Router Advertisement from fe80::ee08:6bff:fe84:15d5
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: adding address fd80:24b5:db14:0:aa94:732a:9e1e:31db/64
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: pltime infinity, vltime infinity
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: adding route to fd80:24b5:db14::/64
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: waiting for Router Advertisement DAD to complete
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: reading lease: /var/lib/dhcpcd/bond0.lease6
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: soliciting a DHCPv6 lease
May 09 16:19:20 morfikownia dhcpcd[3497]: bond0: delaying SOLICIT6 (xid 0x2ecd5d), next in 1.1 seconds
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: broadcasting SOLICIT6 (xid 0x2ecd5d), next in 1.0 seconds
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: SOL_MAX_RT 3600 -> 60
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: ADV fd80:24b5:db14::22a/128 from fe80::ee08:6bff:fe84:15d5
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: broadcasting REQUEST6 (xid 0xa4a6c0), next in 1.0 seconds
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: REPLY6 received from fe80::ee08:6bff:fe84:15d5
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: adding address fd80:24b5:db14::22a/128
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: pltime 43200 seconds, vltime 43200 seconds
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: fd80:24b5:db14::a16/128: became stale
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: renew in 21600, rebind in 34560, expire in 43200 seconds
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: writing lease: /var/lib/dhcpcd/bond0.lease6
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: waiting for DHCPv6 DAD to complete
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: Router Advertisement DAD completed
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: executing: /usr/lib/dhcpcd/dhcpcd-run-hooks ROUTERADVERT
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: sending NA for fd80:24b5:db14:0:aa94:732a:9e1e:31db/64
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: offered 192.168.1.217 from 192.168.1.1
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: sending REQUEST (xid 0x5818f125), next in 3.7 seconds
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: acknowledged 192.168.1.217 from 192.168.1.1
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: leased 192.168.1.217 for 43200 seconds
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: renew in 21600 seconds, rebind in 37800 seconds
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: writing lease: /var/lib/dhcpcd/bond0.lease
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: adding IP address 192.168.1.217/24 broadcast 192.168.1.255
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: adding route to 192.168.1.0/24
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: adding default route via 192.168.1.1
May 09 16:19:22 morfikownia dhcpcd[3497]: bond0: executing: /usr/lib/dhcpcd/dhcpcd-run-hooks BOUND
May 09 16:19:23 morfikownia dhcpcd[3497]: bond0: sending NA for fd80:24b5:db14:0:aa94:732a:9e1e:31db/64
May 09 16:19:23 morfikownia dhcpcd[3497]: bond0: sending NA for fd80:24b5:db14::22a/128
May 09 16:19:23 morfikownia dhcpcd[3497]: bond0: DHCPv6 DAD completed
May 09 16:19:23 morfikownia dhcpcd[3497]: bond0: executing: /usr/lib/dhcpcd/dhcpcd-run-hooks BOUND6
May 09 16:19:24 morfikownia dhcpcd[3497]: bond0: sending NA for fd80:24b5:db14:0:aa94:732a:9e1e:31db/64
May 09 16:19:24 morfikownia dhcpcd[3497]: bond0: sending NA for fd80:24b5:db14::22a/128
May 09 16:19:25 morfikownia dhcpcd[3497]: bond0: sending NA for fd80:24b5:db14::22a/128

So the question is why the interface isn't UP automatically? It also looks like the randomise_hwaddr worked this time (with wire+wifi both on).

rsmarples commented 2 years ago

I think the real question is what brought the interface DOWN? We can see from your initial logs that dhcpcd (or something else) does indeed bring the interface up and there is no code in dhcpcd itself to bring it back down again.

So I would guess the modifing the bond interface interface such as adding slaves brought it down. You could try adding the up command at the end of your bond setup in /etc/network/interfaces so it looks like this:

iface bond0 inet dhcp
    metric 10
    bond-mode active-backup
    bond-miimon 100
    bond-downdelay 500
    bond-updelay 500
    bond-primary eth0
    bond-primary-reselect always
    bond-fail-over-mac none
    bond-slaves eth0 wlan0
    bond-min-links 1
    up
morfikov commented 2 years ago

I added the following to the bond0 stanza in /etc/network/interfaces (since up alone caused errors):

up ip link set up dev $IFACE

and now after booting, everything seems to work as it should -- the bond0 interface gets an IP address.

The randomise_hwaddr parametr also seems to work now, since with every boot I get a different IP.

It's really weird that the line has to be added to make it work.

rsmarples commented 2 years ago

Generally the last action on interface setup is an up. dhcpcd doesn't know what the last action is, so if something takes it down something else needs to bring it up.

Anyway, we might be able to make dhcpcd -n $interface commands received by the master process bring the interface up as well. If you had iface bond0 static or none or whatever command doesn't start dhcp then I guess you would still need the up directive manually?

Anyway, I'm away for a week or so now so no more patches right away.

morfikov commented 2 years ago

The bonding setup I currently have has been working well for a very long time. It was working well with dhclient. It is working well with dhcpcd when dhcpcd isn't started as a separate service (it''s called by the networking.service the same dhclient was). The problem with this DOWN state in the case of bonding started to show when I enabled dhcpcd.service and ordered it before networking.service . So apparently the two services have issues when they are working together. But I don't know whether is a fault of dhcpcd or maybe something is wrong with the ifupdown scripts.

haruue commented 2 years ago

The commit b4d8a349e55d269866bc6ee6e3bb3b03ef293eda also fixes SIGSYS when dhcpcd -U (or dhcpcd --dumplease) with Glibc 2.35 Stacktrace: https://fars.ee/xZGf Seems like the printf(3) may finally calls newfstatat(2), also noted _syscall = 262 in the $_siginfo. Encountered this signal on the Arch Linux, with core/dhcpcd 9.4.1-1 and core/glibc 2.35-5.