Open encbladexp opened 6 years ago
/cc @ssahani
If some .network file sets Bridge=vbr0, then the link goes to the 'configured' state. For example:
# /etc/systemd/network/10-eth0.network
[Match]
Name=eth0
[Network]
Bridge=vbr0
At least currently, it seems that if no .network file specifies vbr0 in Bridge=, then the bridge will never be in the configured state...
So I need to configure an unused, not required, network interface. I don't think this is the way it should work ;)
(Bridges created with brctl don't have this behavior.)
My above comment is just a 'workaround'. I could not find why the behaviors between brctl and networkd are different...
I am trying to figure this out. Up to netdev creation everything is the same, but when the link is brought up via .network we see a difference.
Up. Can it be resolved with https://github.com/systemd/systemd/pull/9956 ?
@MrSorcus I don't think so.
I am also having this problem (version 241). The networkctl state stays at no-carrier (configuring) even though ConfigureWithoutCarrier=yes was used in the .network file. Is there a known workaround?
@Rapsey Please try disabling the IPv6 link-local address, that is, LinkLocalAddressing=no. At least with current git master, it works fine with that setting.
@Rapsey Or, please try PR #12794 if possible. Thank you.
Thank you for the quick reply and fix @yuwata! I have tested your PR on Debian 9.9 and now the link goes into a no-carrier (configured) state.
Curiously, now I can't even "fix" the bridge by recreating it manually. Before this, if I recreated the bridge using ip link del and brctl addbr as suggested in the original post, the bridge would be created and configured but without the NO-CARRIER flag. Now all bridges keep getting NO-CARRIER even when created that way.
EDIT: After downgrading back to systemd 241 the bridges still go into NO-CARRIER even when created manually. So I suspect this was not caused by systemd directly but by updating some dependencies necessary to build from source.
BTW, I cannot reproduce the original issue, no-carrier state, anymore with kernel-5.1.8-200.fc29.x86_64 and current systemd git master. Can I close this issue? @ssahani WDYT?
The original issue happens even if the bridge interface is not managed by networkd.
$ sudo ip link add bridge99 type bridge
$ sudo ip link set bridge99 up
Then, the link will be in the NO-CARRIER state.
It seems to depend on something else. The following was done on a VM with the exact same OS & kernel: Debian 9.9 with kernel 4.9.0-9-amd64 (4.9.168-1+deb9u2). No NO-CARRIER there.
root@anansi:~# ip link add bridge99 type bridge
root@anansi:~# ip link set bridge99 up
root@anansi:~# ip link show bridge99
3: bridge99: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether be:42:10:4a:d7:7a brd ff:ff:ff:ff:ff:ff
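To compare results across machines more easily, the flag list that ip prints can be checked mechanically. The following is only a minimal sketch: link_carrier is a hypothetical helper name (not part of iproute2), and it assumes the one-line output format of ip -o link show as seen in this thread.

```shell
#!/bin/sh
# link_carrier (hypothetical helper): read `ip -o link show DEV` output on
# stdin and print "no-carrier", "carrier", or "unknown" based on the
# flag list between the angle brackets of the first matching line.
link_carrier() {
    flags=$(sed -n 's/^[^<]*<\([^>]*\)>.*/\1/p' | head -n 1)
    case ",$flags," in
        *,NO-CARRIER,*) echo "no-carrier" ;;
        *,LOWER_UP,*)   echo "carrier" ;;
        *)              echo "unknown" ;;
    esac
}

# Usage on a live system (requires iproute2):
#   ip -o link show bridge99 | link_carrier
```

Running this on both the affected host and the VM right after ip link set bridge99 up should make the divergence explicit without reading the flags by eye.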
Logically, I'd expect a bridge with no interfaces to be in NO-CARRIER state. What does it even mean that the bridge has "carrier" when it has no interfaces?
But anyway, with systemd-242-895-g9e93200+ and kernel-5.0.7-300.fc30.x86_64, I do get "carrier" with both manual ip link add and systemd-networkd. We need to figure out what is causing those divergent results before we can fix this.
I'm afraid I won't be able to continue investigating this, but here's what I found. Maybe it can help someone else in the future.
On a clean Debian stretch system, bridges created with ip link did not get the NO-CARRIER state. I then installed the dependencies necessary to build systemd from source. To get the right versions I had to pull 2 packages from testing: util-linux and libmount-dev.
This resulted in the following packages being installed from testing:
libblkid-dev 2.33.1-0.1
libblkid1 2.33.1-0.1
libc-bin 2.28-10
libc-dev-bin 2.28-10
libc-l10n 2.28-10
libc6 2.28-10
libc6-dev 2.28-10
libc6-dev-i386 2.28-10
libc6-dev-x32 2.28-10
libc6-i386 2.28-10
libc6-x32 2.28-10
libcap-ng0 0.7.9-2
libfdisk1 2.33.1-0.1
libncursesw6 6.1+20181013-2
libsmartcols1 2.33.1-0.1
libtinfo6 6.1+20181013-2
libuuid1 2.33.1-0.1
locales 2.28-10
uuid-dev 2.33.1-0.1
After this, all interfaceless bridges created with ip link got the NO-CARRIER state.
@keszybz sometimes you need to create a bridge without interfaces, for example to bind virtual machines or dynamic interfaces to it afterwards.
Anyway, this is not caused by networkd. You can easily confirm this 'issue' even if networkd is stopped.
And @keszybz's comment below makes sense to me.
> Logically, I'd expect a bridge with no interfaces to be in NO-CARRIER state. What does it even mean that the bridge has "carrier" when it has no interfaces?
I'd like to close this. @keszybz and @ssahani WDYT?
I agree with @keszybz.
> @keszybz sometimes you need to create a bridge without interfaces, for example to bind virtual machines or dynamic interfaces to it afterwards.
Yes, I know this. I don't have any issue with a bridge device without enslaved interfaces. "No carrier" means that the interface is there, and we may even configure addresses and routes on it, but if we send packets, they won't reach anyone, and we will not receive any packets either. A bridge without interfaces is exactly like that.
I tried to follow the carrier logic in the kernel, but I couldn't figure it out. I see br_port_carrier_check() and br_device_event(), which seem to take care of changes where devices are added, but I don't see what determines the state before any devices are added. Pointers would be very welcome.
I'd prefer to keep this open until we figure out what is going on here.
I've noticed that a bridge will initially be in the NO-CARRIER state if spanning tree is enabled, until the real interface has completed the listening and learning phases.
If nmcli -g bridge.stp con show br0 shows "yes" then STP is enabled.
From dmesg:
[ 13.426970] bnx2 0000:01:00.0 eno1: NIC Copper Link is Up, 1000 Mbps full duplex
[ 13.426976] , receive & transmit flow control ON
[ 13.427063] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
[ 13.430757] br0: port 1(eno1) entered blocking state
[ 13.430759] br0: port 1(eno1) entered disabled state
[ 13.430809] device eno1 entered promiscuous mode
[ 13.430848] br0: port 1(eno1) entered blocking state
[ 13.430849] br0: port 1(eno1) entered listening state
[ 13.506888] bnx2 0000:01:00.1 eno2: NIC Copper Link is Up, 1000 Mbps full duplex
[ 13.506893] , receive & transmit flow control ON
[ 13.506999] IPv6: ADDRCONF(NETDEV_CHANGE): eno2: link becomes ready
[ 28.628042] br0: port 1(eno1) entered learning state
[ 43.732041] br0: port 1(eno1) entered forwarding state
[ 43.732049] br0: topology change detected, propagating
Before "br0: topology change":
9: br0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
link/ether 78:2b:cb:13:6c:f4 brd ff:ff:ff:ff:ff:ff
And after:
9: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 78:2b:cb:13:6c:f4 brd ff:ff:ff:ff:ff:ff
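On systems without NetworkManager, the STP setting can also be read from the bridge's sysfs attribute stp_state (0 means STP is off, 1 means kernel STP, 2 means user-space STP). A minimal sketch follows; stp_status is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a fake sysfs tree for testing.

```shell
#!/bin/sh
# stp_status BRIDGE [SYSFS_ROOT] (hypothetical helper)
# Prints "disabled", "kernel-stp", "user-stp", or "unknown" for BRIDGE,
# based on SYSFS_ROOT/BRIDGE/bridge/stp_state (default root: /sys/class/net).
stp_status() {
    root=${2:-/sys/class/net}
    file="$root/$1/bridge/stp_state"
    # Not a bridge, or the device does not exist.
    [ -r "$file" ] || { echo "unknown"; return 1; }
    case "$(cat "$file")" in
        0) echo "disabled" ;;
        1) echo "kernel-stp" ;;
        2) echo "user-stp" ;;
        *) echo "unknown" ;;
    esac
}

# Usage on a live system:
#   stp_status br0
```

If this reports anything other than "disabled", a NO-CARRIER period of roughly 30 seconds after adding the first port (listening plus learning, as in the dmesg timestamps above) is expected rather than a networkd problem.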
The preceding comment from @ChetHosey helped me. To get connectivity back, I simply disabled STP and waited a bit.
AFAIK a bridge should never be empty; it is not fully set up if empty. In cases where interfaces are dynamically added to or removed from a bridge, there should be at least one remaining. You can do this with a dummy interface that always belongs to the bridge (libvirt does this with the virbr0-nic dummy interface).
(Would it maybe be useful to have a netdev option "DummyDevice=..." in the bridge section that automatically does that?)
A bridge inherits the MAC address of its first device (or used to inherit? I was not able to reproduce it), when all devices are removed and a new one is added, it would change its MAC. This can lead to issues.
So I don't know if this is a legacy problem that has changed with recent kernels or if there are still good reasons to not have an empty bridge.
@ganguin, see https://github.com/systemd/systemd/issues/9252#issuecomment-502557216 re bridges with no interfaces.
Bridge MAC is generated from the bridge name, and not from any devices that are attached to it. We changed that in systemd 241. See https://www.freedesktop.org/software/systemd/man/systemd.net-naming-scheme.html#v241.
I think I encountered this issue ...
My bridges are in NO CARRIER state.
[root@pm network]# networkctl
IDX LINK         TYPE     OPERATIONAL SETUP
  1 lo           loopback carrier     unmanaged
  2 enp3s0f0     ether    routable    configured
  3 enp3s0f1     ether    off         unmanaged
  4 enp4s0f0     ether    off         unmanaged
  5 enp4s0f1     ether    off         unmanaged
  6 10.20.0.x    bridge   no-carrier  configuring
  7 10.20.25.x   bridge   no-carrier  configuring
  8 tun-hostnet  none     routable    unmanaged
  9 20-0-13      ether    degraded    unmanaged
 10 20-0-14      ether    degraded    unmanaged
 11 20-0-12      ether    degraded    unmanaged
 12 20-25-2      ether    degraded    unmanaged
 13 20-0-15      ether    degraded    unmanaged
Added ConfigureWithoutCarrier=yes for now, but it used to work without that option.
I attach only nspawn containers ...
This issue also caused systemd-networkd-wait-online to time out at boot.
Completely different setup, but the same symptoms. Industrial device, bridging two ports in DSA, using KSZ8563.
I was able to track this down to a single udev file, 99-default.link:
root@host:/lib/systemd/network# cat 99-default.link.bak
# SPDX-License-Identifier: LGPL-2.1+
#
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
[Match]
OriginalName=*
[Link]
NamePolicy=keep kernel database onboard slot path
MACAddressPolicy=persistent
Environment: Linux 5.10, systemd version 244 (commit: 3ceaa81c61b654ebf562464d142675bd4d57d7b6), Yocto Dunfell, custom distro.
Patches applied are listed here: http://cgit.openembedded.org/openembedded-core/tree/meta/recipes-core/systemd/systemd_244.5.bb?h=dunfell#n17
Their content can be found here: http://cgit.openembedded.org/openembedded-core/tree/meta/recipes-core/systemd/systemd?h=dunfell
After further debugging, it's specifically the MACAddressPolicy=persistent line that causes the issue for me.
Similarly, adding Type=!bridge in [Match] made it work. The only issue is that I then do not have a persistent MAC address for my device.
FWIW: After upgrading from Ubuntu 20.04 to 22.04 the same problem occurs. Fixed it with /etc/systemd/network/10-bridges.link:
[Match]
Type=bridge
[Link]
MACAddressPolicy=none
Since @jelmd provided the fix I found for this issue, I am commenting here. Yes, it works! In my use case I was using .link files to rename interfaces to wan and lan by MAC address, and maybe that's why this bridge .link file is required. I made sure to put it lexically before the other interface .link files.
If you want to get a bridge without connected interfaces working under systemd 255+ you can try adding "MACAddress=none" to the [NetDev] section of your .netdev file.
I also have bridges excluded from the 99-default.link so MACAddressPolicy=persistent does not take effect by specifying "Type=!bridge"
From the docs:
MACAddress= Specifies the MAC address to use for the device, or takes the special value "none". When "none", systemd-networkd does not request the MAC address for the device, and the kernel will assign a random MAC address. For "tun", "tap", or "l2tp" devices, the MACAddress= setting in the [NetDev] section is not supported and will be ignored. Please specify it in the [Link] section of the corresponding systemd.network(5) file. If this option is not set, "vlan" device inherits the MAC address of the master interface. For other kind of netdevs, if this option is not set, then the MAC address is generated based on the interface name and the machine-id(5).
Note, even if "none" is specified, systemd-udevd will assign the persistent MAC address for the device, as 99-default.link has MACAddressPolicy=persistent. So, it is also necessary to create a custom .link file for the device, if the MAC address assignment is not desired.
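Putting the two pieces from this thread together (MACAddress=none in the .netdev, plus a .link file that turns off MACAddressPolicy=persistent for bridges), a setup could be generated roughly as below. This is a sketch under assumptions: write_bridge_config is a hypothetical helper, the file names are only examples, and per the comment above MACAddress=none in [NetDev] is expected to need systemd 255 or later.

```shell
#!/bin/sh
# write_bridge_config DIR NAME (hypothetical helper)
# Writes a .netdev for bridge NAME and a bridge-wide .link file into DIR.
write_bridge_config() {
    dir=$1
    name=$2
    mkdir -p "$dir" || return 1

    # .netdev: create the bridge and do not request a MAC address for it.
    cat > "$dir/10-$name.netdev" <<EOF
[NetDev]
Name=$name
Kind=bridge
MACAddress=none
EOF

    # .link: keep udev from applying MACAddressPolicy=persistent to bridges.
    # The 10- prefix sorts it before 99-default.link.
    cat > "$dir/10-bridges.link" <<EOF
[Match]
Type=bridge

[Link]
MACAddressPolicy=none
EOF
}

# Example with a scratch directory; on a real host DIR would be
# /etc/systemd/network and the services would need a restart afterwards:
#   write_bridge_config /tmp/networkd-test vbr0
```

Writing to a scratch directory first makes it possible to review the generated files before copying them into /etc/systemd/network.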
systemd version the issue has been seen with
Used distribution
Expected behaviour you didn't see
Unexpected behaviour you saw
Steps to reproduce the problem
10-vbr0.netdev:
10-vbr0.network
If I delete vbr0 with ip link delete vbr0 and create it with brctl addbr vbr0, systemd-networkd configures the interface with settings from the .network file.
Here is the output of ip link show:
Any ideas?