thom311 / libnl

Netlink Library Suite
GNU Lesser General Public License v2.1
419 stars 311 forks source link

bridge object in link cache may get replaced by an incomplete object #377

Open KanjiMonster opened 4 months ago

KanjiMonster commented 4 months ago

I observed the link cache sometimes returning an incomplete object of family AF_BRIDGE, but with no link type set (and thus any rtnl_linkbridge* call triggering an assert).

This happens shortly after creation and attaching a port to a bridge.

Running with NLCB=debug, the following messages can be seen:

complete bridge object ("good" kernel message) => link cache returns a full object

-- Debug: Received Message: 
--------------------------   BEGIN NETLINK MESSAGE ---------------------------
  [NETLINK HEADER] 16 octets
    .nlmsg_len = 1864 
    .type = 16 <route/link::new>
    .flags = 0 <>
    .seq = 0 
    .port = 0
  [PAYLOAD] 16 octets 
    00 00 01 00 40 00 00 00 03 10 00 00 00 00 00 00 ....@...........
  [ATTR 03] 9 octets
    73 77 62 72 69 64 67 65 00                      swbridge.
  [PADDING] 3 octets
    00 00 00                                        ...
  [ATTR 13] 4 octets
    e8 03 00 00                                     ....
  [ATTR 16] 1 octets
    02                                              .
  [PADDING] 3 octets
    00 00 00                                        ...
  [ATTR 17] 1 octets
    00                                              .
  [PADDING] 3 octets
    00 00 00                                        ...
  [ATTR 04] 4 octets
    dc 05 00 00                                     ....
  [ATTR 50] 4 octets
    44 00 00 00                                     D...
  [ATTR 51] 4 octets
    ff ff 00 00                                     ....
  [ATTR 27] 4 octets
    00 00 00 00                                     ....
  [ATTR 30] 4 octets
    00 00 00 00                                     ....
  [ATTR 61] 4 octets
    00 00 00 00                                     ....
  [ATTR 31] 4 octets
    01 00 00 00                                     ....
  [ATTR 40] 4 octets
    ff ff 00 00                                     ....
  [ATTR 41] 4 octets
    00 00 01 00                                     ....
  [ATTR 58] 4 octets
    00 00 01 00                                     ....
  [ATTR 63] 4 octets
    00 00 01 00                                     ....
  [ATTR 64] 4 octets
    00 00 01 00                                     ....
  [ATTR 59] 4 octets
    00 00 01 00                                     ....
  [ATTR 60] 4 octets
    ff ff 00 00                                     ....
  [ATTR 32] 4 octets
    01 00 00 00                                     ....
  [ATTR 33] 1 octets
    00                                              .
  [PADDING] 3 octets
    00 00 00                                        ...
  [ATTR 06] 8 octets
    6e 6f 71 75 65 75 65 00                         noqueue.
  [ATTR 35] 4 octets
    01 00 00 00                                     ....
  [ATTR 47] 4 octets
    00 00 00 00                                     ....
  [ATTR 48] 4 octets
    01 00 00 00                                     ....
  [ATTR 39] 1 octets
    00                                              .
  [PADDING] 3 octets
    00 00 00                                        ...
  [ATTR 44] 4 octets
    02 00 00 00                                     ....
  [ATTR 14] 32 octets 
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  [ATTR 01] 6 octets
    da 56 2f bc 9f 33                               .V/..3
  [PADDING] 2 octets
    00 00                                           ..
  [ATTR 02] 6 octets
    ff ff ff ff ff ff                               ......
  [PADDING] 2 octets
    00 00                                           ..
  [ATTR 23] 200 octets
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00                         ........
  [ATTR 07] 96 octets
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  [ATTR 43] 8 octets
    05 00 02 00 00 00 00 00                         ........ 
  [ATTR 18] 424 octets
    0b 00 01 00 62 72 69 64 67 65 00 00 9c 01 02 00 ....bridge......
    0c 00 10 00 c5 00 00 00 00 00 00 00 0c 00 11 00 ................
    00 00 00 00 00 00 00 00 0c 00 12 00 00 00 00 00 ................
    00 00 00 00 0c 00 13 00 07 00 00 00 00 00 00 00 ................
    08 00 01 00 dc 05 00 00 08 00 02 00 c8 00 00 00 ................
    08 00 03 00 d0 07 00 00 08 00 04 00 30 75 00 00 ............0u..
    08 00 05 00 01 00 00 00 06 00 06 00 00 80 00 00 ................
    05 00 07 00 01 00 00 00 06 00 09 00 00 00 00 00 ................
    0c 00 0b 00 80 00 da 56 2f bc 9f 33 0c 00 0a 00 .......V/..3....
    80 00 da 56 2f bc 9f 33 06 00 0c 00 00 00 00 00 ...V/..3........
    08 00 0d 00 00 00 00 00 05 00 0e 00 00 00 00 00 ................
    05 00 0f 00 00 00 00 00 0a 00 14 00 01 80 c2 00 ................
    00 00 00 00 0c 00 2e 00 00 00 00 00 07 00 00 00 ................
    06 00 08 00 81 00 00 00 06 00 27 00 00 00 00 00 ..........'.....
    05 00 29 00 00 00 00 00 05 00 2d 00 00 00 00 00 ..).......-.....
    05 00 16 00 01 00 00 00 05 00 17 00 01 00 00 00 ................
    05 00 18 00 00 00 00 00 05 00 19 00 00 00 00 00 ................
    05 00 2a 00 00 00 00 00 08 00 1a 00 10 00 00 00 ..*.............
    08 00 1b 00 00 10 00 00 08 00 1c 00 02 00 00 00 ................
    08 00 1d 00 02 00 00 00 05 00 2b 00 02 00 00 00 ..........+.....
    05 00 2c 00 01 00 00 00 0c 00 1e 00 64 00 00 00 ..,.........d...
    00 00 00 00 0c 00 1f 00 90 65 00 00 00 00 00 00 .........e......
    0c 00 20 00 9c 63 00 00 00 00 00 00 0c 00 21 00 .. ..c........!.
    d4 30 00 00 00 00 00 00 0c 00 22 00 e8 03 00 00 .0........".....
    00 00 00 00 0c 00 23 00 35 0c 00 00 00 00 00 00 ......#.5.......
    05 00 24 00 00 00 00 00 05 00 25 00 00 00 00 00 ..$.......%.....
    05 00 26 00 00 00 00 00                         ..&.....
  [ATTR 26] 800 octets
    8c 00 02 00 88 00 01 00 01 00 00 00 00 00 00 00 ................
    00 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 ................
    01 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    10 27 00 00 e8 03 00 00 00 00 00 00 00 00 00 00 .'..............
    00 00 00 00 00 00 00 00 01 00 00 00 94 02 0a 00 ................
    08 00 01 00 00 00 00 00 14 00 05 00 ff ff 00 00 ................
    23 1c 00 00 5c 83 00 00 e8 03 00 00 f0 00 02 00 #...\...........
    01 00 00 00 40 00 00 00 dc 05 00 00 01 00 00 00 ....@...........
    01 00 00 00 01 00 00 00 01 00 00 00 ff ff ff ff ................
    a0 0f 00 00 e8 03 00 00 00 00 00 00 80 3a 09 00 .............:..
    80 51 01 00 03 00 00 00 58 02 00 00 10 00 00 00 .Q......X.......
    00 00 00 00 01 00 00 00 01 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
    00 00 00 00 00 00 00 00 10 27 00 00 e8 03 00 00 .........'......
    01 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
    00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 80 ee 36 00 ..............6.
    00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 04 00 00 00 00 00 00 ff ff 00 00 ff ff ff ff ................
    01 00 00 00 00 00 00 00 00 00 00 00 2c 01 03 00 ............,...
    25 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 %...............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 3c 00 06 00 07 00 00 00 ........<.......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 14 00 07 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 05 00 08 00 00 00 00 00 ................
  [ATTR 62 NESTED] 0 octets
---------------------------  END NETLINK MESSAGE   ---------------------------

but shortly after that the kernel sends a short message with incomplete bridge state:

-- Debug: Received Message:
--------------------------   BEGIN NETLINK MESSAGE ---------------------------  
  [NETLINK HEADER] 16 octets
    .nlmsg_len = 96
    .type = 16 <route/link::new>
    .flags = 0 <>
    .seq = 0
    .port = 0
  [PAYLOAD] 16 octets
    07 00 01 00 40 00 00 00 03 10 00 00 00 00 00 00 ....@...........
  [ATTR 03] 9 octets
    73 77 62 72 69 64 67 65 00                      swbridge.
  [PADDING] 3 octets
    00 00 00                                        ...
  [ATTR 10] 4 octets
    40 00 00 00                                     @...
  [ATTR 04] 4 octets
    dc 05 00 00                                     ....
  [ATTR 16] 1 octets
    02                                              .  
  [PADDING] 3 octets
    00 00 00                                        ...
  [ATTR 01] 6 octets
    da 56 2f bc 9f 33                               .V/..3
  [PADDING] 2 octets
    00 00                                           .. 
  [ATTR 26] 8 octets
    08 00 02 00 00 00 01 00                         ........ 
---------------------------  END NETLINK MESSAGE   ---------------------------

Note how [ATTR 26] is basically empty, and l_family is set to 07 / AF_BRIDGE.

While the first message has l_family set 00 / AF_UNSPEC, libnl changes this to 07 / AF_BRIDGE based on the IFLA_INFO_KIND "bridge" in [ATTR 18].

This then causes the previous full bridge object in the link cache to be replaced with an empty one based on the short message.

Not sure what the correct course of action here is. Ignore the short message? Update the bridge object instead of replacing it? If we wouldn't replace the l_family based on the value in IFLA_INFO_KIND, both objects would co-exist and there would be no issue, but likely not doing that would now break the API and users.

FWIW, this is libnl 3.8 with kernel 6.6, though the behavior also exists in 6.1.

KanjiMonster commented 4 months ago

Grepping through the code, I guess this message comes from rtnl_bridge_notify(), which only sends a minimal message:

static inline size_t bridge_nlmsg_size(void)
{
    return NLMSG_ALIGN(sizeof(struct ifinfomsg))
        + nla_total_size(IFNAMSIZ)  /* IFLA_IFNAME */
        + nla_total_size(MAX_ADDR_LEN)  /* IFLA_ADDRESS */
        + nla_total_size(sizeof(u32))   /* IFLA_MASTER */
        + nla_total_size(sizeof(u32))   /* IFLA_MTU */
        + nla_total_size(sizeof(u32))   /* IFLA_LINK */
        + nla_total_size(sizeof(u32))   /* IFLA_OPERSTATE */
        + nla_total_size(sizeof(u8))    /* IFLA_PROTINFO */
        + nla_total_size(sizeof(struct nlattr)) /* IFLA_AF_SPEC */
        + nla_total_size(sizeof(u16))   /* IFLA_BRIDGE_FLAGS */
        + nla_total_size(sizeof(u16));  /* IFLA_BRIDGE_MODE */
}

I guess the proper solution would be to have a oo_update() that can merge this message into the existing object?

KanjiMonster commented 4 months ago

Grepping through the code, I guess this message comes from rtnl_bridge_notify(), which only sends a minimal message:

static inline size_t bridge_nlmsg_size(void)
{
  return NLMSG_ALIGN(sizeof(struct ifinfomsg))
      + nla_total_size(IFNAMSIZ)  /* IFLA_IFNAME */
      + nla_total_size(MAX_ADDR_LEN)  /* IFLA_ADDRESS */
      + nla_total_size(sizeof(u32))   /* IFLA_MASTER */
      + nla_total_size(sizeof(u32))   /* IFLA_MTU */
      + nla_total_size(sizeof(u32))   /* IFLA_LINK */
      + nla_total_size(sizeof(u32))   /* IFLA_OPERSTATE */
      + nla_total_size(sizeof(u8))    /* IFLA_PROTINFO */
      + nla_total_size(sizeof(struct nlattr)) /* IFLA_AF_SPEC */
      + nla_total_size(sizeof(u16))   /* IFLA_BRIDGE_FLAGS */
      + nla_total_size(sizeof(u16));  /* IFLA_BRIDGE_MODE */
}

Scratch that, looking at [ATTR 26] (= IFLA_AF_SPEC), it contains a [ATTR 08] (= IFLA_BRIDGE_VLAN_INFO).

This implies this comes from br_info_notify():

void br_ifinfo_notify(int event, const struct net_bridge *br,
              const struct net_bridge_port *port)
{
    u32 filter = RTEXT_FILTER_BRVLAN_COMPRESSED;

    return br_info_notify(event, br, port, filter);
}

which sends an update with just the VLAN info filled. And no IFLA_LINKINFO, but ifi_family set to AF_BRIDGE.

thom311 commented 4 months ago

I guess the proper solution would be to have a oo_update() that can merge this message into the existing object?

That sounds right.