FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

NHRP Resolution Request has duplicate NHRP Authentication Extension field #16466

Closed aapostoliuk closed 1 month ago

aapostoliuk commented 1 month ago

Description

After the fix in https://github.com/FRRouting/frr/pull/16422, the hub sends NHRP Resolution Request packets that contain a duplicate NHRP Authentication Extension. One of the two extensions has Extension Length = 0. The Cisco router (DMVPN spoke) rejects such a packet: it logs the errors below and sends an Error Indication packet.

NHRP-ERROR: Incorrect Auth extn length 0
NHRP: Sending error indication. Reason: 'Pak sanity failure' LINE: 9877

The network, together with packet dumps and debug output, is described in the closed bug report https://github.com/FRRouting/frr/issues/16371

This issue occurs only when NHRP authentication is configured.
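To make the failure condition concrete, here is a minimal sketch of the kind of sanity check a receiver could run over the extensions part of an NHRP packet. It is not FRR or Cisco code. It assumes the RFC 2332 extension layout (a 16-bit field holding the Compulsory bit, an unused bit, and a 14-bit type, followed by a 16-bit length and the value). The function names, the sample packet, and the ordering of the extensions in it are hypothetical; the 4-byte header in front of the cleartext password is an assumption inferred from the `ext_len : 11` seen in the Cisco debug for the password `test123`.

```python
import struct

NHRP_EXT_END = 0             # End of Extensions
NHRP_EXT_AUTHENTICATION = 7  # Authentication Extension


def iter_extensions(buf):
    """Yield (compulsory, ext_type, value) for each extension TLV in buf."""
    offset = 0
    while offset + 4 <= len(buf):
        type_field, length = struct.unpack_from(">HH", buf, offset)
        offset += 4
        compulsory = bool(type_field & 0x8000)  # top bit is the C flag
        ext_type = type_field & 0x3FFF          # low 14 bits are the type
        value = buf[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated extension")
        offset += length
        yield compulsory, ext_type, value
        if ext_type == NHRP_EXT_END:
            break


def check_auth_extensions(buf):
    """Return a list of problems found with Authentication Extensions."""
    problems = []
    auth_seen = 0
    for _compulsory, ext_type, value in iter_extensions(buf):
        if ext_type != NHRP_EXT_AUTHENTICATION:
            continue
        auth_seen += 1
        if len(value) == 0:
            problems.append("authentication extension with length 0")
        if auth_seen == 2:  # report the duplicate only once
            problems.append("duplicate authentication extension")
    return problems


def build_ext(type_field, value=b""):
    """Hypothetical helper: encode one extension TLV."""
    return struct.pack(">HH", type_field, len(value)) + value


# Illustrative packet: the type/length pairs mirror the Cisco NHRP-ATTR
# lines (32771/0, 32772/20, 32773/0, 32775/0); the second, well-formed
# authentication extension and the placeholder payloads are assumptions.
auth_value = struct.pack(">HH", 0, 1) + b"test123"  # 4 + 7 = 11 bytes
sample = b"".join([
    build_ext(0x8003),              # Responder Address, empty
    build_ext(0x8004, bytes(20)),   # Forward Transit NHS Record, placeholder
    build_ext(0x8005),              # Reverse Transit NHS Record, empty
    build_ext(0x8007),              # Authentication, length 0 (the bug)
    build_ext(0x8007, auth_value),  # Authentication, length 11
    build_ext(0x8000),              # End of Extensions
])

for problem in check_auth_extensions(sample):
    print(problem)
```

The decimal `ext_type` values in the Cisco debug include the Compulsory bit, so 32775 is 0x8007 (type 7, Authentication) and 32768 is 0x8000 (End of Extensions).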

Version

vyos# show ver
FRRouting 10.2-dev (vyos) on Linux(6.6.36-amd64-vyos).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--sbindir=/usr/lib/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/x86_64-linux-gnu/frr' '--with-moduledir=/usr/lib/x86_64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--enable-rpki' '--enable-scripting' '--enable-pim6d' '--disable-grpc' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'build_alias=x86_64-linux-gnu' 'PYTHON=python3'

How to reproduce

Use the network described in the closed bug report https://github.com/FRRouting/frr/issues/16371, which includes packet dumps and debug output.

Expected behavior

Spokes should be able to communicate with each other directly.

Actual behavior

Spokes cannot send packets to each other directly.

FRR DMVPN HUB debug

2024-07-22 09:21:05.096 [DEBG] nhrpd: [K0534-5VD2M] PACKET: Recv 192.168.100.111 -> 192.168.100.100
2024-07-22 09:21:05.096 [DEBG] nhrpd: [PTQ80-8JY6C] Recv Resolution-Request(1) 10.0.0.11 -> 10.0.103.2
2024-07-22 09:21:05.096 [DEBG] nhrpd: [RHB3H-QNGNH] Processing Authentication Extension for (test123:test123|0)
2024-07-22 09:21:05.096 [DEBG] nhrpd: [KNPB6-NP2Y4] lookup 10.0.103.2/32: zebra route dev tun100
2024-07-22 09:21:05.096 [DEBG] nhrpd: [GVZF0-990Z5] lookup 10.0.0.13/32: nhrp_if=tun100
2024-07-22 09:21:05.096 [DEBG] nhrpd: [M78NA-AFP11] Processing NHRP_EXTENSION_NAT_ADDRESS while forwarding the request packet
2024-07-22 09:21:05.096 [DEBG] nhrpd: [RFX78-JMH2T] Proto is 10.0.0.11
2024-07-22 09:21:05.096 [DEBG] nhrpd: [MFKFP-TR5FR] c->cur.remote_nbma_natoa is (unspec)
2024-07-22 09:21:05.096 [DEBG] nhrpd: [PTQ80-8JY6C] Send Resolution-Request(1) 10.0.0.11 -> 10.0.103.2
2024-07-22 09:21:05.096 [DEBG] nhrpd: [WSA6E-5GM0H] PACKET: Send 192.168.100.100 -> 192.168.100.13
2024-07-22 09:21:05.103 [DEBG] nhrpd: [K0534-5VD2M] PACKET: Recv 192.168.100.13 -> 192.168.100.100
2024-07-22 09:21:05.103 [DEBG] nhrpd: [PTQ80-8JY6C] Recv Error-Indication(7) 10.0.0.13 -> 10.0.0.11
2024-07-22 09:21:05.103 [DEBG] nhrpd: [PTQ80-8JY6C] Send Error-Indication(7) 10.0.0.11 -> 10.0.0.13
2024-07-22 09:21:05.103 [DEBG] nhrpd: [WSA6E-5GM0H] PACKET: Send 192.168.100.100 -> 192.168.100.13
2024-07-22 09:21:05.103 [INFO] nhrpd: [PRQ0A-R3YY1] From 192.168.100.13: error: authentication failure
2024-07-22 09:21:06.549 [DEBG] nhrpd: [TPNQ6-77EJG] Netlink-mcast-log: Received msg_type 1024, msg_flags 0
2024-07-22 09:21:06.549 [DEBG] nhrpd: [JT71Y-7VYHQ] Intercepted multicast packet leaving tun100 len 72
2024-07-22 09:21:06.549 [DEBG] nhrpd: [PKEHV-MNXHK] Multicast Packet: 192.168.100.100 -> 192.168.100.13, ret = 72, size = 72, addrlen = 4
2024-07-22 09:21:06.549 [DEBG] nhrpd: [PKEHV-MNXHK] Multicast Packet: 192.168.100.100 -> 192.168.100.111, ret = 72, size = 72, addrlen = 4
2024-07-22 09:21:07.317 [DEBG] nhrpd: [QQ0NK-1H449] Netlink: who-has 10.0.0.13 dev tun100 lladdr 192.168.100.13 nud 0x10 cache used 0 type 4
2024-07-22 09:21:07.317 [DEBG] nhrpd: [QVXNM-NVHEQ] Netlink: update binding for 10.0.0.13 dev tun100 from c (unspec) peer.vc.nbma 192.168.100.13 to lladdr 192.168.100.13
2024-07-22 09:21:07.317 [DEBG] nhrpd: [QQ0NK-1H449] Netlink: new-neigh 10.0.0.13 dev tun100 lladdr 192.168.100.13 nud 0x10 cache used 1 type 4
2024-07-22 09:21:07.317 [DEBG] nhrpd: [QQ0NK-1H449] Netlink: who-has 10.0.0.11 dev tun100 lladdr 192.168.100.111 nud 0x10 cache used 0 type 4
2024-07-22 09:21:07.317 [DEBG] nhrpd: [QVXNM-NVHEQ] Netlink: update binding for 10.0.0.11 dev tun100 from c (unspec) peer.vc.nbma 192.168.100.111 to lladdr 192.168.100.111
2024-07-22 09:21:07.317 [DEBG] nhrpd: [QQ0NK-1H449] Netlink: new-neigh 10.0.0.11 dev tun100 lladdr 192.168.100.111 nud 0x10 cache used 1 type 4

CISCO DMVPN SPOKE debug

*Jul 22 09:36:18.893: NHRP-ATTR: ext_type: 32775, ext_len : 11
*Jul 22 09:36:18.893: NHRP-ATTR: ext_type: 32768, ext_len : 0
*Jul 22 09:36:18.894: NHRP: Receive Traffic Indication via Tunnel100 vrf global(0x0), packet size: 143
*Jul 22 09:36:18.894:  (F) afn: AF_IP(1), type: IP(800), hop: 1, ver: 1
*Jul 22 09:36:18.895:      shtl: 4(NSAP), sstl: 0(NSAP)
*Jul 22 09:36:18.895:      pktsz: 143 extoff: 124
*Jul 22 09:36:18.895:  (M) traffic code: redirect(0)
*Jul 22 09:36:18.895:      src NBMA: 192.168.100.100
*Jul 22 09:36:18.895:      src protocol: 10.0.0.1, dst protocol: 10.0.101.2
*Jul 22 09:36:18.896:      Contents of nhrp traffic indication packet:
*Jul 22 09:36:18.896:         45 00 00 54 28 15 00 00 3E 01 74 90 0A 00 65 02
*Jul 22 09:36:18.896:         0A 00 67 02 00 00 12 E3 15 28 00 01 08 09 0A 0B
*Jul 22 09:36:18.896:         0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B
*Jul 22 09:36:18.896:         1C 1D 1E 1F 20 21 22 23 24 25 26 27 28 29 2A 2B
*Jul 22 09:36:18.897:         2C 2D 2E 2F 30 31 32 33 34 35 36 37 38 39 3A 3B
*Jul 22 09:36:18.897:         3C 3D 3E
*Jul 22 09:36:18.897: Authentication Extension(7):
*Jul 22 09:36:18.897:   type:Cleartext(1), data:test123
*Jul 22 09:36:18.897: NHRP-DETAIL: netid_in = 1, to_us = 0
*Jul 22 09:36:18.898: NHRP: nhrp_rtlookup yielded GigabitEthernet0/1
SPOKE-101#
*Jul 22 09:36:18.898: NHRP-DETAIL: netid_out 0, netid_in 1
*Jul 22 09:36:18.898: NHRP: Parsing NHRP Traffic Indication
*Jul 22 09:36:18.899: NHRP: Enqueued NHRP Resolution Request for destination: 10.0.103.2
*Jul 22 09:36:18.899: NHRP: Checking for delayed event NULL/10.0.103.2 on list (Tunnel100 vrf: global(0x0))
*Jul 22 09:36:18.899: NHRP: No delayed event node found.
SPOKE-101#
*Jul 22 09:36:22.772: NHRP: Checking for delayed event NULL/10.0.103.2 on list (Tunnel100 vrf: global(0x0))
*Jul 22 09:36:22.772: NHRP: No delayed event node found.
*Jul 22 09:36:22.772: NHRP: There is no VPE Extension to construct for the request
*Jul 22 09:36:22.773: NHRP: Sending NHRP Resolution Request for dest: 10.0.103.2 to nexthop: 10.0.103.2 using our src: 10.0.0.11 vrf:global(0x0)
*Jul 22 09:36:22.773: NHRP: Attempting to send packet through interface Tunnel100 via DEST  dst 10.0.103.2
*Jul 22 09:36:22.774: NHRP: Send Resolution Request via Tunnel100 vrf global(0x0), packet size: 87
*Jul 22 09:36:22.774:  src: 10.0.0.11, dst: 10.0.103.2
*Jul 22 09:36:22.775:  (F) afn: AF_IP(1), type: IP(800), hop: 255, ver: 1
*Jul 22 09:36:22.775:      shtl: 4(NSAP), sstl: 0(NSAP)
*Jul 22 09:36:22.775:      pktsz: 87 extoff: 52
*Jul 22 09:36:22.775:  (M) flags: "router auth src-stable nat ", reqid: 7
*Jul 22 09:36:22.775:      src NBMA: 192.168.100.111
*Jul 22 09:36:22.775:      src protocol: 10.0.0.11, dst protocol: 10.0.103.2
*Jul 22 09:36:22.776:  (C-1) code: no error(0)
*Jul 22 09:36:22.776:        prefix: 32, mtu: 17912, hd_time: 450
*Jul 22 09:36:22.776:        addr_len: 0(NSAP), subaddr_len: 0(NSAP), proto_len: 0, pref: 255
*Jul 22 09:36:22.776: Responder Address Extension(3):
SPOKE-101#
*Jul 22 09:36:22.777: Forward Transit NHS Record Extension(4):
*Jul 22 09:36:22.777: Reverse Transit NHS Record Extension(5):
*Jul 22 09:36:22.777: Authentication Extension(7):
*Jul 22 09:36:22.777:   type:Cleartext(1), data:test123
*Jul 22 09:36:22.777: NAT address Extension(9):
*Jul 22 09:36:22.777: NHRP-DETAIL: Unable to get dst from pak sb
*Jul 22 09:36:22.778: NHRP: Encapsulation succeeded.  Sending NHRP Control Packet  NBMA Address: 192.168.100.100
*Jul 22 09:36:22.778: NHRP: 115 bytes out Tunnel100
*Jul 22 09:36:22.957: NHRP-ATTR: ext_type: 32771, ext_len : 0
*Jul 22 09:36:22.957: NHRP-ATTR: ext_type: 32772, ext_len : 20
*Jul 22 09:36:22.957: NHRP-ATTR: ext_type: 32773, ext_len : 0
*Jul 22 09:36:22.958: NHRP-ATTR: ext_type: 32775, ext_len : 0
*Jul 22 09:36:22.958: NHRP-ERROR: Incorrect Auth extn length 0
*Jul 22 09:36:22.959: NHRP: Sending error indication. Reason: 'Pak sanity failure' LINE: 9877
*Jul 22 09:36:22.959: NHRP: Attempting to send packet through interface Tunnel100 via DEST  dst 10.0.0.13
*Jul 22 09:36:22.960: NHRP: Send Error Indication via Tunnel100 vrf global(0x0), packet size: 151
*Jul 22 09:36:22.960:  src: 10.0.0.11, dst: 10.0.0.13
*Jul 22 09:36:22.961:  (F) afn: AF_IP(1), type: IP(800), hop: 255, ver: 1
*Jul 22 09:36:22.961:      shtl: 4(NSAP), sstl: 0(NSAP)
*Jul 22 09:36:22.961:      pktsz: 151 extoff: 0
*Jul 22 09:36:22.961:  (M) error code: protocol generic error(7), offset: 84
*Jul 22 09:36:22.961:      src NBMA: 192.168.100.111
*Jul 22 09:36:22.962:      src protocol: 10.0.0.11, dst protocol: 10.0.0.13
*Jul 22 09:36:22.962:      Contents of error packet:
*Jul 22 09:36:22.962:         00 01 08 00 00 00 00 00 00 FE 00 6F 39 07 00 34
*Jul 22 09:36:22.963:         01 01 04 00 04 04 C8 02 00 00 00 0E C0 A8 64 0D
*Jul 22 09:36:22.963:         0A 00 00 0D 0A 00 65 02
*Jul 22 09:36:22.963:
*Jul 22 09:36:22.963:
*Jul 22 09:36:22.963: NHRP-DETAIL: Unable to get dst from pak sb
SPOKE-101#
*Jul 22 09:36:22.963: NHRP: Encapsulation succeeded.  Sending NHRP Control Packet  NBMA Address: 192.168.100.100
*Jul 22 09:36:22.964: NHRP: 179 bytes out Tunnel100
*Jul 22 09:36:22.977: NHRP-ERROR: Packet Recved with 0 Hop counts on Tunnel100.
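
The 40 bytes shown under "Contents of error packet" in the Cisco debug above can be decoded to see which request the spoke rejected. The following is a rough sketch, assuming the fixed header and the mandatory part follow the layout in RFC 2332 as I read it; the function name and the dictionary keys are made up for illustration. The log shows only the start of the offending packet, so the extensions themselves cannot be recovered from it.

```python
import struct
from ipaddress import IPv4Address

# Bytes copied from the "Contents of error packet" lines of the Cisco debug.
HEXDUMP = """
00 01 08 00 00 00 00 00 00 FE 00 6F 39 07 00 34
01 01 04 00 04 04 C8 02 00 00 00 0E C0 A8 64 0D
0A 00 00 0D 0A 00 65 02
"""


def parse_nhrp_request(raw):
    """Decode the fixed header and mandatory part of an NHRP request."""
    # Fixed header (20 bytes): afn, protocol type, SNAP, hop count,
    # packet size, checksum, extension offset, version, op type, shtl, sstl.
    (afn, pro_type, _snap, hop_count, packet_size, checksum,
     ext_offset, version, op_type, shtl, sstl) = struct.unpack_from(
        ">HH5sBHHHBBBB", raw, 0)
    offset = 20

    # Mandatory part header: protocol address lengths, flags, request ID.
    src_proto_len, dst_proto_len, flags, request_id = struct.unpack_from(
        ">BBHI", raw, offset)
    offset += 8

    nbma_len = shtl & 0x3F      # low 6 bits carry the address length
    subaddr_len = sstl & 0x3F
    src_nbma = raw[offset:offset + nbma_len]
    offset += nbma_len + subaddr_len
    src_proto = raw[offset:offset + src_proto_len]
    offset += src_proto_len
    dst_proto = raw[offset:offset + dst_proto_len]

    return {
        "afn": afn,
        "protocol_type": pro_type,
        "hop_count": hop_count,
        "packet_size": packet_size,
        "checksum": checksum,
        "ext_offset": ext_offset,
        "version": version,
        "op_type": op_type,  # 1 = Resolution Request
        "flags": flags,
        "request_id": request_id,
        # This sketch assumes IPv4 (4-byte) addresses, as in the dump.
        "src_nbma": str(IPv4Address(src_nbma)),
        "src_proto": str(IPv4Address(src_proto)),
        "dst_proto": str(IPv4Address(dst_proto)),
    }


raw = bytes.fromhex("".join(HEXDUMP.split()))
fields = parse_nhrp_request(raw)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Under these assumptions the bytes decode to a Resolution Request (op type 1) with source protocol address 10.0.0.13, source NBMA address 192.168.100.13, destination protocol address 10.0.101.2, and an extension offset of 52. The source address matches the destination of the Error Indication that the spoke sends.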

tcpdump NHRP_RESOLUTION_REQUEST2.dmp

Additional context

No response

aapostoliuk commented 1 month ago

@dleroy @volodymyrhuti can you look at it?

volodymyrhuti commented 1 month ago

I will probably have some time on weekends. It doesn't look complicated, but I'm unsure if I still have a good GNS topology.

dleroy commented 1 month ago

I reproduced this yesterday and have a fix. Plan on pushing the fix today.

donaldsharp commented 1 month ago

problem is fixed