FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/
Other
3.21k stars 1.24k forks source link

MSDP weird state #11021

Closed ionb42 closed 1 year ago

ionb42 commented 2 years ago

I ran into a weird issue with MSDP where frr is dropping an incoming connection from a configured MSDP peer, claiming it's not actually configured. This is likely difficult to reproduce and will probably go away on an frr restart (planned for this evening), but I figured people more familiar with the code might be able to spot something in the msdp code anyway. Frr is version 8.1, but a quick perusing of the git commit log shows nothing that might be potentially relevant to this issue in newer releases.

How it ended up in this state: I'm not totally sure. This particular peer was added and removed a few times, that's just about the only thing I can think of. Initially it worked, but then the connection stopped getting established. It's possible that at some point the peer was added outside the vrf block, by mistake, but it was removed afterwards. When I enabled msdp events debugging, this is what I got in the log, repeated every 30 seconds.

2022/04/12 16:10:58 PIM: [J2HKR-CMRSJ][EC 184549377] msdp peer connection refused from 10.158.1.41

Removing and re-adding the peer shows up fine in the log, but frr continues to reject the incoming connection from it:

2022/04/12 09:35:37 PIM: [QTG9T-XM2SZ] MSDP peer 10.158.1.41 created
2022/04/12 09:35:37 PIM: [XNNKB-TTS04] MSDP peer 10.158.1.41 state chg to inactive
2022/04/12 09:35:37 PIM: [XNNKB-TTS04] MSDP peer 10.158.1.41 state chg to listen
2022/04/12 09:35:37 PIM: [XNNKB-TTS04] MSDP peer 10.158.1.41 state chg to inactive
2022/04/12 09:35:37 PIM: [XNNKB-TTS04] MSDP peer 10.158.1.41 state chg to listen
2022/04/12 09:35:45 PIM: [J2HKR-CMRSJ][EC 184549377] msdp peer connection refused from 10.158.1.41
2022/04/12 09:36:15 PIM: [J2HKR-CMRSJ][EC 184549377] msdp peer connection refused from 10.158.1.41

This is the state of the msdp peer as shown by frr:

# sh ip msdp vrf mconvrf peer 10.158.1.41
Peer : 10.158.1.41
  Local               : 10.158.1.42
  State               : listen
  Uptime              : -
  Keepalive Timer     : --:--:--
  Conn Retry Timer    : --:--:--
  Hold Timer          : --:--:--
  Last Reset          : -
  Conn Attempts       : 0
  Established Changes : 0
  SA Count            : 0
  Statistics          :
                       Sent       Rcvd
    Keepalives :          0          0
    SAs        :          0          0

There is nothing in the default vrf, either in the running config (see below) or in frr output:

# sh ip msdp peer
Peer                       Local        State    Uptime   SaCnt

The other 3 MSDP peers are working fine.

# sh ip msdp vrf mconvrf peer
Peer                       Local        State    Uptime   SaCnt
10.24.51.221        10.24.51.222  established  1294:33:48    5943
10.24.51.225        10.24.51.226  established  1294:33:24       0
10.158.1.5            10.158.1.6  established  664:31:13      47
10.158.1.41          10.158.1.42       listen         -       0

The relevant frr config fragment is:

frr version 8.1
frr defaults traditional
hostname XXXX
log file /var/log/frr/frr.log
ip multicast rpf-lookup-mode mrib-then-urib
service integrated-vtysh-config
!
debug msdp events
!
vrf mconvrf
 ip msdp peer 10.24.51.221 source 10.24.51.222
 ip msdp peer 10.24.51.225 source 10.24.51.226
 ip msdp peer 10.158.1.5 source 10.158.1.6
 ip msdp peer 10.158.1.41 source 10.158.1.42
 ip pim rp 10.24.51.3 224.0.0.0/4
 ip pim rp 91.215.164.128 233.154.74.0/32
 ip pim rp 91.215.164.128 233.154.74.8/32
 ip pim rp 74.115.128.164 224.0.131.132/32
 ip pim rp 66.54.251.236 224.0.32.0/24
 ip pim rp 205.209.215.254 224.0.31.0/24
exit-vrf

There are BGP peerings with the same peers, all of which are working:

# sh ip bgp vrf mconvrf sum

IPv4 Unicast Summary (VRF mconvrf):
BGP router identifier 10.24.51.228, local AS number 65315 vrf-id 6
BGP table version 320
RIB entries 34, using 6256 bytes of memory
Peers 4, using 2892 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
10.24.51.1      4      65315     77765     77792        0    0    0 07w4d22h           12        7 
10.24.51.2      4      65315     77778     77799        0    0    0 07w4d22h           12        7 
10.158.1.5      4      64583     77680     77767        0    0    0 03w6d16h            1       11 
10.158.1.41     4      65329     77278     77382        0    0    0 05:33:07            3       11 

Total number of neighbors 4

The local 10.158.1.42 interface is properly in the mconvrf vrf:

# ip -d addr show dev mcgreuswe1
32: mcgreuswe1@NONE: <POINTOPOINT,MULTICAST,NOARP,ALLMULTI,UP,LOWER_UP> mtu 1468 qdisc noqueue master mconvrf state UNKNOWN group default qlen 1000
    link/gre 10.24.51.228 peer 10.145.48.15 promiscuity 0 
    gre remote 10.145.48.15 local 10.24.51.228 ttl 64 encap fou encap-sport auto encap-dport 4754 noencap-csum noencap-csum6 noencap-remcsum 
    vrf_slave table 10 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    inet 10.158.1.42/30 scope global mcgreuswe1
       valid_lft forever preferred_lft forever

This is what I see in tcpdump:

# tcpdump -n -p -i mcgreuswe1 port msdp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on mcgreuswe1, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
16:37:35.012353 IP 10.158.1.41.43953 > 10.158.1.42.msdp: Flags [S], seq 2826309303, win 64260, options [mss 1428,sackOK,TS val 2129474871 ecr 0,nop,wscale 11], length 0
16:37:35.012372 IP 10.158.1.42.msdp > 10.158.1.41.43953: Flags [S.], seq 4063148055, ack 2826309304, win 65136, options [mss 1428,sackOK,TS val 378606635 ecr 2129474871,nop,wscale 11], length 0
16:37:35.073063 IP 10.158.1.41.43953 > 10.158.1.42.msdp: Flags [.], ack 1, win 32, options [nop,nop,TS val 2129474932 ecr 378606635], length 0
16:37:35.073113 IP 10.158.1.42.msdp > 10.158.1.41.43953: Flags [F.], seq 1, ack 1, win 32, options [nop,nop,TS val 378606696 ecr 2129474932], length 0
16:37:35.073131 IP 10.158.1.41.43953 > 10.158.1.42.msdp: Flags [P.], seq 1:4, ack 1, win 32, options [nop,nop,TS val 2129474932 ecr 378606635], length 3 msdp: Keepalive
16:37:35.073143 IP 10.158.1.42.msdp > 10.158.1.41.43953: Flags [R], seq 4063148056, win 0, length 0
16:37:35.134768 IP 10.158.1.41.43953 > 10.158.1.42.msdp: Flags [R], seq 2826309304, win 0, length 0

So the connection is established and then dropped right away by 10.158.1.42, before any data is exchanged.

[X] Did you check if this is a duplicate issue? [ ] Did you test it on the latest FRRouting/frr master branch?

The OS is Linux (CentOS7) with a local custom built kernel (vanilla 5.4.49).

ionb42 commented 2 years ago

I can confirm that restarting frr on 10.158.1.42, without changing anything else, resolved the issue.

ionb42 commented 2 years ago

I suspect I have an explanation for the above, after having thought about it some more. It revolves around net.ipv4.tcp_l3mdev_accept = 1, which is set on the machine running frr as per the recommendation from an older frr manual (v7.1, I believe).

Anyway, the theory is as follows: this particular frr instance does everything inside a non-default VRF, but at some point somebody accidentally configured (and then removed) an MSDP peer in the default VRF. That created an MSDP socket in the default VRF, which likely wasn't closed when the MSDP peer was removed. With tcp_l3mdev_accept=1, both MSDP sockets (the one in the default VRF and the one in the non-default VRF) could accept an incoming MSDP connection, but if the one in the default VRF got it first, it would then reject it because the default VRF had no MSDP peers configured.

I verified this on a test instance. After adding an MSDP peer in listening mode (source IP greater than peer IP) in the non-default VRF, a listening MSDP socket showed up in 'ss -tnap' in that VRF:

LISTEN     0      3      *%mconvrf:639                      *:*                   users:(("pimd",pid=15250,fd=40))

The socket did not go away as I removed that listening peer. Then I added a listening MSDP peer in the default VRF and sure enough a second listening MSDP showed up, in the default VRF:

LISTEN     0      3            *:639                      *:*                   users:(("pimd",pid=15250,fd=41))
LISTEN     0      3      *%mconvrf:639                      *:*                   users:(("pimd",pid=15250,fd=40))

Again, the listening socket did not go away after removing the listening peer.

Arguably tcp_l3mdev_accept shouldn't be set anyway, so that's what we're going to do on our side. Whether the MSDP socket should linger around when the last listening peer is removed is a question I leave to people who know the frr code better. :-)

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 180 days with no activity. Comment or remove the autoclose label in order to avoid having this issue closed.

frrbot[bot] commented 1 year ago

This issue will be automatically closed in the specified period unless there is further activity.