FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

Empty OSPF routing table (Network Type Null issue?) #4178

Open piotrjurkiewicz opened 5 years ago

piotrjurkiewicz commented 5 years ago

I noticed that the OSPF daemon sometimes does not generate a routing table, despite having a populated link-state database, fully adjacent neighbors, etc.

The bug appears randomly and not with all topologies. For example, with the nobel_eu topology it occurs in approximately 1 in 20 network startups, on a random node. It can also be triggered by link cost changes during network operation. With the polska topology, on the other hand, it does not happen at all.

I have observed it since at least January. I am using master on Linux 4.19 in Mininet.

I have dumped some information from an ospfd instance that was hit by the bug: Zagreb.txt

First, ospfd has all LSAs and fully adjacent neighbors:

 Area ID: 0.0.0.0 (Backbone)
   Number of interfaces in this area: Total: 3, Active: 3
   Number of fully adjacent neighbors in this area: 3
   Area has no authentication
   SPF algorithm executed 16 times
   Number of LSA 69
   Number of router LSA 28. Checksum Sum 0x000dff75
   Number of network LSA 41. Checksum Sum 0x0012f00c

However, the routing tables are empty:

Zagreb# show ip ospf route
============ OSPF network routing table ============

============ OSPF router routing table =============

============ OSPF external routing table ===========

The link count in the database for this router's own entry is zero:

Zagreb# show ip ospf database

       OSPF Router with ID (10.127.0.38)

                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age  Seq#       CkSum  Link count
10.127.0.1      10.127.0.1       355 0x80000012 0x92bd 4
10.127.0.2      10.127.0.2       366 0x8000000e 0xa7ef 3
10.127.0.6      10.127.0.6       407 0x8000000b 0x50d1 2
10.127.0.10     10.127.0.10      394 0x8000000d 0x3444 3
10.127.0.14     10.127.0.14      376 0x8000000e 0xba34 3
10.127.0.17     10.127.0.17      392 0x8000000c 0xdf9c 2
10.127.0.18     10.127.0.18      361 0x8000000e 0x37b0 3
10.127.0.22     10.127.0.22      350 0x8000000f 0x9b2e 4
10.127.0.25     10.127.0.25      405 0x8000000b 0xa3a9 2
10.127.0.26     10.127.0.26      303 0x8000000f 0x8ba1 3
10.127.0.30     10.127.0.30      385 0x8000000a 0xcd2d 2
10.127.0.34     10.127.0.34      362 0x8000000e 0x6c8a 3
10.127.0.38     10.127.0.38      419 0x8000000e 0x30c6 0  <----- HERE
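A check like the one above can be automated when running many Mininet startups. The following is a minimal sketch, assuming the `show ip ospf database` output has already been captured as text; the function name `zero_link_router_lsas` and the column layout it expects are assumptions based only on the table shown here, not an FRR API.

```python
import re

# One row of the "Router Link States" table:
# Link ID, ADV Router, Age, Seq#, CkSum, Link count
ROW = re.compile(
    r"^(\d+\.\d+\.\d+\.\d+)\s+(\d+\.\d+\.\d+\.\d+)\s+\d+\s+"
    r"0x[0-9a-fA-F]+\s+0x[0-9a-fA-F]+\s+(\d+)\b"
)


def zero_link_router_lsas(output):
    """Return advertising router IDs whose router LSA lists zero links.

    Hypothetical helper: it only inspects rows that follow a
    "Router Link States" heading in the captured text.
    """
    hits = []
    in_router_section = False
    for line in output.splitlines():
        stripped = line.strip()
        if "Link States" in stripped:
            # Track which LSA table we are in; only router LSAs have a link count.
            in_router_section = stripped.startswith("Router Link States")
            continue
        if not in_router_section:
            continue
        match = ROW.match(stripped)
        if match and int(match.group(3)) == 0:
            hits.append(match.group(2))
    return hits


# Two rows copied from the dump above.
SAMPLE = """\
                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age  Seq#       CkSum  Link count
10.127.0.34     10.127.0.34      362 0x8000000e 0x6c8a 3
10.127.0.38     10.127.0.38      419 0x8000000e 0x30c6 0
"""

print(zero_link_router_lsas(SAMPLE))
```

Comparing the returned IDs against the local router ID would flag a node that is in the state described in this report.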

Examining the interface details shows something alarming:

Zagreb# show ip ospf interface
eth1 is up
  ifindex 2, MTU 1500 bytes, BW 10 Mbit <UP,BROADCAST,RUNNING,PROMISC,MULTICAST>
  Internet Address 10.127.0.38/30, Broadcast 10.127.0.39, Area 0.0.0.0
  MTU mismatch detection: enabled
  Router ID 10.127.0.38, Network Type Null, Cost: 10
  Transmit Delay is 1 sec, State DR, Priority 1
  Backup Designated Router (ID) 10.127.0.18, Interface Address 10.127.0.37
  Saved Network-LSA sequence number 0x80000005
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
    Hello due in 8.153s
  Neighbor Count is 1, Adjacent neighbor count is 1
eth2 is up
  ifindex 4, MTU 1500 bytes, BW 10 Mbit <UP,BROADCAST,RUNNING,PROMISC,MULTICAST>
  Internet Address 10.127.0.146/30, Broadcast 10.127.0.147, Area 0.0.0.0
  MTU mismatch detection: enabled
  Router ID 10.127.0.38, Network Type Null, Cost: 10
  Transmit Delay is 1 sec, State DR, Priority 1
  Backup Designated Router (ID) 10.127.0.22, Interface Address 10.127.0.145
  Saved Network-LSA sequence number 0x80000005
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
    Hello due in 8.153s
  Neighbor Count is 1, Adjacent neighbor count is 1
eth3 is up
  ifindex 6, MTU 1500 bytes, BW 10 Mbit <UP,BROADCAST,RUNNING,PROMISC,MULTICAST>
  Internet Address 10.127.0.158/30, Broadcast 10.127.0.159, Area 0.0.0.0
  MTU mismatch detection: enabled
  Router ID 10.127.0.38, Network Type Null, Cost: 10
  Transmit Delay is 1 sec, State Backup, Priority 1
  Backup Designated Router (ID) 10.127.0.38, Interface Address 10.127.0.158
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
    Hello due in 8.153s
  Neighbor Count is 1, Adjacent neighbor count is 1
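To spot affected nodes without reading the output by hand, the interface dump can be scanned for the network type. This is a rough sketch, assuming the text of `show ip ospf interface` is available as a string; the helper names are hypothetical and the parsing relies only on the line shapes visible in the output above. The `BROADCAST` value in the sample is an illustrative assumption, not taken from the dump.

```python
import re


def interface_network_types(output):
    """Map interface name to the network type string found in the dump."""
    types = {}
    current = None
    for line in output.splitlines():
        # Interface headers start at column 0, e.g. "eth1 is up".
        header = re.match(r"^(\S+) is (up|down)", line)
        if header:
            current = header.group(1)
            continue
        found = re.search(r"Network Type ([A-Za-z-]+)", line)
        if found and current is not None:
            types[current] = found.group(1)
    return types


def null_type_interfaces(output):
    """Return the interfaces whose network type is reported as Null."""
    return [
        name
        for name, kind in interface_network_types(output).items()
        if kind == "Null"
    ]


# First line pair is from the dump above; the eth2 value is made up for contrast.
SAMPLE = """\
eth1 is up
  Router ID 10.127.0.38, Network Type Null, Cost: 10
eth2 is up
  Router ID 10.127.0.38, Network Type BROADCAST, Cost: 10
"""

print(null_type_interfaces(SAMPLE))
```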

All interfaces have Network Type Null. I started to wonder what would happen if I hard-coded the network type in the configuration. I changed ospfd.conf from:

interface eth1
 ip ospf cost 10
 ip ospf area 0
!
interface eth2
 ip ospf cost 10
 ip ospf area 0
!
interface eth3
 ip ospf cost 10
 ip ospf area 0
!
router ospf
 ospf router-id 10.127.0.38

to:

interface eth1
 ip ospf cost 10
 ip ospf area 0
 ip ospf network broadcast
!
interface eth2
 ip ospf cost 10
 ip ospf area 0
 ip ospf network broadcast
!
interface eth3
 ip ospf cost 10
 ip ospf area 0
 ip ospf network broadcast
!
router ospf
 ospf router-id 10.127.0.38

Surprisingly, this solved the problem! The routing table is now always generated, both on startup and after cost changes. However, specifying the network type should not be required: ospfd should assume the BROADCAST network type by default when none is specified. Apparently there is a problem with initialization: the default value is not being set, and the network type field stays zeroed with an invalid (null) value.
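For anyone generating ospfd.conf files for many Mininet nodes, the workaround can be applied mechanically. The sketch below is an illustration only, assuming configs shaped like the ones shown above (an `interface` line followed by lines indented with one space); `pin_broadcast` is a hypothetical helper, not part of FRR.

```python
def pin_broadcast(config_text):
    """Add ' ip ospf network broadcast' to interface stanzas lacking a network type."""
    out = []
    stanza = None  # lines of the interface stanza being collected, or None

    def flush():
        nonlocal stanza
        if stanza is not None:
            has_type = any(
                item.strip().startswith("ip ospf network ") for item in stanza
            )
            if not has_type:
                stanza.append(" ip ospf network broadcast")
            out.extend(stanza)
            stanza = None

    for line in config_text.splitlines():
        if line.startswith("interface "):
            flush()
            stanza = [line]
        elif stanza is not None and line.startswith(" "):
            # Indented line belonging to the current interface stanza.
            stanza.append(line)
        else:
            # "!" separator or any other top-level line ends the stanza.
            flush()
            out.append(line)
    flush()
    return "\n".join(out) + "\n"


ORIGINAL = """\
interface eth1
 ip ospf cost 10
 ip ospf area 0
!
router ospf
 ospf router-id 10.127.0.38
"""

print(pin_broadcast(ORIGINAL), end="")
```

Running the function on its own output leaves the text unchanged, so it can be applied repeatedly to the same file.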

ton31337 commented 1 year ago

@piotrjurkiewicz lots of changes have come in since then. Could you verify with the latest release?

marekr87 commented 4 months ago

Hello. Unfortunately, we see the same problem on version 9.1. The only difference between our configuration and Piotr's is that we have had the broadcast network type set from the beginning.