However, its memory still referenced in eigrp->topology_table .. pe->entries .. entry->adv_router. This results in garbage being used as gateway (entry->adv_router->src), generating the following error:
ZEBRA: Extended Error: Nexthop has invalid gateway
It can be easily seen in Valgrind:
Run eigrpd with Valgrind on one of the nodes.
Stop eigrpd process of one of its neighbors.
After timeout you will see the following message:
==1144== Conditional jump or move depends on uninitialised value(s)
==1144== at 0x115811: eigrp_metrics_is_same (in /usr/local/sbin/eigrpd)
==1144== by 0x118198: eigrp_topology_update_distance (in /usr/local/sbin/eigrpd)
==1144== by 0x11C4A8: ??? (in /usr/local/sbin/eigrpd)
==1144== by 0x11CB3A: eigrp_fsm_event (in /usr/local/sbin/eigrpd)
==1144== by 0x1184A0: eigrp_topology_neighbor_down (in /usr/local/sbin/eigrpd)
==1144== by 0x114CE4: eigrp_nbr_delete (in /usr/local/sbin/eigrpd)
==1144== by 0x114D8D: holddown_timer_expired (in /usr/local/sbin/eigrpd)
==1144== by 0x48A9191: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x4889DA2: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x111E67: main (in /usr/local/sbin/eigrpd)
Then connect with vty to the node running in Valgrind and enter the command: show ip eigrp topo. You will see the following Valgrind error:
==1144== Invalid read of size 4
==1144== at 0x113344: show_ip_eigrp_nexthop_entry (in /usr/local/sbin/eigrpd)
==1144== by 0x119B02: ??? (in /usr/local/sbin/eigrpd)
==1144== by 0x487555F: ??? (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x48775D7: cmd_execute_command (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x4877706: cmd_execute (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x48AC00D: ??? (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x48AD5F0: ??? (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x48A9191: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x4889DA2: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x111E67: main (in /usr/local/sbin/eigrpd)
==1144== Address 0x5e33a88 is 24 bytes inside a block of size 104 free'd
==1144== at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1144== by 0x114D8D: holddown_timer_expired (in /usr/local/sbin/eigrpd)
==1144== by 0x48A9191: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x4889DA2: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x111E67: main (in /usr/local/sbin/eigrpd)
==1144== Block was alloc'd at
==1144== at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1144== by 0x488CEFA: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x114B42: eigrp_nbr_new (in /usr/local/sbin/eigrpd)
==1144== by 0x114BC1: eigrp_nbr_get (in /usr/local/sbin/eigrpd)
==1144== by 0x113F27: eigrp_hello_receive (in /usr/local/sbin/eigrpd)
==1144== by 0x11681D: eigrp_read (in /usr/local/sbin/eigrpd)
==1144== by 0x48A9191: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x4889DA2: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==1144== by 0x111E67: main (in /usr/local/sbin/eigrpd)
Another way to see the bug is to:
Configure routers with variance > 1.
Raise delay on eigrpd process being monitored by Valgrind using vty.
You should see Valgrind errors and sometimes Nexthop has invalid gateway errors.
Generally, any function calling eigrp_nbr_delete leads to use after free.
Commenting out line:
XFREE(MTYPE_EIGRP_NEIGHBOR, nbr);
fixes the use after free, but also results in dead neighbors not being evicted from topologies and forwarding tables.
Function
eigrp_nbr_delete
frees thenbr
structure:https://github.com/FRRouting/frr/blob/2fa3198399df58b1b039f094ad03d7c1329a2a3d/eigrpd/eigrp_neighbor.c#L177-L193
However, its memory still referenced in
eigrp->topology_table .. pe->entries .. entry->adv_router
. This results in garbage being used as gateway (entry->adv_router->src
), generating the following error:It can be easily seen in Valgrind:
Run eigrpd with Valgrind on one of the nodes.
Stop eigrpd process of one of its neighbors.
After timeout you will see the following message:
Then connect with vty to the node running in Valgrind and enter the command:
show ip eigrp topo
. You will see the following Valgrind error:Another way to see the bug is to:
variance > 1
.Nexthop has invalid gateway
errors.Generally, any function calling
eigrp_nbr_delete
leads to use after free.Commenting out line:
fixes the use after free, but also results in dead neighbors not being evicted from topologies and forwarding tables.