facebook / fboss

Facebook Open Switching System Software for controlling network switches.

warmboot removes "unreferenced" routes and then crashes if it comes back #28

Closed jaymzh closed 6 years ago

jaymzh commented 8 years ago

After FBOSS warm boots, the Trident still holds the routes I previously added with fboss_route.py, but list_routes does not show them. About a minute later, FBOSS removes those routes from the Trident as "unreferenced", and any attempt to add them back then causes FBOSS to crash.

I0113 04:56:06.449673  6445 ThreadManager.tcc:336] ThreadManager::add called with numa == true, but not a NumaThreadManager
I0113 04:56:06.450521  6465 SwSwitch.cpp:578] Updating state: old_gen=1 new_gen=2
E0113 04:56:06.475909  6465 SwSwitch.cpp:232] Unable to dump switch state to /var/facebook/fboss/crash/switch_state
F0113 04:56:06.476253  6465 SwSwitch.cpp:612] error applying state change to hardware: N8facebook5fboss8BcmErrorE: failed to create a route entry for fe80::/64 @ TO_CPU @egress 100002: Invalid parameter
*** Check failure stack trace: ***
    @     0x7fe13bc61778  (unknown)
    @     0x7fe13bc616b2  (unknown)
    @     0x7fe13bc610b4  (unknown)
    @     0x7fe13bc64055  (unknown)
    @          0x115d1d1  facebook::fboss::SwSwitch::applyUpdate()
    @          0x115c2f5  facebook::fboss::SwSwitch::handlePendingUpdates()
    @          0x115bf14  facebook::fboss::SwSwitch::handlePendingUpdatesHelper()
    @     0x7fe13b28d909  folly::EventBase::FunctionRunner::messageAvailable()
    @     0x7fe13b28f76e  folly::NotificationQueue<>::Consumer::consumeMessages()
    @     0x7fe1336bc3dc  (unknown)
    @     0x7fe13b28af8f  folly::EventBase::loopBody()
    @     0x7fe13b28b8e4  folly::EventBase::loopForever()
    @          0x115f46e  facebook::fboss::SwSwitch::threadLoop()
    @          0x115f1b5  _ZZN8facebook5fboss8SwSwitch12startThreadsEvENKUlvE0_clEv
    @          0x1163790  _ZNSt12_Bind_simpleIFZN8facebook5fboss8SwSwitch12startThreadsEvEUlvE0_vEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE
    @          0x1163624  _ZNSt12_Bind_simpleIFZN8facebook5fboss8SwSwitch12startThreadsEvEUlvE0_vEEclEv
    @          0x1163510  _ZNSt6thread5_ImplISt12_Bind_simpleIFZN8facebook5fboss8SwSwitch12startThreadsEvEUlvE0_vEEE6_M_runEv
    @     0x7fe13566a970  (unknown)
    @     0x7fe13ac7b0a4  start_thread
    @     0x7fe134dda04d  (unknown)
    @              (nil)  (unknown)
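
The sequence reported above (routes survive in hardware across a warm boot, are not known to the restored software state, get swept as "unreferenced", and are then re-added) can be illustrated with a small model. This is only a sketch under stated assumptions, not FBOSS code: the class `WarmBootTable`, its methods, and the example prefixes are all hypothetical, and the real reconciliation logic may differ. It shows the behavior one would expect from a consistent implementation, where re-adding a swept route simply succeeds.

```python
# Hypothetical model of warm-boot route reconciliation.
# None of these names come from FBOSS; they exist only for illustration.


class WarmBootTable:
    """Tracks entries recovered from hardware and which ones software has claimed."""

    def __init__(self, recovered_from_hw):
        # Entries found in the ASIC after a warm boot; none are referenced yet.
        self.hw = set(recovered_from_hw)
        self.referenced = set()

    def program(self, prefix):
        """Software adds a route: reuse a recovered entry or create a new one."""
        self.hw.add(prefix)
        self.referenced.add(prefix)

    def sweep_unreferenced(self):
        """Delete recovered entries that software never claimed and return them."""
        stale = self.hw - self.referenced
        self.hw -= stale
        return stale


# Two routes are present in hardware after the warm boot.
table = WarmBootTable({"192.0.2.0/24", "2001:db8::/32"})

# The restored software state only knows about one of them.
table.program("2001:db8::/32")

# The other is swept as unreferenced.
removed = table.sweep_unreferenced()
print(sorted(removed))  # ['192.0.2.0/24']

# Re-adding the swept route should succeed and leave the table consistent.
table.program("192.0.2.0/24")
print("192.0.2.0/24" in table.hw)  # True
```

In this model the sweep and the later re-add both leave the table in a valid state. The crash in the log above suggests that, in the reported build, the hardware and software views were not consistent at the point of the re-add.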

capveg commented 7 years ago

This should be fixed and working now. So is the build (!!); check out https://travis-ci.org/facebook/fboss.

Please let me know if you're still having this problem.