Closed BorisMulder-CSL closed 5 years ago
I adapted the issue title as nic_router also does not call Mac_allocator:free().
@chelmuth Thanks for pointing out this significant issue in the NIC router!
I'll try to fix both components in the next few days.
Fixed: ceab90e3d6 nic_router/nic_bridge: free MAC addresses
The commit also adds a new test component named "nic_stress". In contrast to "net_flood" (which I'd like to rename "net_stress" in the future) the "nic_stress" component aims for low-level NIC interactions without considering network protocols while the "net_stress" component aims for the corner cases in the field of common network protocols. Currently the "nic_stress" component only tests the creation and destruction of loads of NIC sessions.
There are two new tests "nic_bridge_stress" and "nic_router_stress" that are added to the autopilot list.
Here's a fix-up that fixes the test author and moves the run scripts to os: 8a63ec49a6 Fixup "nic_router/nic_bridge: free MAC addresses" (author, test repo)
When the freeing of MAC addresses is removed from the NIC router again, the new test also reveals a fault in the NIC router as soon as the limit of MAC addresses is reached. But this should be handled in a dedicated issue.
Had to rebase: 60935f9a5f Fixup "nic_router/nic_bridge: free MAC addresses" (author, test repo) e3a6eec00a nic_router/nic_bridge: free MAC addresses
With the two new *_stress autopilot tests we also got 15 new test timeouts last night. @m-stein what is your suggestion to reduce the noise? Please keep in mind that the total number of failing nic tests was 33 last night.
I've spent some hours debugging the nic_router_stress test on several platforms. One problem is the destruction of an undissolved signal context during the constructor of the Packet_stream_source while creating a new NIC session. But I haven't found a fix for this so far.
@nfeske These two should fix the nic_*_stress issues: 28770f1eac Fixup "nic_router/nic_bridge: free MAC addresses" (nic_router_stress: fix sel4/foc/fiasco) 5e375ec803 Fixup "nic_router/nic_bridge: free MAC addresses" (nic_stress: handle exception)
@nfeske I forgot this one: 0fdb30d181 Fixup "nic_router/nic_bridge: free MAC addresses" (nic_bridge_stress: fix sel4/foc/fiasco)
Thanks a lot @m-stein! I merged the 3 fixups to staging.
Only on fiasco+x86_32+hardware nic_bridge_stress is still failing:
[2019-03-27 05:22:28] [init -> nic_stress_1] round 22/22 nic 10/11 mac 02:02:02:02:42:08
[2019-03-27 05:22:28] [init -> nic_stress_1] round 22/22 nic 11/11 mac 02:02:02:02:42:09
[2019-03-27 05:22:28] [init -> nic_stress_2] round 16/16 nic 1/16 mac 02:02:02:02:42:00
[2019-03-27 05:22:28] [init -> nic_stress_1] --- finished NIC stress test ---
[2019-03-27 05:22:28] [init] child "nic_stress_1" exited with exit value 0
[2019-03-27 05:22:28] [init] Error: ipc_reply_and_wait error 0x10
[2019-03-27 05:22:28] [init -> nic_stress_2] round 16/16 nic 2/16 mac 02:02:02:02:42:01
[2019-03-27 05:27:21] Error: Test execution timed out
I guess it's something with parent.exit(). If so, I plan to simply circumvent the call of parent.exit() on fiasco.
@nfeske This fix should solve the above-mentioned problem: c3f0f522de Fixup "nic_router/nic_bridge: free MAC addresses" (nic_bridge_stress: fix fiasco ipc error)
Thanks for the fixup, which I merged to staging just now. Quirks like this are unfortunate but now this special case is documented in the run script, which is nice. The code de-duplication is all the better.
:-)
Whenever a component that uses a NIC session from a nic_bridge is started and stopped, a new MAC address is assigned each time the component creates its session, even when the component has a policy in nic_bridge for a static IP. The MAC addresses in this pool are never released, so after 2^8 allocations the nic_bridge fails and does not hand out any further sessions.
This can be verified by running this script that starts and stops fetchurl for a while: init_fetchurl.run.txt
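The exhaustion described above can be illustrated with a minimal, self-contained C++ sketch. It is not the nic_bridge code: `Mac_pool` and `sessions_served` are hypothetical names, and the pool is assumed to vary only one byte of the address, which gives the 2^8 limit mentioned in the report.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a MAC allocator (assumption, not the real
// nic_bridge class): only one byte of the address varies, so at most
// 256 addresses can be in use at the same time.
class Mac_pool
{
	private:

		std::array<bool, 256> _used { };

	public:

		// Return the variable byte of a free address, or nothing if the
		// pool is exhausted.
		std::optional<std::uint8_t> alloc()
		{
			for (std::size_t i = 0; i < _used.size(); i++) {
				if (!_used[i]) {
					_used[i] = true;
					return static_cast<std::uint8_t>(i);
				}
			}
			return std::nullopt;
		}

		// Hand an address back to the pool when its session is closed.
		void free(std::uint8_t id)
		{
			assert(_used[id]); // catch a double free in debug builds
			_used[id] = false;
		}
};

// Open and close 'attempts' sessions one after another and count how many
// of them obtained an address. With 'release_on_close == false' the pool
// leaks one address per session, as in the reported behaviour.
unsigned sessions_served(unsigned attempts, bool release_on_close)
{
	Mac_pool pool;
	unsigned served = 0;

	for (unsigned i = 0; i < attempts; i++) {
		std::optional<std::uint8_t> mac = pool.alloc();
		if (!mac)
			break; // pool exhausted, the session request is denied

		served++;
		if (release_on_close)
			pool.free(*mac);
	}
	return served;
}
```

Calling `sessions_served(1000, false)` stops after 256 sessions because no address is ever returned, whereas `sessions_served(1000, true)` serves all 1000 requests, since each closed session frees its address for the next one.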