Open dvyukov opened 2 years ago
From Eric Dumazet:
Part of the cost comes from inet_twsk_purge(), scanning half million slot TCP ehash table.
I attempted to get rid of it this back in January, but had to revert this in May, (4bbdb2b00bb2f23a2)
Good news is that soon, each netns will be able to use its own (smaller) hash table, of arbitrary size
[https://patchwork.kernel.org/project/netdevbpf/list/?series=675049](https://www.google.com/url?q=https://patchwork.kernel.org/project/netdevbpf/list/?series%3D675049&sa=D&source=buganizer&usg=AOvVaw0Xps7VAn87WlZaJ_JaAkKV)
Relevant (and last of the series) patch
[https://patchwork.kernel.org/project/netdevbpf/patch/20220908011022.45342-7-kuniyu@amazon.com/](https://www.google.com/url?q=https://patchwork.kernel.org/project/netdevbpf/patch/20220908011022.45342-7-kuniyu@amazon.com/&sa=D&source=buganizer&usg=AOvVaw1SDUzKQduS_aTO6rAF3qSe)
So syzbot will be able to use a much smaller per net-ns hash, making netns dismantle faster ?
Although I am not so sure, because currently inet_twsk_purge() is called for a batch of dismantling netns.
# sysctl -w net.ipv4.tcp_child_ehash_entries=128
net.ipv4.tcp_child_ehash_entries = 128
unshare ...
Currently network namespaces are shared across multiple tests. This was done for performance reasons. Creation/destruction of net ns is slow + it's globally serialized thus parallel test processes suffer + destruction is asynchronous and we can create them faster than they are destroyed and deplete system memory.
However, shared namespaces lead to some issues:
We should re-consider use of net namespace per test. There were some performance improvements in this area (batched destruction) + we can try to control number of pending to-be-destroyed namespaces with
/proc/sys/user/max_net_namespaces
. It's decremented at the very end of destruction thus account for pending namespaces.However, it still seems to be significantly slow. Here is a quick benchmarking test: https://github.com/dvyukov/syzkaller/commit/5c6ebedf93986ba0f99b8491cfa6665fbb1b6fe9 https://gist.github.com/dvyukov/02cacfe2b5aa98e1f42ed67f31dff5b7 (currently it also shows unbounded memory growth due to pending namespaces) On our syzbot kernel in qemu with 2 CPUs it shows not very promising numbers (also shows that max_net_namespaces works):