vishvananda / netns

Simple network namespace handling for go.
Apache License 2.0

Access to namespace specific procfs entries fails occasionally with multiple goroutines switching NS #70

Open sandeshrk opened 1 year ago

sandeshrk commented 1 year ago

Hi,

We have two network namespaces, n1 and n2. Our Go program is started in n2. n1 contains a bond interface, and the program monitors its status by switching to n1 and reading /proc/net/bonding/bond0 (os.Read*). We follow the sequence of operations listed in the example (https://github.com/vishvananda/netns) to switch namespaces and read the file.

This works fine with a single goroutine. With 3-4 goroutines running the same sequence, occasionally one goroutine fails to read the bond0 file with the error "open /proc/net/bonding/bond0: no such file or directory". With 8 goroutines, several of them report the same failure.

I have confirmed that after the switch to n1, the /proc//task//ns/net entry for the goroutine's thread points to n1. I have also confirmed that each of these goroutines runs on its own thread (syscall.Gettid()). If I use exec.Command to copy the content to a different file instead of using os.Read, I do not see any access failures, with any number of goroutines.

What could be going on here? Reading procfs from multiple goroutines is not our real goal; it was just an easy way to reproduce the issue. In practice we can have 100+ goroutines, each switching namespaces for different activities, and we hit the procfs access failure on one of the goroutines tasked with reading the status.

Thanks!

amurchick commented 1 year ago

Same issue here with Go 1.19.5. I am using runtime.LockOSThread()/runtime.UnlockOSThread(), but it does not help :(