rakelkar / gonetsh

Go wrapper for the Windows netsh tool
Apache License 2.0
11 stars, 11 forks

Fix deadlock in go-powershell #11

Closed benmoss closed 5 years ago

benmoss commented 5 years ago

This is a fix for a bug we're seeing in Flannel, where occasionally a cluster would come up with pods on one of the nodes failing to reach pods on other nodes. We traced it to missing route entries, which we then traced to stuck goroutines in go-powershell.

What we have seen in our testing is that the bug is pretty non-deterministic, which is why the TestLeakyShells test runs 1000 times. The cause seems to be that PowerShell cmdlets run by go-powershell sometimes come back with extra \r\ns at the end of their output. The trick go-powershell uses right now relies on the boundary plus one \r\n being the final thing that stdout or stderr produces. With the extra newlines randomly inserted, the library misses the boundary and the streamReader func never finishes, leaving the goroutine deadlocked. This then causes non-deterministic failures in the Kubernetes cluster, like having some routes never show up.

The maintainer of bhendo/go-powershell has also said he does not plan to maintain it, so I forked it as well rather than trying to get a fix merged there.

benmoss commented 5 years ago

I'm not super thrilled with this solution. I never could get to the bottom of why these random newlines were appearing, so if you have any bandwidth to help with this I'd appreciate it :)