zephinzer / cloudshell

Xterm.js with a Go backend meant for use in containers
MIT License

race between goroutines: panic: sync: negative WaitGroup counter #3

Open gercker opened 2 years ago

gercker commented 2 years ago

Hi zephinzer,

Cool project! I have run into a race condition and have a suggestion for fixing it. Every now and then the program panics with "panic: sync: negative WaitGroup counter". This happens because waiter.Add(1) is called only once, while waiter.Done() is called more than once, from different goroutines.
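To illustrate the mechanism in isolation, here is a minimal sketch that is not taken from the cloudshell code. The function name doneTwice is hypothetical, and both Done() calls run in a single goroutine so that the panic is deterministic and can be recovered for display; in the handler the calls come from different goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// doneTwice (hypothetical name) calls Add(1) once and Done() twice,
// then returns the recovered panic message.
func doneTwice() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	var waiter sync.WaitGroup
	waiter.Add(1)
	waiter.Done() // counter: 1 -> 0
	waiter.Done() // counter: 0 -> -1, panics
	return "no panic"
}

func main() {
	fmt.Println(doneTwice()) // prints "sync: negative WaitGroup counter"
}
```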

In all cases so far, the pingpong goroutine finishes (logs) first, so the first waiter.Done() is called in line 124. Then line 149 also tries to decrement the counter:

WARN[15:07:08]handler_websocket.go xtermjs.GetHandler.func1.4 failed to read from tty: read /dev/ptmx: input/output error connection_uuid="75bd5ed0-1550-11ec-b29c-005056000bac"
WARN[15:07:08]handler_websocket.go xtermjs.GetHandler.func1.4 failed to send termination message from tty to xterm.js: use of closed network connection connection_uuid="75bd5ed0-1550-11ec-b29c-005056000bac"
panic: sync: negative WaitGroup counter

To fix it, I am using 3 channels (abortChan{1,2,3}), one for each of the 3 goroutines. The goroutines are the senders: when they hit a "serious" error, they close their channel (or could send a descriptive string) and return, rather than decrementing a WaitGroup.

The handler is the receiver: it blocks in a select until one of the goroutines closes its channel (or sends a short message). There is another channel, stopChan, for which the handler is the sender. The handler closes stopChan once the select gets a "signal" on one of the abortChans. The remaining goroutines do a non-blocking select on stopChan and return when they detect that it has been closed.

The pingpong goroutine always returns this way. The other goroutines usually do not, because they are normally blocked in connection.ReadMessage or tty.Read. Their shutdown (including closing their abortChan) will be handled by the deferred func.
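A rough sketch of the channel layout described above, under simplifying assumptions: the three goroutines are simulated by polling loops rather than real websocket or tty reads, the mapping of abortChan1..3 to specific goroutines is a guess, and worker, runHandler and failAt are hypothetical names. It is not the proposed patch.

```go
package main

import (
	"fmt"
	"time"
)

// worker simulates one goroutine. It closes its own abort channel exactly
// once on return and polls stop without blocking on every iteration.
func worker(stop <-chan struct{}, abort chan<- struct{}, failAt int) {
	defer close(abort)
	for i := 0; ; i++ {
		select {
		case <-stop:
			return // the handler asked everyone to shut down
		default:
		}
		if i == failAt {
			return // simulated "serious" error
		}
		time.Sleep(time.Millisecond)
	}
}

// runHandler returns the name of the goroutine that aborted first.
func runHandler() string {
	stopChan := make(chan struct{})
	abortChan1 := make(chan struct{}) // assumed: pingpong
	abortChan2 := make(chan struct{}) // assumed: tty reader
	abortChan3 := make(chan struct{}) // assumed: websocket reader

	go worker(stopChan, abortChan1, -1) // never fails on its own
	go worker(stopChan, abortChan2, 3)  // fails on its fourth iteration
	go worker(stopChan, abortChan3, -1) // never fails on its own

	var first string
	select { // block until any goroutine closes its abort channel
	case <-abortChan1:
		first = "pingpong"
	case <-abortChan2:
		first = "tty reader"
	case <-abortChan3:
		first = "websocket reader"
	}
	close(stopChan) // tell the remaining goroutines to return

	// A receive on a closed channel returns immediately, so this waits
	// for every goroutine without any counter that could go negative.
	<-abortChan1
	<-abortChan2
	<-abortChan3
	return first
}

func main() {
	fmt.Println("first to abort:", runHandler())
}
```

Since each goroutine closes only its own channel and the handler alone closes stopChan, no channel is closed twice. In the real handler the two readers would be blocked in their read calls instead of polling, which is why their shutdown is left to the deferred func.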

If you are interested, I can prepare a pull request.

Gerhard

lechuhuuha commented 1 year ago

Hi, do you still have the pull request?