Open Jrenk opened 1 year ago
The stack trace points to the sendMessage
function. As I already mentioned this is the only place where the WriteJSON
or any other Write
of the gorilla package gets called. So there is no possibility of the application calling WriteJSON
without calling mutex.Lock()
beforehand. I don't think this issue should be closed.
@Jrenk hello, could you provide a reproducing example/test/benchmark then to make it clear that it's an issue of this package?
@Jrenk I'm running into a similar situation. I have a loop in a goroutine that writes to an incoming socket connection for the heartbeat response. Before each write, I lock a common mutex and unlock after the write. The issue seems like it may be tied to a race condition as the issue doesn't happen consistently. Did you figure out what was causing your issue?
@NomNomCameron No I didn't figure it out. The problem happened very rarely and it didn't occur for a very long time. I'm still not sure what causes this sadly.
@Jrenk
Could it be someone pinged your websocket server and the default ping handler tried to write a pong while it was locked?
edit:I might test this later today, we'll see.
I'm imagining a pull request where configuring the websockets server would ask if you want to stick a lock or mutex or something a couple layers deep. Any thoughts on that?
@ayjayt that could actually be the problem. Let me know if you tested this. A lock in the package itself could resolve this as you said.
Yeah, I mean it has to be it. I might test it, but you should use
Conn.SetPingHandler(func(string) error)
anyway to protect against race conditions.
Here is what the default handler looks like:
h = func(message string) error {
err := c.WriteControl(PongMessage, []byte(message), time.Now().Add(writeWait))
if err == ErrCloseSent {
return nil
} else if e, ok := err.(net.Error); ok && e.Temporary() {
return nil
}
return err
}
from: https://github.com/gorilla/websocket/blob/9111bb834a68b893cebbbaed5060bdbc1d9ab7d2/conn.go#L1158
@coreydaley is this not a bug? If I'm correct, anyone who uses any *Write*()
function at anytime might collide with the WriteControl()
called by the default handler and produce a concurrent write error. I am not confident in this!
edit: especially since most people are going to run their read loops and write loops in separate go routines, and the default handler is actually called automatically by NextReaders()
's advanceFrame()
call
From Gorilla WebSocket docs:
Connections support one concurrent reader and one concurrent writer.
Applications are responsible for ensuring that no more than one goroutine calls the write methods (NextWriter, SetWriteDeadline, WriteMessage, WriteJSON, EnableWriteCompression, SetCompressionLevel) concurrently and that no more than one goroutine calls the read methods (NextReader, SetReadDeadline, ReadMessage, ReadJSON, SetPongHandler, SetPingHandler) concurrently.
The Close and WriteControl methods can be called concurrently with all other methods.
As I mentioned above - if you think it's a bug in a package you better provide a reproducer instead of guessing. You can use https://pkg.go.dev/golang.org/x/tools/cmd/stress tool to increase the chance of finding the reliable reproducer.
if you think it's a bug in a package you better provide a reproducer instead of guessing.
If you want me to spend my time working on this project, and building something to reproduce this bug, you better be nice to me!
We're having a discussion before spending time writing code.
[contributors note: redaction was not needed]
anyone who uses any Write() function at anytime might collide with the WriteControl() called by the default handler and produce a concurrent write error.
The websocket connection uses a mutex (implemented using a channel) to prevent WriteControl
from colliding with the *Write*
methods. The websocket connection write methods acquire the mutex before executing any write related methods on the network connection (SetWriteDeadline and Write).
(Why the websocket connection uses a channel instead of a sync.Mutex is a puzzle for another day.)
Please keep all comments and conversations civil, please keep in mind that Github users are from all over the world and speak hundred of different languages, and their native languages are not english, but they are being gracious enough to use english here. So do not assume that anything is being said with a specific tone or meaning. Disrespectful language will not be tolerated in this community.
We are almost done with the project transition and then we will begin triaging and working on issues, but we are not quite to that point yet. If you think that you have found a bug and would like to submit a pull request, please do so and we will review it as part of triaging/addressing this issue.
If I'm going to be chastised for asking someone to speak respectfully to me, I will simply not participate here further.
If I'm going to be chastised for asking someone to speak respectfully to me, I will simply not participate here further.
That is your prerogative. This is not the first comment within your short time of contributing to this project that could be deemed disrespectful
to someone who has disagreed with you or questioned you: https://github.com/gorilla/websocket/issues/841#issuecomment-1644901325
As far as I am concerned that is two strikes. One more and we will have to discuss blocking you from interacting with the organziation.
@e9x That sort of comment will not be tolerated. I sincerely hope that you are able to find help that allows you to be a more civil member of society.
Not that it will do any good but I reported the comment from @e9x, not sure what I did to elicit such a comment.
Alright, I'm very tired, but I think I've reviewed this ticket enough to come to the conclusion that this should technically be considered a bug. While the documentation does specify that it supports only one reader/writer doing so concurrently at a time, it doesn't make it clear that if you're doing your own concurrent writes you'd also have to overwrite the ping handler.
In this case the appropriate solution is likely to document the need for the ping handler being specified if performing your own concurrent writing and to have this resolved in what ever future version of this library.
if you're doing your own concurrent writes you'd also have to overwrite the ping handler.
That's not true. WriteControl, the method used by the ping handler, can be called concurrently with the other write methods. The connection is implemented to support the concurrent call. The documentation states that the concurrent call is allowed.
The concurrent write to websocket connection
panic cannot be triggered by the WriteControl method because the WriteControl method does not directly or indirectly call the flushFrame method where the concurrency detector is implemented. The concurrent write to websocket connection
panic is caused by concurrent application calls to WriteMessage, WriteJSON and NextWriter, not the ping handler.
Edit: Previous comments (1, 2) also explain why the ping handler is not a problem.
Describe the problem you're having
Using the gorilla/websocket package I sometimes get the following error, despite using mutex.Lock:
concurrent write to websocket connection
Versions
1.18
v1.5.0
"Show me the code!"
In an http handler the websocket clients get created:
The
sendMessage
function gets called from multiple go-routines and theWriteJSON
never gets called directly anywhere else. From my understanding themutex.Lock()
should prevent concurrent websocket writes.