olahol / melody

:notes: Minimalist websocket framework for Go
BSD 2-Clause "Simplified" License
3.76k stars 365 forks source link

Fix concurrent panic #67

Closed lesismal closed 2 years ago

lesismal commented 3 years ago

if two goroutines exec these two lines nearly the same time: https://github.com/olahol/melody/blob/master/session.go#L59 https://github.com/olahol/melody/blob/master/session.go#L24

the two goroutines will get the same opening state, and if close(s.output) exec first https://github.com/olahol/melody/blob/master/session.go#L63

case s.output <- message will panic: https://github.com/olahol/melody/blob/master/session.go#L30

try this example to reproduce this bug:

package main

import (
    "log"
    "sync"
    "time"

    "github.com/gin-gonic/gin"
    "github.com/gorilla/websocket"
    "github.com/olahol/melody"
)

func client(wg *sync.WaitGroup) {
    defer wg.Done()
    c, _, err := websocket.DefaultDialer.Dial("ws://localhost:5000/ws", nil)
    if err != nil {
        log.Fatal("dial:", err)
    }
    defer c.Close()

    text := "test"
    err = c.WriteMessage(websocket.TextMessage, []byte(text))
    if err != nil {
        log.Printf("write: %v", err)
    }

    // try to trigger that melody Session at the server side getting the same opening state before close(s.input)
    time.AfterFunc(time.Second*2, func() {
        c.Close()
    })

    for {
        _, _, err := c.ReadMessage()
        if err != nil {
            return
        }
    }
}

func main() {
    r := gin.Default()
    m := melody.New()

    r.GET("/ws", func(c *gin.Context) {
        m.HandleRequest(c.Writer, c.Request)
    })

    m.HandleMessage(func(s *melody.Session, msg []byte) {
        go func() {
            // try to trigger that getting the same opening state before s.input<-
            for s.Write(msg) == nil {
            }
        }()
    })

    go func() {
        for {
            wg := &sync.WaitGroup{}
            for i := 0; i < 20; i++ {
                wg.Add(1)
                go client(wg)
            }
            wg.Wait()
        }
    }()

    r.Run("localhost:5000")
}

then wait for enough time, we will get:

panic: send on closed channel

goroutine 2105 [running]:
github.com/olahol/melody.(*Session).writeMessage(0xc000483500, 0xc000ba55f0)
        somedir/src/github.com/olahol/melody/session.go:30 +0x6c
github.com/olahol/melody.(*Session).Write(0xc000483500, 0xc000714400, 0x4, 0x200, 0x0, 0x0)
        somedir/src/github.com/olahol/melody/session.go:149 +0x90
main.main.func2.1(0xc000483500, 0xc000714400, 0x4, 0x200)
        somedir/src/github.com/olahol/melody/examples/chat/main.go:50 +0x50
created by main.main.func2
        somedir/src/github.com/olahol/melody/examples/chat/main.go:48 +0x65
iwctwbai commented 2 years ago

Why remove the pointer of sync.RWMutex. Does this not go wrong in some cases, such as when using a shallow copy?

lesismal commented 2 years ago

I only wanted to pr this commit: https://github.com/lesismal/melody/commit/e21886f3a7b601a5509274fb14dcc66d859304a8 but when I committed some new on my branch, it was auto appended to this pr by github, I just reverted the latest and leave only https://github.com/lesismal/melody/commit/e21886f3a7b601a5509274fb14dcc66d859304a8 here.

lesismal commented 2 years ago

Why remove the pointer of sync.RWMutex. Does this not go wrong in some cases, such as when using a shallow copy?

It's not removing the pointer of sync.RWMutex.

Plz see the 1st floor, the close() doesn't guard the process of changing the conn's state and closing the output chan, some other goroutine may push data to the output chan after the chan has been closed.

And run the testing code, you will reproduce the panic.

iwctwbai commented 2 years ago

@lesismal The problem you are referring to does exist and you fixed it.

But I mean here https://github.com/olahol/melody/commit/c116958ce5fce3fc54e6e1f282353d294df0d547#diff-3d4a6b7d54b5107bc5694b7a06383f9ec4c68f9abe43ba0942f5d970f3a8ccacL21

lesismal commented 2 years ago

I only wanted to pr this commit: lesismal@e21886f but when I committed some new on my branch, it was auto appended to this pr by github, I just reverted the latest and leave only lesismal@e21886f here.

@iwctwbai here

iwctwbai commented 2 years ago

Yes, I just happened to see your code and was a little confused @lesismal

lesismal commented 2 years ago

@iwctwbai After I have made this pr, I saw this: https://github.com/olahol/melody/issues/58 , so I think this pr would not be reviewed and merged for a long time. After that I make new commit on my branch and didn't come back to see this pr again and didn't find that it was auto appended to this pr until you leave a message:joy:, thank you for your attention and review!

Hope everything is fine with olahol.

iwctwbai commented 2 years ago

Hope everything is fine with olahol.

z9905080 commented 2 years ago

@lesismal Hello, I have an question about your commit content is not close output channel. it's no problem of send data to no read channel(output)?

my soluation is defer a recover func in writeMessage. however, this is not the best soluation.

lesismal commented 2 years ago

Because writeMessage use select with default, then it won't block:

select {
case s.output <- message:
default:
    s.melody.errorHandler(s, errors.New("session message buffer is full"))
}

The chan will be gc even it is not closed.

So, it's ok.

olahol commented 2 years ago

Thank you for the fix :+1: Sorry for ghosting for so long

lesismal commented 2 years ago

Really happy to see u again, welcome back!

olahol commented 2 years ago

Thank you for the kind words and the help fixing this old library :slightly_smiling_face: