surrealdb / surrealdb.go

SurrealDB SDK for Golang
https://surrealdb.com
Apache License 2.0

panic: repeated read on failed websocket connection #75

Open jimreesman opened 1 year ago

jimreesman commented 1 year ago

Hi, this is our first issue in this project. Please correct our style, conventions, etc. as needed.

We consistently encounter "panic: repeated read on failed websocket connection", raised from github.com/gorilla/websocket@v1.5.0/conn.go:1030. The attached code reproduces the failure with a mean time to failure (MTTF) of about 90 minutes. The test is passive: it just waits for an eventual websocket close followed by a read error. With active use of the connection the MTTF drops considerably, but we don't have a concise reproduction of that case here.

Our reproduction environment is:

Note: the MTTF increased dramatically when the platform changed from 3.5.3 to 3.5.8, so in the case demonstrated by this code the root cause is likely the proxy closing an inactive socket. The problem remains, though: as far as we can tell, an application using surrealdb.go cannot detect, report, or recover from the failure.

We'll appreciate any guidance.

Steps to reproduce

  1. Have a SurrealDB instance available with { ns: test, db: test, user: root, password: root }, or modify the connection details in the code.
  2. Initialize a project with the attached application.go.
  3. Run: time SURREALDB_URL=ws://<surrealdbhost>/rpc go run .

application.go.txt

phughk commented 1 year ago

Amazing, thanks for sharing! cc @timpratim. Attaching the contents of application.go.txt here for convenience as well:

package main

import (
    "net/http"
    "os"

    "github.com/gin-gonic/gin"
    "github.com/surrealdb/surrealdb.go"
)

// newdb opens a connection to the SurrealDB instance named by SURREALDB_URL.
func newdb() *surrealdb.DB {
    dbUrl := os.Getenv("SURREALDB_URL")
    if dbUrl == "" {
        panic("must set SURREALDB_URL. ours looks like \"ws://<host>/rpc\"")
    }
    db, err := surrealdb.New(dbUrl)
    if err != nil {
        panic(err)
    }
    return db
}

func signin(db *surrealdb.DB) {
    _, err := db.Signin(map[string]interface{}{
        "user": "root",
        "pass": "root",
    })
    if err != nil {
        panic(err)
    }
}

func use(db *surrealdb.DB) {
    _, err := db.Use("test", "test")
    if err != nil {
        panic(err)
    }
}

// from db_test.go
type testUser struct {
    surrealdb.Basemodel `table:"test"`
    Username            string `json:"username,omitempty"`
    Password            string `json:"password,omitempty"`
    ID                  string `json:"id,omitempty"`
}

func verifyConnection(db *surrealdb.DB) {
    // just add a record to a table, then delete it

    userData, err := db.Create("users", testUser{
        Username: "johnny",
        Password: "123",
    })
    if err != nil {
        panic(err)
    }

    // unmarshal the data into a user struct
    var user []testUser
    err = surrealdb.Unmarshal(userData, &user)
    if err != nil {
        panic(err)
    }

    // Delete the users
    _, err = db.Delete("users")
    if err != nil {
        panic(err)
    }

}

func main() {

    db := newdb()
    signin(db)
    use(db)
    verifyConnection(db)

    // the simplest possible gin server
    r := gin.Default()
    r.GET("/ping", func(c *gin.Context) {
        c.JSON(http.StatusOK, gin.H{
            "message": "pong",
        })
    })
    // listen and serve on 0.0.0.0:8080 (for windows "localhost:8080")
    if err := r.Run(); err != nil {
        panic(err)
    }
}

jimreesman commented 1 year ago

It seems pretty clear that the root cause is an i/o timeout error, which cascades into the websocket panic. The example code creates the db (and therefore the underlying websocket) using defaults. As of v0.2.1 there is an option to specify a timeout, e.g. db, err := surrealdb.New(url, surrealdb.WithTimeout(3*time.Second)). This significantly reduces the error rate, and we are testing more thoroughly. The underlying problem of the panic in the websocket implementation remains. We also note that changes have landed after v0.2.1 that remove the options parameter on .New(), so initialization will be different in the next release.